Convert Pandas DataFrame into Python Dictionary

Pandas DataFrames make data manipulation faster and easier by storing data in a table of rows and columns. In this tutorial, we will explore the different ways to convert a Pandas DataFrame into a Python dictionary.


How to convert Pandas DataFrame into a Dictionary:

The following table shows the layout and contents of the data stored in the Pandas DataFrame which we will be using for this tutorial.

   Name     Section  Score
1  Elliot   A        93
2  Gilbert  D        78
3  Alice    C        64

Method 1: Use a for loop

Iterate over the DataFrame and add a new entry to a dictionary in each iteration, using the column label as the key and the contents of that column as the value. Note that we cast each column to a list, because selecting a column from a DataFrame returns a Series.

import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93], 
                   ["Gilbert", "D", 78], 
                   ["Alice", "C", 64]], 
                  index=[1,2,3], 
                  columns=["Name", "Section", "Score"])
new_dict = {}

# Iterating over a DataFrame yields its column labels
for i in df:
    new_dict[i] = list(df[i])  # cast the column (a Series) to a list

pprint(new_dict)

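We imported pprint from the pprint module because it formats the output in a neater, more organized layout, which is especially useful when working with Pandas. Note that pprint sorts dictionary keys alphabetically by default, which is why Score appears before Section in the output.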

Output:

{'Name': ['Elliot', 'Gilbert', 'Alice'],
 'Score': [93, 78, 64],
 'Section': ['A', 'D', 'C']}
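
To see how print and pprint order the keys of the same dictionary, here is a minimal sketch that uses only the standard library. The dictionary is written out by hand and the variable names are illustrative only.

```python
from pprint import pformat

# Same data as the tutorial's result, written in column order.
scores = {"Name": ["Elliot", "Gilbert", "Alice"],
          "Section": ["A", "D", "C"],
          "Score": [93, 78, 64]}

plain = str(scores)       # what print() shows: keys in insertion order
pretty = pformat(scores)  # what pprint() shows: keys sorted alphabetically

print(plain)
print(pretty)
```

On Python 3.8 and later, pprint(scores, sort_dicts=False) should keep the insertion order instead.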

However, a for loop is not the best way to convert a Pandas DataFrame into a Python dictionary. There are easier and more efficient ways to accomplish this, which you will see below.


Method 2: Use to_dict()

The to_dict() method returns a dictionary built from the DataFrame it is called on. Its orient (orientation) parameter controls the layout of the result: which labels become the keys, and what type of object (e.g. list, dictionary, etc.) holds the values. If orient is omitted, it defaults to “dict”.

Example 1:

In the example below, no argument is specified, so orient takes its default value, “dict”. Each inner dictionary uses the row index as keys and the contents of one column as values. The outer dictionary uses the column label as the key and the inner dictionary as the value.

import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93], 
                   ["Gilbert", "D", 78], 
                   ["Alice", "C", 64]], 
                  index=[1,2,3], 
                  columns=["Name", "Section", "Score"])

new_dict = df.to_dict()
pprint(new_dict)

Output:

{'Name': {1: 'Elliot', 2: 'Gilbert', 3: 'Alice'},
 'Score': {1: 93, 2: 78, 3: 64},
 'Section': {1: 'A', 2: 'D', 3: 'C'}}

Example 2:

However, if we want the values of the outer dictionary to be lists instead of dictionaries, we pass “list” as the argument. The key of the outer dictionary is still the column label, and the value is a list holding the contents of that column.

import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78], 
                   ["Alice", "C", 64]], 
                  index=[1,2,3], 
                  columns=["Name", "Section", "Score"])

new_dict = df.to_dict("list")
pprint(new_dict)

Output:

{'Name': ['Elliot', 'Gilbert', 'Alice'],
 'Score': [93, 78, 64],
 'Section': ['A', 'D', 'C']}

Example 3:

The above examples used the column label as the key and the contents of the column as the value. If you want to use the row index as the key and the contents of the row as the value instead, you can pass “index” as a parameter. This creates the inner dictionary by assigning the column label as the key and the content of the cell (at which the row index and the column label intersect) as the value. The outer dictionary uses the index as keys and the inner dictionary as values.

import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78], 
                   ["Alice", "C", 64]], 
                  index=[1,2,3], 
                  columns=["Name", "Section", "Score"])

new_dict = df.to_dict("index")
pprint(new_dict)

Output:

{1: {'Name': 'Elliot', 'Score': 93, 'Section': 'A'},
 2: {'Name': 'Gilbert', 'Score': 78, 'Section': 'D'},
 3: {'Name': 'Alice', 'Score': 64, 'Section': 'C'}}
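
The two nested layouts hold the same cells but are read in a different order. The sketch below looks up one cell (row 2, column Score) in each layout. It also shows “records”, an orientation that is not covered in the examples above and is included here only as an illustration. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78],
                   ["Alice", "C", 64]],
                  index=[1, 2, 3],
                  columns=["Name", "Section", "Score"])

by_column = df.to_dict()            # {column label: {row index: value}}
by_index = df.to_dict("index")      # {row index: {column label: value}}
as_records = df.to_dict("records")  # list of {column label: value}; row index is dropped

# The same cell, reached column-first and then row-first.
print(by_column["Score"][2])  # 78
print(by_index[2]["Score"])   # 78

# First row as a plain dictionary.
print(as_records[0])
```

Which layout to choose depends on how the dictionary will be read later: column-first when whole columns are needed, row-first when records are looked up by their index.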

Method 3: Use Dictionary Comprehension

Dictionary comprehension can also be used to convert a Pandas DataFrame into a Python dictionary. It keeps the code short, but it may be slower and use more memory than to_dict() on large DataFrames. The zip() function is used to pair each key with its value.

Example 1:

In the first example, we will display the output column by column. This is very simple as we are storing the contents of each column in a list.

import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78],
                   ["Alice", "C", 64]],
                  index=[1,2,3],
                  columns=["Name", "Section", "Score"])

# Pair each column label with its column, then store the column as a list
new_dict = {key: list(values) for key, values in
            zip(df.columns, [df["Name"], df["Section"], df["Score"]])}
pprint(new_dict)

Output:

{'Name': ['Elliot', 'Gilbert', 'Alice'],
 'Score': [93, 78, 64],
 'Section': ['A', 'D', 'C']}
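
The comprehension in this example lists the three columns by hand. A more general variant, sketched here under the assumption that every column should end up in the dictionary, iterates over df.items() instead of building the list of columns manually.

```python
import pandas as pd

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78],
                   ["Alice", "C", 64]],
                  index=[1, 2, 3],
                  columns=["Name", "Section", "Score"])

# df.items() yields (column label, column Series) pairs, so no column
# has to be named by hand and any number of rows is handled.
new_dict = {label: list(column) for label, column in df.items()}
print(new_dict)
```

Because this version does not depend on the number of rows or on the column names, it keeps working if the DataFrame grows.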

Example 2:

Now we will create a dictionary that stores the values of the DataFrame row by row. To do this, we will create a dictionary within a dictionary. The loc[index] indexer returns the row stored at that index label, and we use the columns and index attributes to obtain the column labels and row labels.

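import pandas as pd
from pprint import pprint

df = pd.DataFrame([["Elliot", "A", 93],
                   ["Gilbert", "D", 78],
                   ["Alice", "C", 64]],
                  index=[1,2,3], columns=["Name", "Section", "Score"])

# Outer comprehension: one entry per row label
# Inner comprehension: pair each column label with the cell in that row
# Note: cells read via loc from a mixed-type row may be NumPy scalars
# (e.g. numpy.int64), so their printed form can vary with the NumPy version
new_dict = {i:
            {
            key:value
            for key, value in zip(df.columns, df.loc[i])
            }
            for i in df.index}
pprint(new_dict)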

Output:

{1: {'Name': 'Elliot', 'Score': 93, 'Section': 'A'},
 2: {'Name': 'Gilbert', 'Score': 78, 'Section': 'D'},
 3: {'Name': 'Alice', 'Score': 64, 'Section': 'C'}}
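
To get a rough feel for the speed difference between a row-by-row comprehension and to_dict(), the following sketch times both on a larger frame. The frame is hypothetical (the tutorial's three rows repeated 100 times, with a default index), the helper function names are made up for illustration, and the measured times will vary by machine and library version.

```python
import timeit
import pandas as pd

# Hypothetical larger frame built by repeating the tutorial's three rows.
rows = [["Elliot", "A", 93], ["Gilbert", "D", 78], ["Alice", "C", 64]] * 100
df = pd.DataFrame(rows, columns=["Name", "Section", "Score"])

def with_comprehension(frame):
    # Row-by-row comprehension in the spirit of Example 2.
    return {i: dict(zip(frame.columns, frame.loc[i])) for i in frame.index}

def with_to_dict(frame):
    # Built-in conversion that produces the same layout.
    return frame.to_dict("index")

# Both approaches should produce equal dictionaries.
same_result = with_comprehension(df) == with_to_dict(df)

# Time a few repetitions of each approach.
t_comp = timeit.timeit(lambda: with_comprehension(df), number=5)
t_builtin = timeit.timeit(lambda: with_to_dict(df), number=5)

print(f"same result: {same_result}")
print(f"comprehension: {t_comp:.4f}s, to_dict: {t_builtin:.4f}s")
```

No expected timings are given here on purpose; run the sketch on your own data before deciding which approach to use.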

This marks the end of the “Convert Pandas DataFrame into Python Dictionary” Tutorial.
