The items() method in Pandas is used to iterate over the columns of a DataFrame.
Example
import pandas as pd
# create a DataFrame
data = {'A': [1, 2], 'B': [4, 5]}
df = pd.DataFrame(data)
# iterate through the columns using items()
for column_name, column_data in df.items():
    print(f'Column name: {column_name}')
    print(f'Column data:\n{column_data}\n')
'''
Output
Column name: A
Column data:
0    1
1    2
Name: A, dtype: int64

Column name: B
Column data:
0    4
1    5
Name: B, dtype: int64
'''
items() Syntax
The syntax of the items() method in Pandas is:
for column_name, column_data in df.items():
    # do something with column_name and column_data
where,
column_name - the name of the column
column_data - the Series holding that column's data
items() Return Value
The items() method returns an iterator. On each iteration it yields a tuple containing the column name and the corresponding Series with the column's data.
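To inspect what items() hands back, the short sketch below pulls the first pair out with next() and then collects every pair into a dictionary. The small df and the variable names are illustrative assumptions, not part of the examples that follow.

```python
import pandas as pd

# illustrative DataFrame (assumed for this sketch)
df = pd.DataFrame({'A': [1, 2], 'B': [4, 5]})

# items() gives back an iterator, so next() pulls out the first pair
pairs = df.items()
first_pair = next(pairs)

print(type(first_pair))     # the pair is a tuple
print(first_pair[0])        # first element: the column name
print(type(first_pair[1]))  # second element: a pandas Series

# the pairs can also be collected into a dict keyed by column name
columns = dict(df.items())
print(list(columns))
```

Because the pairs are produced lazily, a fresh call to df.items() is needed each time the columns are walked again.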
Example 1: Basic Iteration Using items()
import pandas as pd
# create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'San Francisco', 'Los Angeles']}
df = pd.DataFrame(data)
# iterate through the columns using items()
for column_name, column_data in df.items():
    print(f'Column name: {column_name}')
    print(f'Column data:\n{column_data}\n')
Output
Column name: Name
Column data:
0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object

Column name: Age
Column data:
0    25
1    30
2    35
Name: Age, dtype: int64

Column name: City
Column data:
0         New York
1    San Francisco
2      Los Angeles
Name: City, dtype: object
In the above example, we created the df DataFrame with three columns: Name, Age, and City.
We used the items() method to iterate through the columns. On each iteration, the loop prints the column name and the corresponding Series containing the column's data.
Example 2: Iterate Through Columns and Calculate Sum
import pandas as pd
# create a DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)
# iterate through the columns
for column_name, column_data in df.items():
    # calculate the sum of each column
    column_sum = column_data.sum()
    print(f'Sum of {column_name}: {column_sum}')
Output
Sum of A: 6
Sum of B: 15
Sum of C: 24
Here, we have used the items() method to iterate through the columns of the df DataFrame.
Inside the loop, for each column, we calculated the sum of its values using the sum() method.
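As a cross-check on the loop, the sketch below gathers the per-column sums into a dictionary and compares them with calling sum() on the whole DataFrame, which computes the same totals in one step. The names loop_sums and direct_sums are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

# sums collected one column at a time with items()
loop_sums = {name: int(col.sum()) for name, col in df.items()}

# the same totals computed in a single call on the DataFrame
direct_sums = df.sum()

print(loop_sums)
print(direct_sums)
```

The loop form is handy when each column needs its own handling; when the same operation applies to every column, the single call is usually shorter.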
Example 3: Iterate Through Columns and Filter Values
import pandas as pd
# create a sample DataFrame
data = {'A': [1, 2, 8],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)
# keep only the values greater than 5 in each column
for column_name, column_data in df.items():
    filtered_data = column_data[column_data > 5]
    print(f'Filtered data in {column_name}:\n{filtered_data}\n')
Output
Filtered data in A:
2    8
Name: A, dtype: int64

Filtered data in B:
2    6
Name: B, dtype: int64

Filtered data in C:
0    7
1    8
2    9
Name: C, dtype: int64
Here, we iterated through the columns and, for each one, used boolean indexing to keep only the values greater than 5.
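To make the boolean indexing step more concrete, the sketch below splits it into two parts for a single column: building the True/False mask, then indexing with it. The standalone Series and the name mask are assumptions introduced for this illustration.

```python
import pandas as pd

# one column on its own, matching column A from the example
column_data = pd.Series([1, 2, 8], name='A')

# the comparison produces a boolean Series aligned with the original index
mask = column_data > 5

# indexing with the mask keeps only the rows where the mask is True
filtered_data = column_data[mask]

print(mask.tolist())
print(filtered_data.tolist())
```

Note that the surviving values keep their original index labels, which is why the output above shows index 2 for column A rather than 0.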
Example 4: Rename Each Column
import pandas as pd
# create a DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)
# rename columns by adding a prefix
for column_name, column_data in df.items():
    new_column_name = f'New_{column_name}'
    df.rename(columns={column_name: new_column_name}, inplace=True)
print(df)
Output
   New_A  New_B  New_C
0      1      4      7
1      2      5      8
2      3      6      9
In the above example, we have iterated through the columns of the df DataFrame using the for loop with the items() method.
Inside the loop, we renamed each column by adding the prefix New_ to its original name and applied the renaming to the DataFrame in-place.
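When every column gets the same prefix, the loop is not strictly needed. As an alternative sketch (not the approach used in the example), the add_prefix() method returns a new DataFrame with the prefix applied to every column label and leaves the original untouched, which also avoids changing a DataFrame while iterating over it. The name renamed is an assumption for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

# add_prefix() returns a new DataFrame with the prefix on every column label
renamed = df.add_prefix('New_')

print(renamed.columns.tolist())
print(df.columns.tolist())  # the original labels are unchanged
```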