The prod() method in Pandas calculates the product of the values over the requested axis.
Example
import pandas as pd
# create a DataFrame
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6]
})
# calculate the product of each column
column_product = df.prod()
print(column_product)
'''
Output
A 6
B 120
dtype: int64
'''
prod() Syntax
The syntax of the prod() method in Pandas is:
df.prod(axis=0, skipna=True, numeric_only=False, min_count=0)
prod() Arguments
The prod() method takes the following arguments:
- axis (optional) - specifies the axis along which the product is computed
- skipna (optional) - determines whether missing values are excluded from the computation
- numeric_only (optional) - specifies whether to include only numeric columns in the computation
- min_count (optional) - the minimum number of valid (non-missing) values required to perform the operation
prod() Return Value
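The prod() method returns the product of the values along the specified axis. For a DataFrame, the result is a Series with one product per column or row; for a Series, it is a single value.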
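To see what kind of object comes back, the following minimal sketch calls prod() on a DataFrame and on a single column. The variable names column_products and product_a are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# DataFrame.prod() returns a Series indexed by column label
column_products = df.prod()
print(type(column_products).__name__)  # Series

# Series.prod() returns a single scalar value
product_a = df['A'].prod()
print(product_a)  # 6
```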
Example 1: Compute prod() Along Different Axes
import pandas as pd
# create a DataFrame
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]
})
# calculate the product of each column (axis=0 by default)
column_product = df.prod()
# calculate the product of each row
row_product = df.prod(axis=1)
print("Product of each column:")
print(column_product)
print("\nProduct of each row:")
print(row_product)
Output
Product of each column:
A 6
B 120
C 504
dtype: int64

Product of each row:
0 28
1 80
2 162
dtype: int64
In the above example,
- column_product = df.prod() calculates the product of the values in each column of the df DataFrame. The default axis=0 means it operates column-wise.
- row_product = df.prod(axis=1) calculates the product of the values in each row of df. Setting axis=1 means it operates row-wise.
Note: We can also pass axis=0 inside prod() explicitly to compute the product of each column.
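As a quick check of the note above, this sketch compares the default call with an explicit axis=0. It also uses the string aliases 'index' and 'columns', which pandas accepts in place of 0 and 1. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

# three equivalent ways to request column-wise products
default_result = df.prod()
explicit_result = df.prod(axis=0)
by_name_result = df.prod(axis='index')

# axis='columns' is the same as axis=1 (one product per row)
row_result = df.prod(axis='columns')

print(default_result.equals(explicit_result))  # True
print(default_result.equals(by_name_result))   # True
```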
Example 2: Calculate Product of a Specific Column
import pandas as pd
# create a DataFrame
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]
})
# calculate the product of column 'A'
product_A = df['A'].prod()
# calculate the product of column 'B'
product_B = df['B'].prod()
print("Product of column A:", product_A)
print("Product of column B:", product_B)
Output
Product of column A: 6
Product of column B: 120
In this example, df['A'] selects column A of the df DataFrame, and prod() calculates the product of its values. The same is done for column B.
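The same idea extends to several columns at once. As a small sketch, selecting a list of columns gives a smaller DataFrame, so prod() returns one product per selected column. The name selected_products is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

# select columns 'A' and 'C', then compute one product per column
selected_products = df[['A', 'C']].prod()
print(selected_products)
```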
Example 3: Use of numeric_only Argument in prod()
import pandas as pd
# create a DataFrame with numeric and non-numeric types
df = pd.DataFrame({
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': ['a', 'b', 'c'] # non-numeric column
})
# calculate the product of each column, excluding non-numeric data
product_numeric_only = df.prod(numeric_only=True)
print(product_numeric_only)
print()
# calculate the product of each column, trying to include all columns
try:
    product_all = df.prod(numeric_only=False)
except TypeError as e:
    print("Error:", e)
Output
A 6
B 120
dtype: int64

Error: can't multiply sequence by non-int of type 'str'
Here,
- When using numeric_only=True, the product is calculated only for columns A and B, and column C is excluded because it contains string data.
- When using numeric_only=False, a TypeError is raised because column C contains strings, which cannot be multiplied together. Hence, the error is caught and printed.
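An alternative way to leave out non-numeric columns is to filter them explicitly before calling prod(). The sketch below uses select_dtypes(include='number') for that and, for this DataFrame, should give the same result as numeric_only=True. The name numeric_products is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': ['a', 'b', 'c']  # non-numeric column
})

# keep only numeric columns, then compute the product of each
numeric_products = df.select_dtypes(include='number').prod()
print(numeric_products)
```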
Example 4: Effect of skipna Argument on Calculating Product
import pandas as pd
# create a DataFrame with NaN values
df = pd.DataFrame({
'A': [1, None, 3],
'B': [4, 5, None],
'C': [7, 8, 9]
})
# calculate the product of each column, ignoring NaN values
product_skipna_true = df.prod()
# calculate the product of each column, including NaN values
product_skipna_false = df.prod(skipna=False)
print("Product with skipna=True (default):")
print(product_skipna_true)
print("\nProduct with skipna=False:")
print(product_skipna_false)
Output
Product with skipna=True (default):
A 3.0
B 20.0
C 504.0
dtype: float64

Product with skipna=False:
A NaN
B NaN
C 504.0
dtype: float64
In this example,
- With skipna=True, the products of columns A, B, and C are 3, 20, and 504, respectively, because None values are ignored.
- With skipna=False, the products of columns A and B are NaN due to the None values, while the product of C is 504.
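One way to think about skipna=True is that a skipped value does not change the product, just as multiplying by 1 would not. The sketch below illustrates this by replacing missing values with 1 before calling prod(). This is only an illustration of the idea, not a description of how pandas implements it, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, None, 3],
    'B': [4, 5, None],
    'C': [7, 8, 9]
})

# default behaviour: missing values are skipped
skipped = df.prod()

# replacing missing values with 1 leaves each product unchanged
filled = df.fillna(1).prod()

print(skipped.to_dict() == filled.to_dict())  # True
```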
Example 5: Calculate Products With Minimum Value Counts
import pandas as pd
# create a DataFrame with some missing values
df = pd.DataFrame({
'A': [1, None, 3],
'B': [4, 5, None],
'C': [None, None, 9]
})
# calculate the product of each column with min_count set to 1
product_min_count_1 = df.prod(min_count=1)
# calculate the product of each column with min_count set to 2
product_min_count_2 = df.prod(min_count=2)
# calculate the product of each column with min_count set to 3
product_min_count_3 = df.prod(min_count=3)
print("Product with min_count=1:\n", product_min_count_1)
print("\nProduct with min_count=2:\n", product_min_count_2)
print("\nProduct with min_count=3:\n", product_min_count_3)
Output
Product with min_count=1:
A 3.0
B 20.0
C 9.0
dtype: float64

Product with min_count=2:
A 3.0
B 20.0
C NaN
dtype: float64

Product with min_count=3:
A NaN
B NaN
C NaN
dtype: float64
Here,
- When min_count=1, the product is calculated if there is at least one non-missing value in the column. Here, all columns meet this criterion.
- When min_count=2, the product is calculated if there are at least two non-missing values in the column. Columns A and B meet this criterion, but column C has only one non-missing value, so its result is NaN.
- When min_count=3, the product is calculated if there are at least three non-missing values in the column. None of the columns meet this criterion, so all results are NaN.
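The default min_count=0 matters most when every value is missing. As a small sketch, the product of an all-NaN Series is 1.0 by default (the empty product), while min_count=1 turns that result into NaN. The names all_missing, default_product, and guarded_product are illustrative.

```python
import numpy as np
import pandas as pd

all_missing = pd.Series([np.nan, np.nan])

# with min_count=0 (default), the product of no valid values is 1.0
default_product = all_missing.prod()

# requiring at least one valid value gives NaN instead
guarded_product = all_missing.prod(min_count=1)

print(default_product)  # 1.0
print(guarded_product)  # nan
```

Setting min_count=1 can be useful when a result of 1.0 for a column with no data would be misleading.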