The quantile() method in Pandas returns values at the given quantile over the requested axis.
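A quantile is a cut point in sorted data: a given fraction of the values falls at or below it. For example, the 0.25 quantile (25th percentile) is the value below which roughly 25% of the data lies.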
Example
import pandas as pd
# sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6]}
df = pd.DataFrame(data)
# calculate the median, which is the 50th percentile or quantile(0.5)
median = df.quantile(0.5)
print(median)
'''
Output
A    2.0
B    5.0
Name: 0.5, dtype: float64
'''
quantile() Syntax
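The syntax of the quantile() method in Pandas is:
df.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear')
quantile() Arguments
The quantile() method has the following arguments.
q (optional): the quantile or list of quantiles to compute; each value must be between 0 and 1 inclusive (default 0.5)
axis (optional): the axis to compute the quantile along; 0 or 'index' gives one result per column, 1 or 'columns' gives one result per row (default 0)
numeric_only (optional): if True, only numeric columns are included; if False, the quantile of datetime and timedelta data will be computed as well. The default is False since pandas 2.0; in earlier versions it was True.
interpolation (optional): specifies the interpolation method to use when the desired quantile lies between two data points. The options are 'linear' (default), 'lower', 'higher', 'midpoint', and 'nearest'.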
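The axis and numeric_only arguments are easiest to see side by side. The following is a minimal sketch, not an exhaustive demonstration; the 'label' column is a hypothetical non-numeric column added only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8],
                   'label': ['w', 'x', 'y', 'z']})  # hypothetical text column

# numeric_only=True skips the non-numeric 'label' column
col_median = df.quantile(0.5, numeric_only=True)

# axis=1 computes one median per row, across the numeric columns
row_median = df.quantile(0.5, axis=1, numeric_only=True)

print(col_median)
print(row_median)
```

Here, col_median holds one value per column (4.0 for A and 5.0 for B), while row_median holds one value per row (1.5, 3.5, 5.5, and 7.5). Passing numeric_only=True explicitly keeps the behavior the same regardless of which default your pandas version uses.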
quantile() Return Value
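When q is a single value, DataFrame.quantile() returns a Series indexed by column name, and Series.quantile() returns a scalar. When q is a list or array of quantiles, DataFrame.quantile() returns a DataFrame whose index holds the requested quantiles.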
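To make the return types concrete, here is a short sketch that checks the type of each result. The variable names are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [2, 4, 6, 8]})

single = df.quantile(0.25)             # single q on a DataFrame
multiple = df.quantile([0.25, 0.75])   # list of q values on a DataFrame
scalar = df['A'].quantile(0.25)        # single q on a Series

print(type(single).__name__)    # Series
print(type(multiple).__name__)  # DataFrame
print(scalar)                   # 2.5
```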
Example 1: Single Quantile
import pandas as pd
data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the 25th percentile
quantile_25 = df.quantile(0.25)
print(quantile_25)
Output
A    2.5
B    3.5
Name: 0.25, dtype: float64
Here, we calculated the 25th percentile (first quartile) for each column.
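The value 2.5 for column A does not appear in the data; it comes from the default 'linear' interpolation. As a rough sketch of that calculation (a hand-written illustration, not the code pandas runs internally), we can reproduce it manually:

```python
import pandas as pd

values = sorted([1, 3, 5, 7])
q = 0.25

# fractional position of the quantile in the sorted data
pos = q * (len(values) - 1)               # 0.75
lower = int(pos)                          # index of the value just below
upper = min(lower + 1, len(values) - 1)   # index of the value just above
fraction = pos - lower                    # how far between the two values

# linear interpolation between the two neighboring values
manual = values[lower] + fraction * (values[upper] - values[lower])

print(manual)                           # 2.5
print(pd.Series(values).quantile(q))    # 2.5
```

The position 0.75 lies three quarters of the way from 1 to 3, which gives 1 + 0.75 * 2 = 2.5.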
Example 2: Multiple Quantiles
import pandas as pd
data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the 25th and 75th percentiles
quantiles = df.quantile([0.25, 0.75])
print(quantiles)
Output
        A    B
0.25  2.5  3.5
0.75  5.5  6.5
In this example, we calculated multiple quantiles for each column, resulting in a DataFrame showing the 25th and 75th percentiles.
Example 3: Quantile with Interpolation
import pandas as pd
data = {'A': [1, 3, 5, 7],
        'B': [2, 4, 6, 8]}
df = pd.DataFrame(data)
# calculate the median with a different interpolation method
median_higher = df.quantile(0.5, interpolation='higher')
print(median_higher)
Output
A    5
B    6
Name: 0.5, dtype: int64
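In this example, we have set the interpolation parameter to 'higher'.
With four values per column, the median position falls between the second and third sorted values (3 and 5 in column A, 4 and 6 in column B). By choosing 'higher', we make quantile() return the larger of those two observed values (5 and 6) instead of interpolating between them. The default 'linear' method would return 4.0 and 5.0.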
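To compare the interpolation methods directly, the sketch below computes the median of column A's values under four of them. The 'nearest' option is left out here because the median position is exactly halfway between two values, so its result depends on how ties are broken.

```python
import pandas as pd

s = pd.Series([1, 3, 5, 7])

# compute the median under several interpolation methods
methods = ['linear', 'lower', 'higher', 'midpoint']
results = {m: s.quantile(0.5, interpolation=m) for m in methods}

for name, value in results.items():
    print(name, value)
```

Here, 'lower' and 'higher' return the observed values 3 and 5, while 'linear' and 'midpoint' both return 4.0 because the median position sits exactly halfway between them.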