The rank() method in Pandas computes the rank of each element in a Series or in each column of a DataFrame, for example to rank a list of exam scores. By default, the smallest value receives rank 1.
Example
import pandas as pd
# sample DataFrame
data = {'Score': [78, 85, 96, 86, 90]}
df = pd.DataFrame(data)
# rank the score
print(df.rank())
'''
Output
Score
0 1.0
1 2.0
2 5.0
3 3.0
4 4.0
'''
rank() Syntax
The syntax of the rank() method in Pandas, with the default values used by pandas 2.x, is:
df.rank(axis=0, method='average', numeric_only=False, na_option='keep', ascending=True, pct=False)
rank() Arguments
The rank() method takes the following arguments:
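axis: specifies the direction of ranking (0, the default, ranks the values within each column; 1 ranks the values within each row)
method: specifies how to rank equal values ('average', 'min', 'max', 'first', or 'dense')
numeric_only: if True, ranks only the numeric columns of a DataFrame
na_option: specifies how to handle NaN values ('keep', 'top', or 'bottom')
ascending: specifies whether to rank in ascending order
pct: if True, returns the ranks as percentiles (values between 0 and 1) instead of absolute ranks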
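The axis argument is not covered by the examples that follow, so here is a minimal sketch of it. The Math and Science columns are made-up data for illustration only. With axis=1, the values are ranked within each row rather than within each column.

```python
import pandas as pd

# hypothetical data: two subjects for three students
df = pd.DataFrame({'Math': [78, 85, 96], 'Science': [88, 80, 96]})

# rank across the columns of each row
row_ranks = df.rank(axis=1)
print(row_ranks)
# row 0: Math 1.0, Science 2.0
# row 1: Math 2.0, Science 1.0
# row 2: both 1.5 (tie, averaged)
```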
rank() Return Value
The rank() method returns a DataFrame or Series (depending on the input) with the ranks of the data.
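As a quick check of the return type, the sketch below calls rank() on both a Series and a DataFrame. The variable names are illustrative. Note that the ranks come back as floating-point numbers, since tied values can produce fractional ranks.

```python
import pandas as pd

df = pd.DataFrame({'Score': [78, 85, 96, 85, 90]})

series_ranks = df['Score'].rank()  # Series in, Series out
frame_ranks = df.rank()            # DataFrame in, DataFrame out

print(type(series_ranks).__name__)  # Series
print(type(frame_ranks).__name__)   # DataFrame
print(series_ranks.dtype)           # float64
```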
Example 1: Basic Ranking
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
df['Rank'] = df['Score'].rank()
print(df)
Output
   Score  Rank
0     78   1.0
1     85   2.5
2     96   5.0
3     85   2.5
4     90   4.0
In this example, we ranked the scores using rank(). By default, rank() gives equal values the average of the ranks they would otherwise occupy: the two scores of 85 take positions 2 and 3, so each receives rank 2.5.
Example 2: Ranking with Method
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank using the 'max' method for ties
df['Rank'] = df['Score'].rank(method='max')
print(df)
Output
   Score  Rank
0     78   1.0
1     85   3.0
2     96   5.0
3     85   3.0
4     90   4.0
In this example, we used method='max' to handle equal values. The max method assigns every tied value the highest rank in its group: the two scores of 85 take positions 2 and 3, so both receive rank 3.
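To see how the tie-breaking methods differ, the following sketch ranks the same scores with each value that the method argument accepts and places the results side by side. The comparison DataFrame is just an illustrative way to display them.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90])

# one column per tie-breaking method
comparison = pd.DataFrame({
    'Score': scores,
    'average': scores.rank(method='average'),  # 1.0, 2.5, 5.0, 2.5, 4.0
    'min': scores.rank(method='min'),          # 1.0, 2.0, 5.0, 2.0, 4.0
    'max': scores.rank(method='max'),          # 1.0, 3.0, 5.0, 3.0, 4.0
    'first': scores.rank(method='first'),      # 1.0, 2.0, 5.0, 3.0, 4.0
    'dense': scores.rank(method='dense'),      # 1.0, 2.0, 4.0, 2.0, 3.0
})
print(comparison)
```

The first method breaks ties by order of appearance, while dense behaves like min but leaves no gaps in the ranks after a tie.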
Example 3: Ranking in Descending Order
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank in descending order
df['Rank'] = df['Score'].rank(ascending=False)
print(df)
Output
   Score  Rank
0     78   5.0
1     85   3.5
2     96   1.0
3     85   3.5
4     90   2.0
Here, we ranked the scores in descending order, so the highest score (96) receives rank 1 and the lowest score (78) receives rank 5.
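The ascending and method arguments can be combined. As a sketch, the code below produces leaderboard-style positions, where tied scores share the better position and the next position is skipped. The Position column name and the conversion to integers are choices made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Score': [78, 85, 96, 85, 90]})

# highest score first; tied scores share the lowest rank in their group
df['Position'] = df['Score'].rank(ascending=False, method='min').astype(int)
print(df)
# Position column: 5, 3, 1, 3, 2
```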
Example 4: Ranking Numeric Data Only
The numeric_only argument is used to rank only numeric columns when applied to a DataFrame.
import pandas as pd
data = {
    'Score': [78, 85, 96, 85, 90],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva']
}
df = pd.DataFrame(data)
# rank all columns
print('All columns:')
print(df.rank())
print()
# rank numeric columns only
print('Numeric columns:')
print(df.rank(numeric_only=True))
Output
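All columns:
   Score  Name
0    1.0   1.0
1    2.5   2.0
2    5.0   3.0
3    2.5   4.0
4    4.0   5.0

Numeric columns:
   Score
0    1.0
1    2.5
2    5.0
3    2.5
4    4.0
In the first case, the Name column is ranked too, with its strings compared in alphabetical order. In the second case, the Name column is left out of the result because of the numeric_only=True argument.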
Example 5: Handling NaN
We can use the na_option argument to determine how NaN values in the data are handled.
import pandas as pd
data = {'Score': [78, 85, None, 85, 90]}
df = pd.DataFrame(data)
# rank with NaN placed at the bottom
df['Rank'] = df['Score'].rank(na_option='bottom')
print(df)
Output
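   Score  Rank
0   78.0   1.0
1   85.0   2.5
2    NaN   5.0
3   85.0   2.5
4   90.0   4.0
In this case, the NaN value was placed at the bottom of the ranking, so it received the highest rank (5.0). With the default na_option='keep', its rank would remain NaN.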
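For comparison, this sketch applies all three values that na_option accepts to the same data. The column names are chosen for illustration.

```python
import pandas as pd

scores = pd.Series([78, 85, None, 85, 90])

na_comparison = pd.DataFrame({
    'Score': scores,
    'keep': scores.rank(na_option='keep'),      # NaN stays NaN
    'top': scores.rank(na_option='top'),        # NaN gets rank 1
    'bottom': scores.rank(na_option='bottom'),  # NaN gets rank 5
})
print(na_comparison)
```

With na_option='top', the NaN value takes rank 1 and every other rank shifts up by one (2.0, 3.5, 1.0, 3.5, 5.0).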
Example 6: Ranking as a Percentage
With pct=True, rank() returns percentile ranks: each rank divided by the number of non-missing values, giving a fraction between 0 and 1.
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank the scores as percentages
df['Rank_pct'] = df['Score'].rank(pct=True)
print(df)
Output
   Score  Rank_pct
0     78       0.2
1     85       0.5
2     96       1.0
3     85       0.5
4     90       0.8
Here, each rank is expressed as a fraction of the number of values. For example, the score 90 has rank 4 out of 5 values, so its percentile rank is 4 / 5 = 0.8.
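One way to check how these values arise is to compute them by hand. The sketch below divides the ordinary ranks by the number of non-missing values and compares the result with pct=True. The Manual_pct column is introduced only for this comparison.

```python
import pandas as pd

df = pd.DataFrame({'Score': [78, 85, 96, 85, 90]})

df['Rank_pct'] = df['Score'].rank(pct=True)
# manual equivalent: ordinary rank divided by the count of non-missing values
df['Manual_pct'] = df['Score'].rank() / df['Score'].count()
print(df)
```

Both columns contain 0.2, 0.5, 1.0, 0.5 and 0.8.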