The rank() method in Pandas computes the rank of each element in a Series or in each column of a DataFrame, for example to rank a set of scores. By default, the smallest value receives rank 1.
Example
import pandas as pd
# sample DataFrame
data = {'Score': [78, 85, 96, 86, 90]}
df = pd.DataFrame(data)
# rank the score
print(df.rank())
'''
Output
Score
0 1.0
1 2.0
2 5.0
3 3.0
4 4.0
'''
rank() Syntax
The syntax of the rank() method in Pandas is:
df.rank(axis=0, method='average', numeric_only=False, na_option='keep', ascending=True, pct=False)
rank() Arguments
The rank() method takes the following arguments:
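axis: specifies whether to rank down each column (0, the default) or across each row (1)
method: specifies how to handle equal values ('average', 'min', 'max', 'first', or 'dense')
numeric_only: ranks only numeric columns if True
na_option: specifies how to handle NaN values ('keep', 'top', or 'bottom')
ascending: specifies whether to rank in ascending order
pct: specifies whether to return the ranks in percentile form, as fractions between 0 and 1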
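The axis argument is the only one not demonstrated in the examples that follow, so here is a minimal sketch of it. The column names Test1 and Test2 and their values are made up for illustration.

```python
import pandas as pd

# hypothetical data: two test scores per student
df = pd.DataFrame({'Test1': [78, 85, 96], 'Test2': [88, 80, 96]})

# axis=0 (default): rank the values within each column
by_column = df.rank(axis=0)

# axis=1: rank the values across each row
by_row = df.rank(axis=1)

print(by_column)
print()
print(by_row)
```

With axis=1, the last row contains two equal values, so both receive the average rank 1.5.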
rank() Return Value
The rank() method returns a DataFrame or Series (depending on the input) with the ranks of the data.
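As a quick check of the return type, the sketch below calls rank() on a Series and on a one-column DataFrame built from it. The variable names are illustrative only.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90], name='Score')

# Series in, Series out
series_ranks = scores.rank()

# DataFrame in, DataFrame out
frame_ranks = scores.to_frame().rank()

print(type(series_ranks).__name__)
print(type(frame_ranks).__name__)

# ranks are returned as floats so that tied values can share a fractional rank
print(series_ranks.dtype)
```

The result keeps the index of the input, which is why it can be assigned straight back to a new column.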
Example 1: Basic Ranking
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
df['Rank'] = df['Score'].rank()
print(df)
Output
   Score  Rank
0     78   1.0
1     85   2.5
2     96   5.0
3     85   2.5
4     90   4.0
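In this example, we ranked the scores using rank(). By default, the rank() method handles equal values by assigning each of them the average of the ranks they would otherwise occupy. Here the two scores of 85 would take ranks 2 and 3, so both receive 2.5.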
Example 2: Ranking with Method
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank using the 'max' method for ties
df['Rank'] = df['Score'].rank(method='max')
print(df)
Output
   Score  Rank
0     78   1.0
1     85   3.0
2     96   5.0
3     85   3.0
4     90   4.0
In this example, we used method='max' to handle equal values. The max method assigns the highest rank in the tied group to every tied value, so both scores of 85 receive rank 3.
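To see how the tie-breaking options differ, the following sketch ranks the same scores with each value that the method argument accepts and places the results side by side. The comparison DataFrame is only an illustration.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90])

# rank the same data once per tie-breaking method
methods = ['average', 'min', 'max', 'first', 'dense']
comparison = pd.DataFrame({m: scores.rank(method=m) for m in methods})
comparison.insert(0, 'Score', scores)

print(comparison)
```

For the two scores of 85, 'min' gives both rank 2, 'first' gives ranks 2 and 3 in order of appearance, and 'dense' gives both rank 2 without leaving a gap before the next rank.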
Example 3: Ranking in Descending Order
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank in descending order
df['Rank'] = df['Score'].rank(ascending=False)
print(df)
Output
   Score  Rank
0     78   5.0
1     85   3.5
2     96   1.0
3     85   3.5
4     90   2.0
Here, we ranked the scores in descending order, so the highest score (96) receives rank 1 and the lowest score (78) receives rank 5.
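Descending ranks are often combined with a tie method to produce leaderboard-style positions. The sketch below is one possible way to do that, assuming tied scores should share the better position; the Position column name is made up for this example.

```python
import pandas as pd

data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)

# highest score gets position 1; tied scores share the lowest rank in their group
# astype(int) is safe here because method='min' never yields fractional ranks
df['Position'] = df['Score'].rank(ascending=False, method='min').astype(int)

print(df.sort_values('Position'))
```

Both scores of 85 are placed third, and the next position used is 5.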
Example 4: Ranking Numeric Data Only
The numeric_only argument is used to rank only numeric columns when applied to a DataFrame.
import pandas as pd
data = {
'Score': [78, 85, 96, 85, 90],
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva']
}
df = pd.DataFrame(data)
# rank all columns
print('All columns:')
print(df.rank())
print()
# rank numeric columns only
print('Numeric columns:')
print(df.rank(numeric_only=True))
Output
All columns:
   Score  Name
0    1.0   1.0
1    2.5   2.0
2    5.0   3.0
3    2.5   4.0
4    4.0   5.0

Numeric columns:
   Score
0    1.0
1    2.5
2    5.0
3    2.5
4    4.0
Here, the Name column is ranked alphabetically in the first case. In the second case it is left out of the result because of the numeric_only=True argument.
Example 5: Handling NaN
We can use the na_option argument to determine how NaN values in the data are handled.
import pandas as pd
data = {'Score': [78, 85, None, 85, 90]}
df = pd.DataFrame(data)
# rank with NaN placed at the bottom
df['Rank'] = df['Score'].rank(na_option='bottom')
print(df)
Output
   Score  Rank
0   78.0   1.0
1   85.0   2.5
2    NaN   5.0
3   85.0   2.5
4   90.0   4.0
In this case, the NaN value received the highest rank (5.0), placing it after all the valid scores.
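For comparison, the sketch below ranks the same data with each of the three values that na_option accepts. The variable names are illustrative.

```python
import pandas as pd

scores = pd.Series([78, 85, None, 85, 90])

# 'keep' (default): NaN values are not ranked and stay NaN
keep = scores.rank(na_option='keep')

# 'top': NaN values receive the smallest rank
top = scores.rank(na_option='top')

# 'bottom': NaN values receive the largest rank
bottom = scores.rank(na_option='bottom')

print(pd.DataFrame({'Score': scores, 'keep': keep, 'top': top, 'bottom': bottom}))
```

With 'top', the NaN value takes rank 1 and every valid score moves down by one position.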
Example 6: Ranking as a Percentage
With pct=True, rank() returns the rankings in percentile form, as fractions between 0 and 1.
import pandas as pd
data = {'Score': [78, 85, 96, 85, 90]}
df = pd.DataFrame(data)
# rank the scores as percentages
df['Rank_pct'] = df['Score'].rank(pct=True)
print(df)
Output
   Score  Rank_pct
0     78       0.2
1     85       0.5
2     96       1.0
3     85       0.5
4     90       0.8
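Here, each rank is divided by the number of ranked values, so the result is a fraction between 0 and 1. For example, the scores of 85 have rank 2.5 out of 5 values, which gives 0.5.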
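The relationship between plain ranks and percentile ranks can be checked by hand. The sketch below assumes the default method='average' and data without NaN values, and compares pct=True against the ordinary ranks divided by the number of values.

```python
import pandas as pd

scores = pd.Series([78, 85, 96, 85, 90])

# percentile ranks computed by pandas
pct_ranks = scores.rank(pct=True)

# the same values computed manually: rank divided by the number of values
manual = scores.rank() / scores.count()

# largest absolute difference between the two results
max_diff = (pct_ranks - manual).abs().max()

print(pct_ranks.tolist())
print(max_diff < 1e-12)
```

To show the values as actual percentages, multiply the result by 100.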