The mask() method in Pandas is used to replace values where certain conditions are met.
Example
import pandas as pd
# create a DataFrame
df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8]
})
# replace values in column 'A' that are greater than 2 with -1
df['A'] = df['A'].mask(df['A'] > 2, -1)
print(df)
'''
Output
   A  B
0  1  5
1  2  6
2 -1  7
3 -1  8
'''
mask() Syntax
The syntax of the mask() method in Pandas is:
df.mask(cond, other=nan, inplace=False, axis=None, level=None)
mask() Arguments
The mask() method takes the following arguments:
cond - the condition to check
other (optional) - the values to use as replacements where the condition is True
inplace (optional) - if True, modifies the caller object directly instead of creating a new object
axis (optional) - which axis to align the other values with, if necessary
level (optional) - if the DataFrame has a MultiIndex, this determines which level to align with
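As a minimal sketch of the default behaviour, the snippet below calls mask() with only cond. It assumes the default other of NaN shown in the syntax above; the name masked is chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

# no `other` is given, so positions where the condition is True become NaN
masked = df.mask(df > 6)

# only the last two values of column 'B' exceed 6
print(masked['B'].isna().tolist())
```

Values that do not meet the condition are left as they were.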
mask() Return Value
The mask() method returns a new DataFrame with the same shape as the original, where values that satisfy the condition are replaced. When inplace=True is passed, the caller is modified directly and the method returns None.
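The difference between the two modes can be sketched as follows. The variable names result and returned are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4]})

# default: a new DataFrame is returned and df is left unchanged
result = df.mask(df > 2, -1)
original_after_call = df['A'].tolist()
print(original_after_call)
print(result['A'].tolist())

# inplace=True: df itself is modified and the call returns None
returned = df.mask(df > 2, -1, inplace=True)
print(returned)
print(df['A'].tolist())
```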
Example 1: Replace Values Using mask()
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8]
})
# use mask() to replace even values
# across the entire DataFrame with 0
df = df.mask(df % 2 == 0, 0)
print(df)
Output
   A  B
0  1  5
1  0  0
2  3  7
3  0  0
In the above example, the mask() method replaces all the even numbers with 0.
Example 2: Customizing Value Replacement With other Argument in mask()
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8]
})
# replace values in 'A' greater than 2 with their double
df['A'] = df['A'].mask(df['A'] > 2, other=lambda x: x * 2)
print(df)
Output
   A  B
0  1  5
1  2  6
2  6  7
3  8  8
In this example, we have applied the mask() method to the A column of the df DataFrame.
It first checks for values in A that are greater than 2 using df['A'] > 2. Values that meet this condition are replaced with their double.
The doubling is achieved using the other argument, which is set to the lambda function lambda x: x * 2. The lambda function receives the whole A column and returns a doubled copy of it; mask() then takes the replacement values from that copy only at the positions where the condition is True.
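A small sketch of what the callable receives: the helper double below is hypothetical and records the type of its argument, and the result is compared with passing the doubled Series directly as other.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], name='A')

received = []  # records what type the callable is handed

def double(x):
    # hypothetical helper: note the argument type, then double it
    received.append(type(x).__name__)
    return x * 2

via_callable = s.mask(s > 2, other=double)

# passing the precomputed Series as `other` gives the same result
via_series = s.mask(s > 2, other=s * 2)

print(received)
print(via_callable.tolist())
print(via_callable.equals(via_series))
```

Both forms pick replacement values by position, so only the entries greater than 2 are doubled.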
Example 3: Aligning Conditions With axis Argument in mask()
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [5, 6, 7, 8],
'C': [9, 10, 11, 12]
})
condition_series = pd.Series([True, False, True, False])
# using mask with a Series condition and axis=0
df.mask(condition_series, -1, axis=0, inplace=True)
print(df)
Output
   A  B   C
0 -1 -1  -1
1  2  6  10
2 -1 -1  -1
3  4  8  12
Here, condition_series is matched to the rows of df by index label.
For each row, if the corresponding value in condition_series is True, all values in that row of the df DataFrame are replaced with -1.
The axis argument controls how other is aligned. Since other is the scalar -1 here, axis=0 is included for illustration and should not change the result.
With inplace=True, the original DataFrame df is modified directly, and there's no need to assign the result to a new variable.
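The axis argument matters most when other is itself a Series. The following is a minimal sketch, assuming a replacement Series named row_values (a name chosen for illustration) whose index matches the rows of the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

# one replacement value per row, aligned on the row index
row_values = pd.Series([10, 20, 30, 40])

# axis=0 aligns row_values with the rows, so every masked cell
# in a given row receives that row's replacement value
result = df.mask(df > 2, other=row_values, axis=0)

print(result['A'].tolist())
print(result['B'].tolist())
```

Column A keeps 1 and 2 because they do not exceed 2, while every value in column B is replaced by the entry of row_values for its row.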
Example 4: Applying Conditional Replacements in MultiIndex DataFrame Using mask()
import pandas as pd
# create a MultiIndex DataFrame
arrays = [
['A', 'A', 'B', 'B'],
[1, 2, 1, 2]
]
index = pd.MultiIndex.from_arrays(arrays, names=['letters', 'numbers'])
df = pd.DataFrame({'data': [10, 20, 30, 40]}, index=index)
print("Original DataFrame:")
print(df)
print()
# apply mask to replace values in 'data' column
# where the 'numbers' level is 1 with 99
df['data'] = df['data'].mask(df.index.get_level_values('numbers') == 1, 99, level='numbers')
print("DataFrame after mask:")
print(df)
Output
Original DataFrame:
                 data
letters numbers
A       1          10
        2          20
B       1          30
        2          40
DataFrame after mask:
                 data
letters numbers
A       1          99
        2          20
B       1          99
        2          40
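In this example, we built a boolean condition from the numbers level of the MultiIndex DataFrame df using get_level_values() and passed it to the mask() method to replace the values in the data column where the numbers level is 1 with 99.
The level argument affects how other is aligned when other is a Series or DataFrame. With the scalar 99, the condition alone determines which rows are replaced.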
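As a sketch of the same replacement written without the level argument, the condition can be stored in a variable first. The name is_first is illustrative, and the replacement is assumed to be a scalar as in the example above.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [['A', 'A', 'B', 'B'], [1, 2, 1, 2]],
    names=['letters', 'numbers']
)
df = pd.DataFrame({'data': [10, 20, 30, 40]}, index=index)

# boolean array marking the rows whose 'numbers' level equals 1
is_first = df.index.get_level_values('numbers') == 1

# scalar replacement, so no alignment of `other` is involved
df['data'] = df['data'].mask(is_first, 99)

print(df['data'].tolist())
```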