The filter() method in Pandas selects elements of a Series by their index labels. It does not filter on the values themselves.
Example
import pandas as pd
# create a Series
data = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
# use filter() to select elements
filtered_data = data.filter(items=['b', 'd'])
print(filtered_data)
'''
Output
b    2
d    4
dtype: int64
'''
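Since filter() works on index labels rather than values, it helps to see it next to value-based selection. The sketch below reuses the Series from the example above; the variable names by_label and by_value are illustrative only.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# filter() looks only at the index labels, never at the values
by_label = data.filter(items=['b', 'd'])

# selecting by value is done with boolean indexing instead
by_value = data[data > 3]

print(by_label.tolist())  # [2, 4]
print(by_value.tolist())  # [4, 5]
```

In both cases the original data Series is left unchanged; each call returns a new Series.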
filter() Syntax
The syntax of the filter() method in Pandas is:
Series.filter(items=None, like=None, regex=None)
filter() Arguments
The filter() method takes the following arguments:
items (optional) - a list of the index labels you want to keep in the Series
like (optional) - a string; index labels that contain it as a substring are kept
regex (optional) - a regular expression pattern; index labels that match it are kept
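Although each argument is optional on its own, pandas expects exactly one of items, like, or regex per call. The following sketch illustrates this; try_filter is a hypothetical helper written for this example, not part of pandas.

```python
import pandas as pd

data = pd.Series([10, 20, 30], index=['apple', 'banana', 'carrot'])

def try_filter(**kwargs):
    """Return the kept labels, or the name of the exception raised."""
    try:
        return data.filter(**kwargs).index.tolist()
    except TypeError as exc:
        return type(exc).__name__

one_arg = try_filter(like='an')                     # exactly one argument: works
two_args = try_filter(items=['apple'], like='an')   # two arguments: rejected
no_args = try_filter()                              # no argument: rejected

print(one_arg)   # ['banana']
print(two_args)  # TypeError
print(no_args)   # TypeError
```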
filter() Return Value
The filter() method returns a new Series containing only the elements whose index labels match the given items, like, or regex argument.
Example 1: Use items Parameter to Select Specific Indices
import pandas as pd
# create a Series
data = pd.Series([10, 20, 30, 40, 50], index=['apple', 'banana', 'carrot', 'date', 'elderberry'])
# use items parameter to select specific indices
filtered_by_items = data.filter(items=['banana', 'date'])
print("Filtered by items:\n", filtered_by_items)
Output
Filtered by items:
 banana    20
date      40
dtype: int64
In the above example, we first created the data Series with values [10, 20, 30, 40, 50] and corresponding indices ['apple', 'banana', 'carrot', 'date', 'elderberry'].
Then, we used the filter() method with the items parameter to select the elements of the data Series that have the index labels banana and date.
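One detail worth knowing about the items parameter is how it treats labels that are not in the index. The sketch below assumes a recent pandas version; 'fig' is a made-up label that is deliberately absent from the Series.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50],
                 index=['apple', 'banana', 'carrot', 'date', 'elderberry'])

# 'fig' is not in the index; filter() simply leaves it out
present_only = data.filter(items=['banana', 'fig'])

# .loc with the same list of labels raises KeyError instead
try:
    data.loc[['banana', 'fig']]
    loc_raised = False
except KeyError:
    loc_raised = True

print(present_only.to_dict())  # {'banana': 20}
print(loc_raised)              # True
```

This makes filter() convenient when some requested labels may be missing, and .loc the safer choice when a missing label should be treated as an error.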
Example 2: Use like Parameter to Select Indices That Contain a Substring
import pandas as pd
# create a Series
data = pd.Series([10, 20, 30, 40, 50], index=['apple', 'banana', 'carrot', 'date', 'elderberry'])
# use like parameter to select indices that contain a substring
filtered_by_like = data.filter(like='e')
print("\nFiltered by like:\n", filtered_by_like)
Output
Filtered by like:
 apple         10
date          40
elderberry    50
dtype: int64
In this example, we used the filter() method with the like parameter to select the indices in the Series that contain the letter e.
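The like parameter behaves like a plain, case-sensitive substring test on each label. As a rough sketch, the same selection can be written with a boolean mask; the names via_filter, via_mask, and upper_case are illustrative.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50],
                 index=['apple', 'banana', 'carrot', 'date', 'elderberry'])

via_filter = data.filter(like='e')

# the same selection written as an explicit substring check per label
mask = ['e' in str(label) for label in data.index]
via_mask = data[mask]

# the match is case-sensitive, so an upper-case 'E' matches nothing here
upper_case = data.filter(like='E')

print(via_filter.equals(via_mask))  # True
print(len(upper_case))              # 0
```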
Example 3: Select Indices Using Regular Expression Pattern
import pandas as pd
# create a Series
data = pd.Series([10, 20, 30, 40, 50], index=['apple', 'banana', 'carrot', 'date', 'elderberry'])
# use regex parameter to select indices based on a regular expression pattern
filtered_by_regex = data.filter(regex=r'^[a-d]')
print("Filtered by regex:\n", filtered_by_regex)
Output
Filtered by regex:
 apple     10
banana    20
carrot    30
date      40
dtype: int64
Here, we used the filter() method with the regex parameter set to r'^[a-d]'.
As a result, filtered_by_regex contains only the elements whose index labels start with a letter from a to d.
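The ^ anchor matters in the pattern above: without an anchor, a pattern can match anywhere inside a label. The following sketch shows an unanchored pattern and an end-anchored one on the same Series; the patterns and variable names are chosen only for illustration.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50],
                 index=['apple', 'banana', 'carrot', 'date', 'elderberry'])

# no anchor: 'rr' may appear anywhere in the label
contains_rr = data.filter(regex=r'rr')

# '$' anchors the match to the end of the label
ends_with_e = data.filter(regex=r'e$')

print(contains_rr.index.tolist())  # ['carrot', 'elderberry']
print(ends_with_e.index.tolist())  # ['apple', 'date']
```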
Note: To learn more about Regular Expressions, please visit Python RegEx.