Pandas str.replace()

The str.replace() method in Pandas replaces occurrences of a substring or regular expression pattern in each string element of a Series with another string.

Example

import pandas as pd

# create a Series
data = pd.Series(['apple', 'banana', 'cherry'])

# use str.replace() to replace 'a' with '@'
data = data.str.replace('a', '@')

print(data)

'''
Output
0     @pple
1    b@n@n@
2    cherry
dtype: object
'''

str.replace() Syntax

The syntax of the str.replace() method in Pandas is:

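Series.str.replace(pat, repl, n=-1, case=None, flags=0, regex=False)

str.replace() Arguments

The str.replace() method takes the following arguments:

  • pat - string to be replaced or a regular expression pattern
  • repl - string to replace each match with
  • n (optional) - maximum number of replacements per string. -1 (the default) replaces all occurrences.
  • case (optional) - determines whether matching is case sensitive. The default None behaves as case sensitive; pass False for case-insensitive matching.
  • flags (optional) - regular expression flags from the re module, such as re.IGNORECASE
  • regex (optional) - if True, pat is treated as a regular expression. If False, pat is treated as a literal string. The default is False in pandas 2.0 and later; earlier versions treated pat as a regular expression by default.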

str.replace() Return Value

The str.replace() method returns a new Series of the same size with the replacements applied. The original Series is not modified.
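
As a quick check of the return value, the sketch below keeps the result in a separate variable and compares it with the input. The variable names original and replaced are chosen here for illustration only.

```python
import pandas as pd

original = pd.Series(['apple', 'banana', 'cherry'])

# str.replace() builds and returns a new Series
replaced = original.str.replace('a', '@', regex=False)

# the input Series still holds the unmodified strings
print(original.tolist())  # ['apple', 'banana', 'cherry']
print(replaced.tolist())  # ['@pple', 'b@n@n@', 'cherry']
```

To keep the change, assign the result back to a variable, as the examples in this article do.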


Example 1: Replace Substring Using str.replace()

import pandas as pd

# create a Series 
cities = pd.Series(['San Jose', 'Los Angeles', 'San Francisco'])

# use str.replace() to replace 'San' with 'Santa'
cities = cities.str.replace('San', 'Santa')

print(cities)

Output

0         Santa Jose
1        Los Angeles
2    Santa Francisco
dtype: object

In the above example, we have used the str.replace('San', 'Santa') method to replace the substring San with Santa in each string of the cities Series.
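
The same call works on a DataFrame column, because each column is a Series. The sketch below assumes a hypothetical DataFrame with made-up city and state columns, and passes regex=False explicitly so that 'San' is matched as plain text regardless of the pandas version.

```python
import pandas as pd

# hypothetical DataFrame; the column names are made up for illustration
df = pd.DataFrame({
    'city': ['San Jose', 'Los Angeles', 'San Francisco'],
    'state': ['CA', 'CA', 'CA'],
})

# replace within one column and assign the result back to it
df['city'] = df['city'].str.replace('San', 'Santa', regex=False)

print(df['city'].tolist())  # ['Santa Jose', 'Los Angeles', 'Santa Francisco']
```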


Example 2: Limit Replacements with n Parameter in str.replace()

import pandas as pd

# create a Series
data = pd.Series(['apple', 'banana', 'cherry'])

# replace only the first occurrence of 'a'
replace_1 = data.str.replace('a', '@', n=1)

# replace the first two occurrences of 'a'
replace_2 = data.str.replace('a', '@', n=2)

# no replacement will be made with n=0
replace_3 = data.str.replace('a', '@', n=0)

print("Original Series:")
print(data)
print("\nReplace first occurrence:\n")
print(replace_1)
print("\nReplace first two occurrences:\n")
print(replace_2)
print("\nNo replacement (n=0):")
print(replace_3)

Output

Original Series:
0     apple
1    banana
2    cherry
dtype: object

Replace first occurrence:

0     @pple
1    b@nana
2    cherry
dtype: object

Replace first two occurrences:

0     @pple
1    b@n@na
2    cherry
dtype: object

No replacement (n=0):
0     apple
1    banana
2    cherry
dtype: object

In this example,

  1. str.replace('a', '@', n=1) - replaces only the first occurrence of a in each string
  2. str.replace('a', '@', n=2) - replaces the first two occurrences
  3. str.replace('a', '@', n=0) - no replacement occurs since n=0
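
A common use of n is to change only the first separator in each string. The sketch below uses made-up date strings and passes regex=False so that '-' is matched as plain text; it is one possible illustration, not the only way to split such values.

```python
import pandas as pd

# hypothetical date strings, made up for illustration
dates = pd.Series(['2024-01-15', '2023-12-31'])

# n=1 stops after the first match in each string
first_only = dates.str.replace('-', '/', n=1, regex=False)

# n=-1 (the default) replaces every match
all_matches = dates.str.replace('-', '/', regex=False)

print(first_only.tolist())   # ['2024/01-15', '2023/12-31']
print(all_matches.tolist())  # ['2024/01/15', '2023/12/31']
```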

Example 3: Case Sensitivity in String Replacement

import pandas as pd

# create a Series
data = pd.Series(['Apple', 'Banana', 'Cherry'])

# case sensitive replacement (default behavior)
case_sensitive_replace = data.str.replace('a', '@')

# case insensitive replacement
case_insensitive_replace = data.str.replace('a', '@', case=False)

print("\nCase Sensitive Replacement:")
print(case_sensitive_replace)
print("\nCase Insensitive Replacement:")
print(case_insensitive_replace)

Output

Case Sensitive Replacement:
0     Apple
1    B@n@n@
2    Cherry
dtype: object

Case Insensitive Replacement:
0     @pple
1    B@n@n@
2    Cherry
dtype: object

Here,

  • str.replace('a', '@') - the default case-sensitive behavior, where only lowercase a is replaced with @.
  • str.replace('a', '@', case=False) - shows the case-insensitive replacement, where both A and a are replaced with @.
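
str.replace() also accepts a flags argument that takes flags from Python's re module. The sketch below shows that flags=re.IGNORECASE gives the same result as case=False for this data. It passes regex=True explicitly so the calls do not depend on the default value of regex.

```python
import re
import pandas as pd

data = pd.Series(['Apple', 'Banana', 'Cherry'])

# both calls request case-insensitive matching of 'a'
with_case = data.str.replace('a', '@', case=False, regex=True)
with_flags = data.str.replace('a', '@', flags=re.IGNORECASE, regex=True)

print(with_case.tolist())   # ['@pple', 'B@n@n@', 'Cherry']
print(with_flags.tolist())  # ['@pple', 'B@n@n@', 'Cherry']
```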

Example 4: Use of regex argument in str.replace()

import pandas as pd

# create a Series 
products = pd.Series(['T-shirt 12', 'Jeans 30', 'Hat', 'Dress 8', 'Shoes 42'])

# use str.replace() with regex to replace numeric sizes
products_replaced = products.str.replace(r'\d+', 'SIZE', regex=True)

print(products_replaced)

Output

0    T-shirt SIZE
1      Jeans SIZE
2             Hat
3      Dress SIZE
4      Shoes SIZE
dtype: object

In the above example, the pattern r'\d+' matches sequences of digits in the product names.

Setting regex=True makes str.replace() treat the pattern as a regular expression. In pandas 2.0 and later this must be passed explicitly, because regex defaults to False.

Hence the method replaces these sequences with the string SIZE.
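
The regex argument matters most when the pattern contains characters that are special in regular expressions, such as the dot. The sketch below uses made-up version strings to compare the two modes; re.escape() from Python's re module is shown as one way to match text literally while keeping regex=True.

```python
import re
import pandas as pd

# hypothetical version strings, made up for illustration
versions = pd.Series(['v1.0', 'v1x0'])

# regex=False: '.' is an ordinary character, so only 'v1.0' matches
literal = versions.str.replace('v1.0', 'v2', regex=False)

# regex=True: '.' matches any character, so 'v1x0' matches as well
as_regex = versions.str.replace('v1.0', 'v2', regex=True)

# re.escape() turns the text into a pattern that matches it literally
escaped = versions.str.replace(re.escape('v1.0'), 'v2', regex=True)

print(literal.tolist())   # ['v2', 'v1x0']
print(as_regex.tolist())  # ['v2', 'v2']
print(escaped.tolist())   # ['v2', 'v1x0']
```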

Note: To learn more about Regular Expressions, please visit Python RegEx.