Pandas fillna()

The fillna() method in Pandas is used to fill missing (NaN) values in a DataFrame.

Example

import pandas as pd

# create a DataFrame with missing values
data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5]}

df = pd.DataFrame(data)

# fill missing values with a constant value, say 0
df_filled = df.fillna(0)
print(df_filled)

'''
Output
     A    B
0  1.0  0.0
1  2.0  2.0
2  0.0  3.0
3  4.0  0.0
4  5.0  5.0
'''

fillna() Syntax

The syntax of the fillna() method in Pandas is:

df.fillna(value, method=None, axis=None, inplace=False, limit=None)

fillna() Arguments

The fillna() method takes the following arguments:

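  • value - specifies the value to use for filling missing values (a constant, or a dictionary that maps column names to fill values)
  • method (optional) - specifies the method for filling missing values: 'ffill' (forward fill) or 'bfill' (backward fill). This parameter is deprecated since pandas 2.1 in favor of the ffill() and bfill() methods
  • axis (optional) - specifies the axis along which the filling should be performed
  • inplace (optional) - if set to True, it modifies the original DataFrame directly rather than creating a new one. If False (default), it returns a new DataFrame with missing values filled and leaves the original unchanged
  • limit (optional) - sets the maximum number of consecutive missing values to fill when forward or backward filling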

fillna() Return Value

By default, the fillna() method returns a new DataFrame with missing values filled according to the specified parameters and leaves the original DataFrame unchanged.
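
To make the return value concrete, here is a minimal sketch (using a small made-up DataFrame, not one from the examples in this article) that checks the default behavior: the call hands back a separate object, and the original keeps its missing values.

```python
import pandas as pd

# small illustrative DataFrame with one missing value
df = pd.DataFrame({'A': [1, None, 3]})

# with the default inplace=False, fillna() returns a new DataFrame
df_filled = df.fillna(0)

# the result is a separate object from the original
print(df_filled is df)

# the original still contains its missing value
print(df['A'].isna().sum())

# the returned DataFrame has no missing values
print(df_filled['A'].isna().sum())
```

If we want to keep the filled data, we need to assign the result to a variable, as done with df_filled here.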


Example 1: Fill Missing Values With Constant Value

import pandas as pd

# create a DataFrame with missing values
data = {'A': [10, 20, None, 25, 55],
        'B': [None, 2, 13, None, 65]}

df = pd.DataFrame(data)

constant_value = 0

# fill missing values with a constant value
df_filled = df.fillna(constant_value)
print(df_filled)

Output

      A     B
0  10.0   0.0
1  20.0   2.0
2   0.0  13.0
3  25.0   0.0
4  55.0  65.0

In the above example, we have set constant_value to 0. The fillna() method replaces every missing value in the df DataFrame with this constant, so all NaN entries appear as 0.0 in the resulting DataFrame.

Note: We can replace 0 with any other constant value of our choice to fill missing values with that value in our DataFrame.
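
As one possible variation on this note, the fill value can be computed from the data instead of being hard-coded. The sketch below goes slightly beyond the example above: it passes the per-column means returned by df.mean() as the value argument, so each column gets its own fill value. Treat it as an illustration rather than a recommended strategy for every dataset.

```python
import pandas as pd

# same data as in the example above
data = {'A': [10, 20, None, 25, 55],
        'B': [None, 2, 13, None, 65]}

df = pd.DataFrame(data)

# df.mean() skips NaN by default and returns one mean per column
column_means = df.mean()

# each column's missing values are filled with that column's mean
df_filled = df.fillna(column_means)
print(df_filled)
```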


Example 2: Fill Missing Values With a Dictionary

import pandas as pd

# create a DataFrame with missing values
data = {'A': [10, 20, None, 25, 55],
        'B': [None, 2, 13, None, 65]}

df = pd.DataFrame(data)

# define a dictionary with values for filling missing values
# replace 'A' column missing values with 0 and 'B' missing values with 42
fill_values = {'A': 0, 'B': 42}

# fill missing values with the values from the dictionary
df_filled = df.fillna(fill_values)
print(df_filled)

Output

      A     B
0  10.0  42.0
1  20.0   2.0
2   0.0  13.0
3  25.0  42.0
4  55.0  65.0

Here, we have replaced the missing values in column 'A' with 0 and the missing values in column 'B' with 42.

Note: If we pass inplace=True, the df DataFrame is updated directly, so no new DataFrame is needed to hold the changes.
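
A minimal sketch of the in-place variant, reusing the data and dictionary from this example. The return value of the call is not used here, since the changes land in df itself.

```python
import pandas as pd

# same data as in the example above
data = {'A': [10, 20, None, 25, 55],
        'B': [None, 2, 13, None, 65]}

df = pd.DataFrame(data)

fill_values = {'A': 0, 'B': 42}

# modify df directly instead of creating a new DataFrame
df.fillna(fill_values, inplace=True)

# df itself no longer contains missing values
print(df)
```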


Example 3: Use Different Methods for Filling Missing Values

We can use the method parameter to specify a method for filling missing values. If we set

  • method='ffill' - it implements forward filling, where missing values are filled with the preceding non-missing value
  • method='bfill' - it implements backward filling, where missing values are filled with the next non-missing value

Let's look at an example.

import pandas as pd

data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5]}

df = pd.DataFrame(data)

# forward fill missing values
df_ffill = df.fillna(method='ffill')
print(df_ffill)

# backward fill missing values
df_bfill = df.fillna(method='bfill')
print(df_bfill)

Output

     A    B
0  1.0  NaN
1  2.0  2.0
2  2.0  3.0
3  4.0  3.0
4  5.0  5.0
     A    B
0  1.0  2.0
1  2.0  2.0
2  4.0  3.0
3  4.0  5.0
4  5.0  5.0

Here, while forward filling missing values using method='ffill',

  1. For column 'A', it fills the missing value in row 2 with the previous non-missing value (2.0 from row 1).
  2. For column 'B', it fills the missing value in row 3 with the previous non-missing value (3.0 from row 2). The missing value in row 0 stays NaN because there is no earlier value to carry forward.

And, while backward filling missing values using method='bfill',

  1. For column 'A', it fills the missing value in row 2 with the next non-missing value (4.0 from row 3).
  2. For column 'B', it fills the missing values in rows 0 and 3 with the next non-missing values (2.0 from row 1 and 5.0 from row 4).
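
In recent pandas releases (2.1 and later) passing method to fillna() is deprecated, and the dedicated ffill() and bfill() methods are the suggested replacement. The sketch below repeats the example above with those methods, assuming the same data; the filled values should match the output shown earlier.

```python
import pandas as pd

data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5]}

df = pd.DataFrame(data)

# forward fill: carry the previous non-missing value down each column
df_ffill = df.ffill()
print(df_ffill)

# backward fill: pull the next non-missing value up each column
df_bfill = df.bfill()
print(df_bfill)
```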

Example 4: Specify Axis Along Which Filling Should be Performed

The axis parameter specifies the direction in which filling is performed.

  1. To fill missing values along rows (column-wise), we set axis=0. We can also omit the axis parameter, since axis=0 is the default behavior.
  2. To fill missing values along columns (row-wise), we set axis=1.

Let's look at an example.

import pandas as pd

# create a DataFrame with missing values
data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5]}
df = pd.DataFrame(data)

# fill missing values along rows (column-wise)
df_filled_rows = df.fillna(101, axis=0)
print("Filled along rows (column-wise):\n", df_filled_rows)

# fill missing values along columns (row-wise)
df_filled_columns = df.fillna(202, axis=1)
print("\nFilled along columns (row-wise):\n", df_filled_columns)

Output

Filled along rows (column-wise):
      A      B
0    1.0  101.0
1    2.0    2.0
2  101.0    3.0
3    4.0  101.0
4    5.0    5.0
Filled along columns (row-wise):
       A      B
0    1.0  202.0
1    2.0    2.0
2  202.0    3.0
3    4.0  202.0
4    5.0    5.0

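In the above example, we use the fillna() method to fill missing values with 101 along rows (axis=0) and with 202 along columns (axis=1). Since a constant value fills every missing cell regardless of direction, both calls fill the same cells. The axis affects the result only for directional filling, such as forward or backward fill.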
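
To see a case where the axis changes the result, here is a minimal sketch that forward fills the same data in both directions. It uses the ffill() method rather than fillna(), which is an assumption of this sketch and not part of the example above.

```python
import pandas as pd

data = {'A': [1, 2, None, 4, 5],
        'B': [None, 2, 3, None, 5]}
df = pd.DataFrame(data)

# axis=0: each missing value takes the value from the row above, in the same column
df_down = df.ffill(axis=0)
print(df_down)

# axis=1: each missing value takes the value from the column to its left, in the same row
df_across = df.ffill(axis=1)
print(df_across)
```

With axis=0, the missing value in row 0 of column 'B' stays NaN because nothing precedes it in that column. With axis=1, it is filled from column 'A' in the same row, while the missing value in row 2 of column 'A' stays NaN because there is no column to its left.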


Example 5: Use of limit Argument in fillna()

The limit argument sets the maximum number of consecutive missing values that forward or backward filling will replace.

  • limit=None (default) - every missing value in a consecutive run is filled
  • limit=n - at most n consecutive missing values are filled, and the rest of the run stays NaN

Let's look at an example.

import pandas as pd

# create a sample DataFrame with missing values
data = {'A': [1, 2, None, None, 5, None],
        'B': [None, 10, 11, None, 14, 15]}
df = pd.DataFrame(data)

# fill missing values forward with a limit of 1
df_filled_forward = df.fillna(method='ffill', limit=1)
print("DataFrame filled forward:")
print(df_filled_forward)

# fill missing values backward with a limit of 1
df_filled_backward = df.fillna(method='bfill', limit=1)
print("\nDataFrame filled backward:")
print(df_filled_backward)

Output

DataFrame filled forward:
     A     B
0  1.0   NaN
1  2.0  10.0
2  2.0  11.0
3  NaN  11.0
4  5.0  14.0
5  5.0  15.0

DataFrame filled backward:
      A     B
0  1.0  10.0
1  2.0  10.0
2  NaN  11.0
3  5.0  14.0
4  5.0  14.0
5  NaN  15.0

In this example, the limit parameter is set to 1 for both forward and backward filling.

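As a result, at most one missing value in each consecutive run of NaN values is filled: the one nearest to the value being propagated. For example, rows 2 and 3 of column 'A' are both missing. Forward filling fills only row 2 (with 2.0 from row 1), while backward filling fills only row 3 (with 5.0 from row 4).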

This allows us to control how many missing values are replaced in a consecutive sequence while leaving the rest of the missing values unchanged.
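
To show how the limit changes the outcome, here is a minimal sketch that compares limit=1 and limit=2 on the same data by counting the missing values left in column 'A'. It uses the ffill() method instead of fillna(method='ffill'), which is an assumption of this sketch.

```python
import pandas as pd

data = {'A': [1, 2, None, None, 5, None],
        'B': [None, 10, 11, None, 14, 15]}
df = pd.DataFrame(data)

# fill at most one value per run of consecutive NaN values
filled_limit_1 = df.ffill(limit=1)

# fill at most two values per run of consecutive NaN values
filled_limit_2 = df.ffill(limit=2)

# count the missing values remaining in column 'A' for each limit
print(filled_limit_1['A'].isna().sum())
print(filled_limit_2['A'].isna().sum())
```

Column 'A' has a run of two missing values (rows 2 and 3), so limit=1 leaves row 3 unfilled while limit=2 fills both.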
