Pandas resample()

The resample() method in Pandas converts time series data to a different frequency.

Example

import pandas as pd

# create a time series DataFrame
index = pd.date_range('1/1/2020', periods=4, freq='T')
data = pd.Series([0.0, None, 2.0, 3.0], index=index)
df = pd.DataFrame(data, columns=['A'])

print('Original Data:')
print(df)

# resample the data by 2 minutes and sum the values
resampled_data = df.resample('2T').sum()

print()
print('Resampled Data:')
print(resampled_data)

'''
Output

Original Data:
                       A
2020-01-01 00:00:00  0.0
2020-01-01 00:01:00  NaN
2020-01-01 00:02:00  2.0
2020-01-01 00:03:00  3.0

Resampled Data:
                       A
2020-01-01 00:00:00  0.0
2020-01-01 00:02:00  5.0
'''

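Here, resample('2T') groups the one-minute data into two-minute bins, and .sum() adds up the values in each bin. The missing value (NaN) is skipped by .sum(), so the first bin totals 0.0.

Note: In pandas 2.2 and later, the 'T' alias is deprecated in favor of 'min', so '2min' is the preferred spelling there.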


resample() Syntax

The syntax of the resample() method in Pandas is:

df.resample(rule, axis=0, closed=None, label=None, convention='start', kind=None, loffset=None, base=None, on=None, level=None)

resample() Arguments

The resample() method in Pandas has the following arguments:

  • rule: the target frequency for resampling
  • axis (optional): specifies the axis to resample on
  • closed (optional): defines which side of each interval is closed - 'right' or 'left'
  • label (optional): decides which side of each interval is labeled - 'right' or 'left'
  • convention (optional): for resampling with PeriodIndex, defines whether to use the start or end of the rule
  • kind (optional): chooses the index type for the resampled data
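  • loffset (optional): adjusts the resampled time labels by the given offset (deprecated in pandas 1.1 and removed in 2.0; add the offset to the result's index instead)
  • base (optional): sets the offset for the resample operation (deprecated in pandas 1.1 and removed in 2.0; use the origin or offset arguments instead)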
  • on (optional): selects a specific column for resampling in DataFrame
  • level (optional): identifies a particular level of a MultiIndex to resample
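To make a few of these arguments concrete, here is a minimal sketch of how closed, label, and on change the result. The sample values and the column name 'time' are assumptions made up for illustration, and the rule is written as '2min' so it runs without deprecation warnings on recent pandas versions.

```python
import pandas as pd

index = pd.date_range('2020-01-01', periods=4, freq='min')
s = pd.Series([1, 2, 3, 4], index=index)

# default for minute rules: each bin is closed and labeled on its left edge
left = s.resample('2min').sum()

# close and label each bin on its right edge instead
right = s.resample('2min', closed='right', label='right').sum()

print(left)
print(right)

# 'on' resamples by a datetime column rather than by the index
# (the column name 'time' is hypothetical)
df_on = pd.DataFrame({'time': index, 'A': [1, 2, 3, 4]})
by_column = df_on.resample('2min', on='time')['A'].sum()
print(by_column)
```

With closed='right', the 00:00 value falls into a bin that ends at 00:00, so the right-closed result has three bins instead of two.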

resample() Return Value

The resample() method returns a Resampler object. On its own it holds no aggregated values; you call an aggregation or fill method on it, such as .sum(), .mean(), or .ffill(), to produce the resampled data.
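As a rough illustration of the return value, the sketch below stores the Resampler in a variable and reuses it for two different aggregations. The data and the variable names are assumptions chosen for the example.

```python
import pandas as pd

index = pd.date_range('2020-01-01', periods=4, freq='min')
df = pd.DataFrame({'A': [1, 2, 3, 4]}, index=index)

# resample() returns an intermediate Resampler object, not a DataFrame
resampler = df.resample('2min')
print(type(resampler).__name__)

# the same Resampler can be reused for several aggregations
totals = resampler.sum()
peaks = resampler.max()

print(totals)
print(peaks)
```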


Example 1: Downsampling and Aggregating

Downsampling is the process of reducing the frequency of a time series dataset by aggregating data points within larger intervals.

Let's look at an example.


import pandas as pd

# create a time series DataFrame
range_of_dates = pd.date_range('1/1/2020', periods=5, freq='T')
df = pd.DataFrame({ 'A': [1, 2, 3, 4, 5] }, index=range_of_dates)

# resample to 3-minute intervals and compute the mean
downsampled = df.resample('3T').mean()

print(downsampled)

Output

                       A
2020-01-01 00:00:00  2.0
2020-01-01 00:03:00  4.5

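In this example, we decreased the data frequency to every three minutes (downsampling) and used .mean() for aggregation. The first bin covers 00:00 to 00:02 (values 1, 2, 3, mean 2.0), and the second covers 00:03 to 00:04 (values 4, 5, mean 4.5).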
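Downsampling is not limited to a single aggregate. As a small sketch using the same five values, .agg() can compute several statistics per bin in one call. The variable name summary is an assumption for illustration.

```python
import pandas as pd

range_of_dates = pd.date_range('2020-01-01', periods=5, freq='min')
df = pd.DataFrame({'A': [1, 2, 3, 4, 5]}, index=range_of_dates)

# compute mean, min, and max of column A for each 3-minute bin
summary = df['A'].resample('3min').agg(['mean', 'min', 'max'])

print(summary)
```

Each name in the list becomes a column of the result, with one row per three-minute bin.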

To learn more about aggregate functions, visit Pandas Aggregate Function.


Example 2: Upsampling and Filling

Upsampling is the process of increasing the frequency of a time series dataset by introducing additional data points within smaller intervals, often requiring data imputation methods such as filling or interpolation.

Let's look at an example.

import pandas as pd

# time series data
range_of_dates = pd.date_range('1/1/2020', periods=2, freq='D')
df = pd.DataFrame({ 'A': [1, 2] }, index=range_of_dates)

# resample to finer granularity and forward-fill the values
upsampled = df.resample('12H').ffill()

print(upsampled)

Output

                     A
2020-01-01 00:00:00  1
2020-01-01 12:00:00  1
2020-01-02 00:00:00  2

In this example, we upsampled the data from daily to 12-hourly frequency. The new 12:00 row has no original value, so ffill() copies the previous value (1) into it.
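Forward filling is only one way to fill the new rows. The sketch below, on the same two daily values stored as floats, shows two alternatives: .asfreq() leaves the gap as NaN, and .interpolate() fills it linearly. The lowercase '12h' rule is used here on the assumption that it is accepted by your pandas version; it is the preferred spelling from pandas 2.2 onward, where '12H' is deprecated.

```python
import pandas as pd

range_of_dates = pd.date_range('2020-01-01', periods=2, freq='D')
df = pd.DataFrame({'A': [1.0, 2.0]}, index=range_of_dates)

# asfreq() inserts the new 12:00 row but leaves it empty (NaN)
gaps = df.resample('12h').asfreq()

# interpolate() fills the new row linearly between its neighbors
interpolated = df.resample('12h').interpolate()

print(gaps)
print(interpolated)
```

Which method fits depends on the data: forward filling suits values that stay constant until the next reading, while interpolation suits quantities that change gradually.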
