The resample() method in Pandas converts time series data to a different frequency.
Example
import pandas as pd

# create a time series DataFrame
# ('min' is the minute alias; the older 'T' alias is deprecated as of pandas 2.2)
index = pd.date_range('1/1/2020', periods=4, freq='min')
data = pd.Series([0.0, None, 2.0, 3.0], index=index)
df = pd.DataFrame(data, columns=['A'])

print('Original Data:')
print(df)

# resample the data into 2-minute bins and sum the values
resampled_data = df.resample('2min').sum()

print()
print('Resampled Data:')
print(resampled_data)
'''
Output

Original Data:
                       A
2020-01-01 00:00:00  0.0
2020-01-01 00:01:00  NaN
2020-01-01 00:02:00  2.0
2020-01-01 00:03:00  3.0

Resampled Data:
                       A
2020-01-01 00:00:00  0.0
2020-01-01 00:02:00  5.0
'''
Here, we converted a DataFrame with a one-minute frequency into a two-minute frequency by passing a two-minute rule to resample(). The .sum() call then aggregates the values in each bin, skipping the missing value at 00:01.
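Because .sum() skips missing values, a bin that contains only NaN sums to 0.0 by default. The sketch below uses made-up data to show how the min_count argument of .sum() changes that. The series s and the variable names are assumptions for illustration only.

```python
import pandas as pd

# hypothetical data: the first two-minute bin holds only missing values
index = pd.date_range('2020-01-01', periods=4, freq='min')
s = pd.Series([None, None, 2.0, 3.0], index=index)

# default behavior: an all-NaN bin sums to 0.0
default_sum = s.resample('2min').sum()

# min_count=1 requires at least one non-missing value per bin,
# so the all-NaN bin stays NaN instead of becoming 0.0
strict_sum = s.resample('2min').sum(min_count=1)

print(default_sum)
print(strict_sum)
```

Whether an empty bin should read as 0.0 or as NaN depends on the data. For counts, 0.0 is often fine, while for measurements NaN is usually the safer signal.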
resample() Syntax
The syntax of the resample() method in Pandas is:
df.resample(rule, axis=0, closed=None, label=None, convention='start', kind=None, loffset=None, base=None, on=None, level=None)
resample() Arguments
The resample() method in Pandas has the following arguments:
rule: the target frequency for resampling
axis (optional): specifies the axis to resample on
closed (optional): defines which side of each interval is closed - 'right' or 'left'
label (optional): decides which side of each interval is labeled - 'right' or 'left'
convention (optional): for resampling with PeriodIndex, defines whether to use the start or end of the rule
kind (optional): chooses the index type for the resampled data
loffset (optional): adjusts the resampled time labels by the given offset (removed in pandas 2.0; shift the resulting index instead)
base (optional): sets the offset for the resample operation (removed in pandas 2.0; use the offset or origin arguments instead)
on (optional): selects a specific column for resampling in a DataFrame
level (optional): identifies a particular level of a MultiIndex to resample
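The closed, label, and on arguments are easier to follow with a small example. The sketch below uses made-up data. The names s, df, and the column name 'when' are assumptions for illustration.

```python
import pandas as pd

index = pd.date_range('2020-01-01', periods=5, freq='min')
s = pd.Series([1, 2, 3, 4, 5], index=index)

# default for minute-based rules: each bin includes its left edge
# and is labeled with it, e.g. [00:00, 00:02) is labeled 00:00
left = s.resample('2min').sum()
print(left.tolist())   # [3, 7, 5]

# closed='right' makes each bin include its right edge instead,
# and label='right' names the bin after that edge,
# e.g. (00:00, 00:02] is labeled 00:02
right = s.resample('2min', closed='right', label='right').sum()
print(right.tolist())  # [1, 5, 9]

# on= resamples by a datetime column instead of the index
df = pd.DataFrame({'when': index, 'A': [1, 2, 3, 4, 5]})
by_column = df.resample('2min', on='when')['A'].sum()
print(by_column.tolist())  # [3, 7, 5]
```

With closed='right', the value at 00:00 falls into the bin ending at 00:00, which is why the first total is 1 rather than 3.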
resample() Return Value
The resample() method returns a Resampler object, which supports various aggregation operations on time series data.
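The Resampler object does not compute anything until an aggregation is called on it, so the same object can be reused for several aggregations. Here is a minimal sketch with made-up data. The variable names are assumptions for illustration.

```python
import pandas as pd

index = pd.date_range('2020-01-01', periods=6, freq='min')
df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6]}, index=index)

# calling resample() only defines the bins; no aggregation happens yet
resampler = df.resample('3min')
print(type(resampler).__name__)

# the same Resampler object can be aggregated in different ways
totals = resampler.sum()
averages = resampler.mean()
counts = resampler.count()

print(totals)
print(averages)
print(counts)
```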
Example 1: Downsampling and Aggregating
Downsampling is the process of reducing the frequency of a time series dataset by aggregating data points within larger intervals.
Let's look at an example.
import pandas as pd

# create a time series DataFrame
# ('min' is the minute alias; the older 'T' alias is deprecated as of pandas 2.2)
range_of_dates = pd.date_range('1/1/2020', periods=5, freq='min')
df = pd.DataFrame({ 'A': [1, 2, 3, 4, 5] }, index=range_of_dates)

# resample to 3-minute intervals and compute the mean
downsampled = df.resample('3min').mean()

print(downsampled)
Output
                       A
2020-01-01 00:00:00  2.0
2020-01-01 00:03:00  4.5
In this example, we decreased the data frequency to every three minutes (downsampling) and used .mean() for aggregation.
To learn more about aggregate functions, visit Pandas Aggregate Function.
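When a DataFrame has several columns, each column may need a different aggregation while downsampling. The sketch below passes a dictionary to .agg() for this. The column names 'price' and 'volume' and their values are made up for illustration.

```python
import pandas as pd

# hypothetical two-column time series
range_of_dates = pd.date_range('2020-01-01', periods=5, freq='min')
df = pd.DataFrame(
    {'price': [10.0, 11.0, 12.0, 13.0, 14.0], 'volume': [1, 2, 3, 4, 5]},
    index=range_of_dates,
)

# average the price and total the volume within each 3-minute bin
downsampled = df.resample('3min').agg({'price': 'mean', 'volume': 'sum'})

print(downsampled)
```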
Example 2: Upsampling and Filling
Upsampling is the process of increasing the frequency of a time series dataset by introducing additional data points within smaller intervals, often requiring data imputation methods such as filling or interpolation.
Let's look at an example.
import pandas as pd

# time series data
range_of_dates = pd.date_range('1/1/2020', periods=2, freq='D')
df = pd.DataFrame({ 'A': [1, 2] }, index=range_of_dates)

# resample to finer granularity and forward-fill the values
# ('h' is the hour alias; the older 'H' alias is deprecated as of pandas 2.2)
upsampled = df.resample('12h').ffill()

print(upsampled)
Output
                     A
2020-01-01 00:00:00  1
2020-01-01 12:00:00  1
2020-01-02 00:00:00  2
In this example, we upsampled the data from daily to 12-hourly frequency, with forward filling to handle missing values.
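Forward filling is only one way to fill the rows that upsampling creates. The sketch below reuses the same two daily values (stored as floats) to compare leaving the gap empty, interpolating linearly, and backward filling. The variable names are assumptions for illustration.

```python
import pandas as pd

range_of_dates = pd.date_range('2020-01-01', periods=2, freq='D')
df = pd.DataFrame({'A': [1.0, 2.0]}, index=range_of_dates)

# asfreq() only inserts the new rows and leaves them as NaN
gaps = df.resample('12h').asfreq()

# interpolate() fills the new row on a straight line between its neighbors
interpolated = df.resample('12h').interpolate()

# bfill() copies the next known value backward
backfilled = df.resample('12h').bfill()

print(gaps)
print(interpolated)
print(backfilled)
```

The right choice depends on what the values represent: forward filling suits values that hold until the next reading, while interpolation suits quantities that change gradually.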