The to_datetime() method in Pandas converts date-like values, such as date strings and epoch numbers, into Pandas datetime objects.
Example
import pandas as pd
# create a Series with date strings in 'YYYYMMDD' format
date_series = pd.Series(['20200101', '20200201', '20200301'])
# convert string dates to datetime objects using pd.to_datetime
converted_dates = pd.to_datetime(date_series, format='%Y%m%d')
print(converted_dates)
'''
Output
0 2020-01-01
1 2020-02-01
2 2020-03-01
dtype: datetime64[ns]
'''
to_datetime() Syntax
The syntax of the to_datetime() method in Pandas is:
pd.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, exact=True, unit=None)
to_datetime() Arguments
The to_datetime() method takes the following arguments:
arg - an object to convert to a datetime
errors (optional) - specifies how to handle errors for unparsable dates
dayfirst (optional) - if True, parses dates with the day first
yearfirst (optional) - if True, parses dates with the year first
utc (optional) - if True, returns a UTC DatetimeIndex
format (optional) - string format to parse the date
exact (optional) - if True, requires the string to match the format exactly
unit (optional) - the unit of the arg for epoch times
to_datetime() Return Value
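The return type of to_datetime() depends on the input: a scalar gives a Timestamp, a list or array gives a DatetimeIndex, and a Series gives a Series of datetime values.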
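As a quick check of what comes back for different kinds of input, the sketch below passes a single string, a list, and a Series to pd.to_datetime() and prints the type of each result. The variable names are chosen only for illustration.

```python
import pandas as pd

# a single string, a list of strings, and a Series of strings
scalar_result = pd.to_datetime('2020-01-01')
list_result = pd.to_datetime(['2020-01-01', '2020-02-01'])
series_result = pd.to_datetime(pd.Series(['2020-01-01', '2020-02-01']))

# print the type name of each result
print(type(scalar_result).__name__)  # Timestamp
print(type(list_result).__name__)    # DatetimeIndex
print(type(series_result).__name__)  # Series
```

Knowing the return type matters in practice: the .dt accessor is available on a Series result, while a DatetimeIndex exposes attributes such as year directly.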
Example 1: Convert String Dates to Datetime Objects
import pandas as pd
# create a Series with date strings in 'YYYYMMDD' format
date_series = pd.Series(['20201010', '20201111', '20201212'])
# convert string dates to datetime objects using pd.to_datetime
converted_dates = pd.to_datetime(date_series)
print(converted_dates)
Output
0 2020-10-10
1 2020-11-11
2 2020-12-12
dtype: datetime64[ns]
In the above example, we have used the pd.to_datetime() method to convert string dates into datetime objects. Since no format argument is given, Pandas infers the format from the strings.
The resulting datetime objects are then printed, showing the converted dates.
Example 2: Handle Date Parsing Errors with to_datetime()
import pandas as pd
# create a Series with some valid and some invalid date strings
date_series = pd.Series(['20200101', 'invalid date', '20200301', 'another invalid'])
# use 'coerce' in the errors argument to handle invalid dates
converted_dates = pd.to_datetime(date_series, format='%Y%m%d', errors='coerce')
print(converted_dates)
Output
0 2020-01-01
1 NaT
2 2020-03-01
3 NaT
dtype: datetime64[ns]
In this example, date_series includes both valid dates in the YYYYMMDD format and strings that are not valid dates.
Then we used pd.to_datetime() with errors='coerce'. This ensures that instead of raising an error for the invalid dates, Pandas converts them to NaT (Not a Time), the missing-value marker for datetimes.
The result is a Series where valid dates are correctly parsed, and invalid dates are represented as NaT.
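For contrast, here is a minimal sketch of what happens with the default errors='raise', followed by one way to locate the rows that errors='coerce' turned into NaT. The names raised, coerced, and bad_rows are illustrative only.

```python
import pandas as pd

date_series = pd.Series(['20200101', 'invalid date', '20200301'])

# with the default errors='raise', an unparsable value raises ValueError
try:
    pd.to_datetime(date_series, format='%Y%m%d')
    raised = False
except ValueError as exc:
    raised = True
    print("Parsing failed:", exc)

# with errors='coerce', unparsable values become NaT instead
coerced = pd.to_datetime(date_series, format='%Y%m%d', errors='coerce')

# use the NaT positions to recover the original strings that failed
bad_rows = date_series[coerced.isna()]
print(bad_rows)
```

Inspecting the failed rows this way can help decide whether the invalid values should be fixed at the source or dropped.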
Example 3: Use of dayfirst and yearfirst in to_datetime()
import pandas as pd
# create a Series with ambiguous date strings
date_series = pd.Series(['01-02-2020', '03-04-2021', '05-06-2022'])
# parse dates with dayfirst=True
dates_dayfirst = pd.to_datetime(date_series, dayfirst=True)
# parse dates with yearfirst=True
dates_yearfirst = pd.to_datetime(date_series, yearfirst=True)
print("Dates with dayfirst=True:\n", dates_dayfirst)
print("\nDates with yearfirst=True:\n", dates_yearfirst)
Output
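Dates with dayfirst=True:
0 2020-02-01
1 2021-04-03
2 2022-06-05
dtype: datetime64[ns]

Dates with yearfirst=True:
0 2020-01-02
1 2021-03-04
2 2022-05-06
dtype: datetime64[ns]
Here, we parsed the same strings twice: once with dayfirst=True, which tells Pandas to interpret the first number as the day, and once with yearfirst=True, which tells Pandas to interpret the first number as the year when the string is ambiguous.
Because these strings end with an unambiguous four-digit year, yearfirst=True does not change the result here, and the dates are read as month-day-year, as the second output shows.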
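To see the effect of dayfirst in isolation, the sketch below parses one ambiguous string with and without the flag. The variable names are illustrative only.

```python
import pandas as pd

ambiguous = '01-02-2020'

# default: the first number is read as the month
default_parse = pd.to_datetime(ambiguous)

# dayfirst=True: the first number is read as the day
dayfirst_parse = pd.to_datetime(ambiguous, dayfirst=True)

print(default_parse.date())   # 2020-01-02
print(dayfirst_parse.date())  # 2020-02-01
```

When the layout of the input is known in advance, passing an explicit format argument is usually a safer choice than relying on dayfirst or yearfirst.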
Example 4: Convert datetime to UTC (Coordinated Universal Time)
import pandas as pd
# create a Series with date strings
date_series = pd.Series(['2021-01-01 12:00:00', '2021-06-01 15:30:00', '2021-12-31 23:59:59'])
# convert string dates to UTC datetime objects using pd.to_datetime
converted_dates_utc = pd.to_datetime(date_series, utc=True)
print(converted_dates_utc)
Output
0 2021-01-01 12:00:00+00:00
1 2021-06-01 15:30:00+00:00
2 2021-12-31 23:59:59+00:00
dtype: datetime64[ns, UTC]
In this example, we used the pd.to_datetime() method to convert the string dates into datetime objects.
The utc parameter is set to True, which makes the result timezone-aware in UTC. Since these strings carry no timezone information, they are treated as UTC times.
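When the input strings already carry a UTC offset, utc=True shifts each value to UTC rather than just labelling it. A small sketch, using made-up input strings with offsets of +05:30 and -05:00:

```python
import pandas as pd

# the same wall-clock time written with two different UTC offsets
offset_series = pd.Series([
    '2021-01-01 12:00:00+05:30',
    '2021-01-01 12:00:00-05:00',
])

# utc=True converts each value to its equivalent UTC time
as_utc = pd.to_datetime(offset_series, utc=True)
print(as_utc)
```

The first value becomes 06:30 UTC and the second becomes 17:00 UTC, so values from different offsets end up on one comparable timeline.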
Example 5: Use of unit Argument in to_datetime()
The unit argument in the to_datetime() method specifies the time unit for epoch time conversions.
The common units include D (days), s (seconds), ms (milliseconds), us (microseconds), and ns (nanoseconds).
Let's look at an example.
import pandas as pd
# create a Series with epoch times (in seconds)
epoch_series = pd.Series([1609459200, 1612137600, 1614556800])
# convert the epoch times to datetime objects
# the 'unit' argument is set to 's' for seconds
converted_dates = pd.to_datetime(epoch_series, unit='s')
print(converted_dates)
Output
0 2021-01-01
1 2021-02-01
2 2021-03-01
dtype: datetime64[ns]
Here, epoch_series is the Series containing epoch times. These are Unix timestamps representing the number of seconds since January 1, 1970.
pd.to_datetime() is used with the unit argument set to s to indicate that the input numbers are in seconds.
The method converts these epoch times to standard datetime objects, which are then printed.
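The unit has to match how the numbers were recorded. As an illustration, the sketch below expresses the same instant (January 1, 2021) in seconds, milliseconds, and days since the epoch, and converts each with the matching unit. The variable names are illustrative only.

```python
import pandas as pd

# the same instant expressed in three different units
from_seconds = pd.to_datetime(1609459200, unit='s')
from_millis = pd.to_datetime(1609459200000, unit='ms')
from_days = pd.to_datetime(18628, unit='D')

print(from_seconds)  # 2021-01-01 00:00:00
print(from_millis)   # 2021-01-01 00:00:00
print(from_days)     # 2021-01-01 00:00:00
```

Passing the wrong unit does not raise an error in general. It silently yields a different date, so it is worth checking a known value after conversion.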
Example 6: to_datetime() With Custom Format
import pandas as pd
# create a dataframe with date strings in custom format
df = pd.DataFrame({'date': ['2021/22/01', '2022/13/01', '2023/30/03']})
# convert the 'date' column to datetime with custom format
df['date'] = pd.to_datetime(df['date'], format='%Y/%d/%m')
print(df)
Output
date
0 2021-01-22
1 2022-01-13
2 2023-03-30
In this example, we converted the date column from strings in YYYY/DD/MM format to the datetime data type.
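The format argument uses the same directives as Python's strftime, so it can also describe a time component. A minimal sketch with made-up day/month/year strings that include hours and minutes:

```python
import pandas as pd

# date strings in 'DD/MM/YYYY HH:MM' format
raw = pd.Series(['22/01/2021 14:30', '13/01/2022 09:05'])

# %d = day, %m = month, %Y = four-digit year, %H = hour, %M = minute
parsed = pd.to_datetime(raw, format='%d/%m/%Y %H:%M')
print(parsed)

# format the parsed values back into 'YYYY-MM-DD' strings
print(parsed.dt.strftime('%Y-%m-%d').tolist())  # ['2021-01-22', '2022-01-13']
```

Here .dt.strftime() goes in the opposite direction, turning datetime values back into strings in whatever layout is needed.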