Link Search Menu Expand Document (external link)

How to convert a text column into dates

Description

When loading data, many software systems make intelligent guesses about the format and data type of each column, but sometimes that is not sufficient. If you have a column of text that should be interpreted as dates, how can we ask the software to convert it?

Using pandas, in Python

View this solution alone.

Let’s create a small example DataFrame to use here (using the method from how to create a data frame from scratch). Naturally, you would apply this solution to your own data instead.

1
2
3
4
import pandas as pd
df = pd.DataFrame( { 'Date' :  [ '5/7/19', '5/10/19',   '5/11/19' ],
                     'Event' : [   'Work',   'Party', 'More work' ] } )
df
Date Event
0 5/7/19 Work
1 5/10/19 Party
2 5/11/19 More work

If you’ve already got the data in a DataFrame column, and you wish to convert it to dates, use the pd.to_datetime function, which will do its best to read whatever format your dates are in:

1
2
df['Date'] = pd.to_datetime( df['Date'] )
df
1
2
/tmp/ipykernel_4832/1311655736.py:1: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df['Date'] = pd.to_datetime( df['Date'] )
Date Event
0 2019-05-07 Work
1 2019-05-10 Party
2 2019-05-11 More work

But if they aren’t in a standard format, you can specify just about any format as in the following example. See the Python documentation for format details.

1
2
# If the dates had been, for example, 5-7-2019 10:15:00
df['Date'] = pd.to_datetime( df['Date'], format="%m-%d-%Y %H:%M:%S" )

It’s often easier to handle date conversions while reading the data. You can tell pandas to read dates in most of the common date formats using any of the following methods.

1
2
3
4
5
6
7
8
9
10
# Any columns that look like dates, treat as dates:
df = pd.read_csv( "example.csv", parse_dates=True )

# Convert the specific columns you name into dates:
df = pd.read_csv( "example.csv", parse_dates=['col1','col2'] )

# If the date is spread over multiple columns, do this:
# (Let's say the year, month, and day are in columns, 4, 5, and 6.)
df = pd.read_csv( "example.csv", parse_dates=[[4,5,6]] )
# Note the double brackets, and indices start counting at zero.

Content last modified on 24 July 2023.

See a problem? Tell us or edit the source.

Solution, in R

View this solution alone.

Let’s create a small example DataFrame to use here (using the method from how to create a data frame from scratch). Naturally, you would apply this solution to your own data instead.

1
2
3
4
5
df <- data.frame(
    Date  = c( '5/7/19', '5/10/19',   '5/11/19' ),
    Event = c(   'Work',   'Party', 'More work' )
)
df
1
2
3
4
  Date    Event    
1 5/7/19  Work     
2 5/10/19 Party    
3 5/11/19 More work

We use the as.Date() function to convert a text column into dates. If the input dates are not in the standard format, we can use the format= argument to change the format. Note the difference between %y and %Y: The %y code means a 2-digit year, but the %Y code means a 4-digit year.

1
2
df$Date = as.Date( df$Date, format='%m/%d/%y' )
df
1
2
3
4
  Date       Event    
1 2019-05-07 Work     
2 2019-05-10 Party    
3 2019-05-11 More work

It’s often easier to handle date conversions while reading the data file. You can use the read_csv() function in the readr package, which will automatically recognize dates in some common formats.

Additionaly, you can use the anytime() function in the anytime package to automatically parse strings as dates regardless of the format.

1
2
3
4
# Use anytime() to attempt to parse various formats:
library(anytime)
examples <- c( "Nov 01 2022", "2022-11-01", "22-11-01" )
anytime( examples )
1
[1] "2022-11-01 UTC" "2022-11-01 UTC" NA              

Note that it succeeded in two cases, but not the third.

Content last modified on 24 July 2023.

See a problem? Tell us or edit the source.

Topics that include this task

Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

  • Excel
  • Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.