Lesson 2

Working with Dates and Times

Lesson Introduction

Working with dates and times is crucial in data analysis. Imagine analyzing sales data over time to understand seasonal trends. To make sense of such data, you need to handle dates and times accurately.

Today's goals:

  1. Convert columns with date info to datetime format, even if they are in different formats.
  2. Extract specific components like the year from datetime data.
  3. Perform basic datetime operations such as finding time differences and obtaining today's date.

By the end, you'll be comfortable manipulating dates and times in Pandas. Let's start!

Converting Columns to Datetime

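Date info often comes as text, which isn't very useful for analysis: text dates sort alphabetically rather than chronologically, and they can't be subtracted from one another. Converting this text to datetime format lets us use powerful features in Pandas.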
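To see why text dates fall short, here is a minimal sketch. The two sample dates are made up for illustration, and the conversion uses pd.to_datetime(), which this lesson covers next.

```python
import pandas as pd

# Hypothetical order dates stored as text (month/day/year)
text_dates = pd.Series(['9/15/2023', '10/02/2023'])

# As text, sorting is alphabetical: '1' comes before '9',
# so the October date is listed before the September one
print(text_dates.sort_values().tolist())

# As datetimes, sorting is chronological
real_dates = pd.to_datetime(text_dates, format='%m/%d/%Y')
print(real_dates.sort_values().dt.strftime('%Y-%m-%d').tolist())
```

The first print lists the October date first, while the second lists the dates in calendar order.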

The pd.to_datetime() function converts different date formats correctly. Here's an example:

Python
    import pandas as pd

    # Sample data
    data = {
        'order_date': ['2023-10-01', '10/02/2023', 'October 3 2023', '2023.10.04']
    }
    sales = pd.DataFrame(data)

    # Convert 'order_date' to datetime
    sales['order_date'] = pd.to_datetime(sales['order_date'], format='mixed')

    print(sales)

Output:

      order_date
    0 2023-10-01
    1 2023-10-02
    2 2023-10-03
    3 2023-10-04

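This example converts various date formats into datetime objects, making date operations easier. Note that format='mixed' (available in pandas 2.0 and later) tells pandas to infer the format of each element individually. Without it, pandas 2.x infers a single format from the first value and raises an error for values that do not match it.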
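One caveat with inferred formats is that a string such as '10/02/2023' is ambiguous. As a minimal sketch (the sample values are illustrative, not from a real dataset), the following shows that pandas reads it month-first by default, and how dayfirst=True or an explicit format string changes the result:

```python
import pandas as pd

ambiguous = '10/02/2023'

# Default: month first, so this is October 2
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: day first, so this is February 10
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# Explicit format for a column known to use one layout (day/month/year here)
uniform = pd.Series(['01/10/2023', '02/10/2023'])
parsed = pd.to_datetime(uniform, format='%d/%m/%Y')

print(month_first.date())          # 2023-10-02
print(day_first.date())            # 2023-02-10
print(parsed.dt.month.tolist())    # [10, 10]
```

When every value in a column shares one layout, passing an explicit format is usually the safer choice, since nothing is left to guesswork.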

Extracting Components from Datetime

With a column in datetime format, we can extract components like the year, month, or day using the .dt accessor. Here’s how to extract the year, month, and day:

Python
    # Extract year, month, and day from datetime
    sales['year'] = sales['order_date'].dt.year
    sales['month'] = sales['order_date'].dt.month
    sales['day'] = sales['order_date'].dt.day

    print(sales)

Output:

      order_date  year  month  day
    0 2023-10-01  2023     10    1
    1 2023-10-02  2023     10    2
    2 2023-10-03  2023     10    3
    3 2023-10-04  2023     10    4

This code creates new columns for the year, month, and day, which can be useful for time-based analyses like finding monthly or seasonal trends.
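As a rough sketch of such a time-based analysis, the snippet below totals sales per month. The orders DataFrame and its 'amount' column are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical orders; the 'amount' values are made up for illustration
orders = pd.DataFrame({
    'order_date': pd.to_datetime(
        ['2023-09-28', '2023-10-01', '2023-10-15', '2023-11-03']
    ),
    'amount': [120.0, 80.0, 50.0, 200.0],
})

# Extract the month, then total the amounts for each month
orders['month'] = orders['order_date'].dt.month
monthly_totals = orders.groupby('month')['amount'].sum()

# day_name() is another .dt method; it returns the weekday as text
orders['weekday'] = orders['order_date'].dt.day_name()

print(monthly_totals)
print(orders[['order_date', 'weekday']])
```

Grouping on an extracted component keeps the original datetime column intact, so the same data can be regrouped later by year or weekday.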

Basic Datetime Operations

Pandas also allows for various datetime operations. For example, finding the time difference between two dates and obtaining today's date:

Python
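    from datetime import datetime

    # Calculate the time elapsed since each order
    sales['time_since_order'] = datetime.now() - sales['order_date']

    # pd.to_datetime('today') returns the current timestamp, time included
    today = pd.to_datetime('today')

    print(sales)
    # .date() drops the time part so only the calendar date is printed
    print("Today's date:", today.date())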

Output:

      order_date  year  month  day       time_since_order
    0 2023-10-01  2023     10    1 3 days 10:23:30.456789
    1 2023-10-02  2023     10    2 2 days 10:23:30.456789
    2 2023-10-03  2023     10    3 1 days 10:23:30.456789
    3 2023-10-04  2023     10    4 0 days 10:23:30.456789
    Today's date: 2023-10-04

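This code calculates the time difference between each order date and the current moment, and it also retrieves today's date. The result of the subtraction is a column of Timedelta values, so the exact output depends on when you run the code.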
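Because results based on the current time change from run to run, a fixed reference date can be easier to reason about. The following is a minimal sketch under that assumption. The reference date and the 30-day offset are arbitrary choices for illustration.

```python
import pandas as pd

order_dates = pd.Series(pd.to_datetime(['2023-10-01', '2023-10-04']))

# A fixed reference date keeps the result the same on every run
reference = pd.Timestamp('2023-10-05')

# Subtracting datetimes yields Timedelta values
elapsed = reference - order_dates

# .dt.days extracts the whole-day part as integers
days_elapsed = elapsed.dt.days

# Adding a Timedelta shifts each date forward
due_dates = order_dates + pd.Timedelta(days=30)

print(days_elapsed.tolist())                        # [4, 1]
print(due_dates.dt.strftime('%Y-%m-%d').tolist())   # ['2023-10-31', '2023-11-03']
```

Integer day counts are often more convenient than raw Timedelta values when filtering or aggregating.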

Lesson Summary

Today, we learned how to:

  • Convert date columns to datetime format using pd.to_datetime(), even when the values use multiple formats.
  • Extract components like the year, month, and day using the .dt accessor.
  • Perform basic datetime operations such as finding time differences and obtaining today's date.

Understanding datetime manipulation is essential for efficient data analysis, enabling easy time-based computations.

Now it's time to apply your new skills. In the practice session, you’ll convert columns, extract date components, and explore more datetime features. Dive into the hands-on practice to reinforce today's knowledge!
