Lesson 2

Welcome aboard to another insightful flight of analysis! Today, we will venture across the clouded skies of data visualization using line plots. We aim to transform numerical data from our `Seaborn Flights` dataset into these plots that can guide us through time and trends.

Now, you might wonder why visualization is needed when we already explored the `flights` dataset in our prior lessons. Through visualization, we can unearth underlying patterns, summarize massive volumes of data, track changes over time, and compare variables. This ability to visualize data is a widely sought-after skill in diverse fields, including data analytics, business intelligence, and data science.

Observing the number of passengers traveling each month over the years yields crucial insights: Is there a season that attracts more travelers? How has the number of passengers evolved over the years? To answer these intriguing questions, let's board this data visualization express!

`Matplotlib`, a multi-platform data visualization library built on NumPy arrays, offers a wide range of graphical displays. It is designed for creating professional, high-quality graphics by fine-tuning every imaginable element of a graph. Here, we primarily use the `pyplot` module for 2D plotting with `Matplotlib`.

Enough with the chit-chat! Let's get our hands dirty with some visualization.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
flights_df = sns.load_dataset('flights')
```

This simple block of code imports `Matplotlib`'s `pyplot` module and the `Seaborn` library, then loads the 'flights' dataset from Seaborn's readily available collection of datasets using the `load_dataset()` function. Once loaded, the data is available as a DataFrame, which we will use for our analysis.
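Before plotting, it helps to know what the loaded DataFrame looks like. The sketch below builds a small stand-in with the same three column names as the flights data (`year`, `month`, `passengers`) so that it runs without downloading anything. The name `sample_df` and all of its values are made up for illustration; with the real data you would call the same inspection methods on `flights_df`.

```python
import pandas as pd

# Stand-in for the flights data: same column names, made-up values.
months = ["Jan", "Feb", "Mar"]
rows = [
    {"year": year, "month": month, "passengers": 100 + 10 * i + j}
    for i, year in enumerate([1949, 1950])
    for j, month in enumerate(months)
]
sample_df = pd.DataFrame(rows)

# Store the months as an ordered categorical so they keep calendar order.
sample_df["month"] = pd.Categorical(sample_df["month"], categories=months, ordered=True)

print(sample_df.head())   # first rows: one row per (year, month) pair
print(sample_df.shape)    # (number of rows, number of columns)
print(sample_df.dtypes)   # data type of each column
```

Each row holds one month of one year. This "long" layout is what we reshape before plotting.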

The `Flights` dataset provides the number of passengers for each month from 1949 to 1960. To visualize overall trends, we can render the passenger count for each month into line plots. `Line plots` enable us to observe trends over the twelve-year timeline.

Let's unravel the first plot:

```python
# Pivot the DataFrame to get the month as the index
flights_df_pivot = flights_df.pivot(index="month", columns="year", values="passengers")

# Plot the passenger count for each month over each year
flights_df_pivot.plot(title='Passenger Counts (1949 - 1960)')
plt.ylabel("Passenger Count")
plt.show()
```

In this block of code, we use the `pivot()` function to rearrange the original DataFrame so that we can easily compare passenger counts for every month across all the years. After this operation, the months form the index, each column represents a year, and each cell holds the passenger count for that month in that year. We then plot this rearranged data as a line plot.
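To see what `pivot()` does to the shape of the data, here is a minimal sketch on a four-row table. The names `long_df` and `wide_df` and the passenger values are made up for illustration. Only the column names match the flights data.

```python
import pandas as pd

# Long format: one row per (year, month) pair. Values are made up.
long_df = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [10, 20, 30, 40],
})

# An ordered categorical keeps the months in calendar order, not alphabetical order.
long_df["month"] = pd.Categorical(long_df["month"], categories=["Jan", "Feb"], ordered=True)

# Wide format: months become the index, years become the columns.
wide_df = long_df.pivot(index="month", columns="year", values="passengers")
print(wide_df)

# A single cell is addressed by (month, year).
print(wide_df.loc["Jan", 1950])
```

The four long rows become a 2 x 2 grid. When this wide table is plotted, each column (year) is drawn as its own line.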

The resulting plot shows one line per year, with the months on the x-axis and the passenger counts on the y-axis. Each point on a line corresponds to the passenger count for a particular month of that year.
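The month-by-month view is good for spotting seasonality. For the other question from the introduction, how passenger numbers evolved over the years, one possible complementary view is a single line of yearly totals. The sketch below is an illustration only: it uses a made-up stand-in table (`sample_df`) with the same column names as the flights data, and summing by year is just one way to summarize it.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in data with made-up values that grow from year to year.
months = ["Jan", "Feb", "Mar"]
years = [1949, 1950, 1951]
rows = [
    {"year": year, "month": month, "passengers": 100 + 20 * i + 5 * j}
    for i, year in enumerate(years)
    for j, month in enumerate(months)
]
sample_df = pd.DataFrame(rows)

# Add up the passengers of all months within each year.
yearly_totals = sample_df.groupby("year")["passengers"].sum()

# Plotting a Series draws one line: years on the x-axis, totals on the y-axis.
ax = yearly_totals.plot(title="Total Passengers per Year (sample data)")
ax.set_ylabel("Passenger Count")
```

Call `plt.show()` afterwards to display the figure. With the real data, the same `groupby` call on `flights_df` would give one total per year from 1949 to 1960.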

You can also take the fancy route and modify the plot's features. Let's meddle with some parameters:

- `figsize`: Adjusts the size of the plot.
- `grid`: Sets the grid display.
- `linestyle`: Changes the style of the line.

```python
flights_df_pivot.plot(grid=True, figsize=(10,5), linestyle='--')
plt.title("Passenger Counts (1949 - 1960)")
plt.ylabel("Passenger Count")
plt.show()
```

In this block of code, we use the `grid`, `figsize`, and `linestyle` parameters of the `plot()` function to customize our plot. Setting `grid` to `True` adds grid lines to the graph, making it easier to trace the trends at a glance. The `figsize` parameter sets the width and height of the figure in inches, while the `linestyle` parameter changes the style of the lines to dashed (`--`).
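The same customization can also be written against the `Axes` object that `plot()` returns, which keeps every setting tied to one specific plot. The sketch below is one possible variation, not the lesson's own method. The table `wide_df` holds made-up values in the same months-by-years layout, and the `marker` argument and legend title are additions chosen for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in wide table: months as the index, years as the columns. Values are made up.
wide_df = pd.DataFrame(
    {1949: [10, 12, 14], 1950: [15, 18, 21]},
    index=["Jan", "Feb", "Mar"],
)

# plot() returns a Matplotlib Axes, so further tweaks can be made on it directly.
ax = wide_df.plot(grid=True, figsize=(10, 5), linestyle="--", marker="o")
ax.set_title("Passenger Counts (sample data)")
ax.set_xlabel("Month")
ax.set_ylabel("Passenger Count")
ax.legend(title="Year")
```

Call `plt.show()` afterwards to display the figure. Other common `linestyle` values are `'-'` (solid), `':'` (dotted), and `'-.'` (dash-dot).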

This results in a clearer, more readable, and more engaging plot.

Congratulations on your first successful graph plotting ride! You've now grasped the basic concepts of line plotting with `Matplotlib` and seen how it supports data visualization. You explored trends and patterns in the passenger data and learned how to customize your line plots.

From this, you'll likely agree that visualization is an efficient way to understand and analyze complex data. Visualizing the data gives us insights and helps us communicate our findings effectively.

We're all set for the practice exercises to come. These exercises will provide hands-on exposure to `line plotting` and allow you to apply the skills you have just learned. Practice is integral to bolstering your understanding and preparing you for more intricate data visualization tasks. Ready to take off into the horizon of visualization skills?