Lesson 6

Welcome! Today, we'll explore **scatter plots** and their creation with **Seaborn**, a Python library built on Matplotlib. We will master the construction, customization, and interpretation of scatter plots. Let's get started!

A scatter plot is a data visualization that represents two variables from a dataset as points on a Cartesian plane, with one variable on each axis. Scatter plots are commonly used to explore correlations between variables.

Meet Seaborn, a Python library designed to create beautiful statistical graphics. It facilitates quick and easy creation of colorful and informative visuals from complex datasets.

We use the `scatterplot()` function to create a scatter plot in Seaborn. We pass it our data and the names of the columns that hold the x and y values. Let's illustrate this with a small dataset, created as follows:

```python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Create a dataset
df = pd.DataFrame({
    "hours": [1, 8, 2, 6, 6, 4, 4, 9, 8, 10],
    "scores": [30, 70, 35, 90, 95, 70, 50, 100, 85, 97]
})
```

This dummy dataset maps the number of study hours for ten students to their respective test scores.

Now, let's plot this data using `scatterplot()`.

```python
# Create scatter plot
sns.scatterplot(x='hours', y='scores', data=df)
plt.show()
```

This scatter plot shows a clear positive correlation: as the number of study hours increases, test scores tend to increase as well. **Note** that we use the `plt.show()` function from `matplotlib` to display `seaborn`'s plots.

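Seaborn plots can be customized extensively. Because Seaborn is built on Matplotlib, we can use Matplotlib functions such as `plt.title()`, `plt.xlabel()`, and `plt.ylabel()` to add a title and axis labels and make our plot more understandable.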

```python
# Customize scatter plot
sns.scatterplot(x='hours', y='scores', data=df)
plt.title('Study Hours vs. Test Scores')
plt.xlabel('Hours Studied')
plt.ylabel('Test Scores')
plt.show()
```

Now, complete with a title and labels, our plot is much more straightforward and informative.
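As an optional variation, the same title and labels can be set on the Matplotlib `Axes` object that `scatterplot()` returns. This is only a sketch of an alternative style, not something the lesson requires; the variable name `ax` is our own choice, and the dataset is repeated so the snippet runs on its own.

```python
import pandas as pd
import seaborn as sns

# Same study-hours dataset as in the lesson, repeated so this runs standalone
df = pd.DataFrame({
    "hours": [1, 8, 2, 6, 6, 4, 4, 9, 8, 10],
    "scores": [30, 70, 35, 90, 95, 70, 50, 100, 85, 97]
})

# scatterplot() returns the Matplotlib Axes it drew on
ax = sns.scatterplot(x="hours", y="scores", data=df)

# Set the title and both axis labels in a single call on that Axes
ax.set(
    title="Study Hours vs. Test Scores",
    xlabel="Hours Studied",
    ylabel="Test Scores",
)
```

Calling `plt.show()` afterwards (with `matplotlib.pyplot` imported as `plt`) displays the figure just as before. Working with the `Axes` object can be handy once a figure contains more than one plot, since each call then targets a specific set of axes.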

Scatter plots visualize relationships between variables. In our plot, the test scores improve as the number of hours studied increases, indicating a positive correlation. Scatter plots are instrumental in revealing patterns in raw data, making them extremely helpful visualization tools.
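To back up the visual impression with a number, one option is to compute the Pearson correlation coefficient with pandas. The sketch below assumes the same dataset as above; the variable name `r` is ours. A value close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.

```python
import pandas as pd

# Same study-hours dataset as in the lesson, repeated so this runs standalone
df = pd.DataFrame({
    "hours": [1, 8, 2, 6, 6, 4, 4, 9, 8, 10],
    "scores": [30, 70, 35, 90, 95, 70, 50, 100, 85, 97]
})

# Series.corr() computes the Pearson correlation coefficient by default
r = df["hours"].corr(df["scores"])
print(f"Correlation between hours and scores: {r:.2f}")
```

Keep in mind that the coefficient only measures linear association. The scatter plot remains the better tool for spotting curves, clusters, or outliers that a single number can hide.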

Bravo! We've navigated the basics of scatter plots and explored how to use Seaborn for *plotting*, *customization*, and *interpretation*. Enjoy the practice exercises designed to cement your recently acquired skills further. We're excited to embark on this journey into practicing scatter plots! Ready, set, plot!