Lesson 3

Exploring Data Visualization: Building Bar Plots and Histograms in Python

Introduction and Overview

Welcome to this lesson on bar plots and histograms in Python! We will construct both kinds of plot using Matplotlib and look at when each one is the right choice. Are you ready? Let's begin!

Building Bar Plots with Matplotlib

A bar plot visually represents categorical data as rectangular bars, the lengths of which are proportional to their respective values. For instance, a bar plot would be the ideal choice if we wanted to visualize a bookstore's sales data, where the categories are book names and the values are sales numbers.

We can build a bar plot using the plt.bar function, which takes two sequences of the same length: the category names and the value for each category.

Python
import matplotlib.pyplot as plt

books = ['Book1', 'Book2', 'Book3', 'Book4', 'Book5']  # Book names
sales = [123, 432, 567, 245, 312]  # Corresponding number of copies sold

plt.bar(books, sales)  # Create bar plot
plt.title('Book Sales')
plt.xlabel('Books')
plt.ylabel('Number of Sold Copies')
plt.show()

The resulting plot shows five vertical bars, one per book, with each bar's height equal to that book's number of sold copies.
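In practice, category data often arrives as a single mapping rather than as two separate lists. The sketch below shows one way to split such a mapping into the two equal-length sequences that plt.bar expects. The names split_categories, plot_sales, and sales_by_book are hypothetical and introduced only for illustration; the sales numbers are the ones from the example above.

```python
def split_categories(sales_by_book):
    """Split a {name: value} mapping into two equal-length lists."""
    names = list(sales_by_book.keys())
    values = list(sales_by_book.values())
    return names, values


def plot_sales(sales_by_book):
    """Draw the bar plot; Matplotlib is imported only when this is called."""
    import matplotlib.pyplot as plt

    names, values = split_categories(sales_by_book)
    plt.bar(names, values)
    plt.title('Book Sales')
    plt.xlabel('Books')
    plt.ylabel('Number of Sold Copies')
    plt.show()


# Hypothetical mapping holding the same data as the two lists above
sales_by_book = {'Book1': 123, 'Book2': 432, 'Book3': 567, 'Book4': 245, 'Book5': 312}
names, values = split_categories(sales_by_book)
print(names)   # category names, in insertion order
print(values)  # one value per category
```

Calling plot_sales(sales_by_book) should then draw the same chart as before. Because both lists come from the same dictionary, they always have matching lengths and matching order.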

Building Histograms with Matplotlib: Dataset

Now, let's move on to histograms! Unlike bar plots, histograms are designed for visualizing distributions of continuous, numeric data. In a histogram, bars represent the frequency of data points falling under specific ranges or bins. Let's say we have age data for a city's population for this example.

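Python
import numpy as np

# Generate 150 data points from a normal distribution with mean 27 and standard deviation 12
ages = np.random.normal(loc=27, scale=12, size=150)

# Define 6 bins by their edges. Each bin is left-inclusive and right-exclusive,
# except the last one, which also includes its right edge.
# Bin 1: [0, 10), Bin 2: [10, 20), ..., Bin 6: [50, 60]
# Samples outside 0 to 60 (such as negative draws) fall in no bin and are not counted.
bins = [0, 10, 20, 30, 40, 50, 60]

Building Histograms with Matplotlib: Plot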

We'll use this data to create a histogram that visualizes the age distribution.

Python
import matplotlib.pyplot as plt  # Importing Matplotlib library
import numpy as np

ages = np.random.normal(loc=27, scale=12, size=150)
bins = [0, 10, 20, 30, 40, 50, 60]

plt.hist(ages, bins, edgecolor='black')  # Create histogram
plt.title('Ages in City X')
plt.xlabel('Ages')
plt.ylabel('Number of People')
plt.show()

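The resulting plot shows six bars, one per age bin. Because the data is randomly generated, the exact heights differ between runs, but the tallest bars typically fall in the bins around the mean of 27.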
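The counting behind each bar can be made concrete with a few lines of plain Python. This is a minimal sketch, not how Matplotlib is implemented: count_per_bin and sample_ages are hypothetical names introduced for illustration. It assumes the convention used by NumPy and Matplotlib, where each bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```python
def count_per_bin(values, edges):
    """Count how many values fall into each bin defined by consecutive edges.

    Every bin is [left, right), except the last one, which is [left, right].
    """
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for value in values:
        for i in range(len(counts)):
            left, right = edges[i], edges[i + 1]
            in_bin = left <= value < right or (i == last and value == right)
            if in_bin:
                counts[i] += 1
                break
    return counts


bins = [0, 10, 20, 30, 40, 50, 60]
sample_ages = [0, 5, 10, 59, 60, 61, -1]  # made-up values chosen to hit the bin edges
print(count_per_bin(sample_ages, bins))  # [2, 1, 0, 0, 0, 2]
```

Note that 61 and -1 are not counted at all: values outside the outermost edges fall in no bin. The same can happen to some draws from np.random.normal, so the bar heights in the histogram may add up to slightly fewer than 150.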

Distinguishing Between Bar Plots and Histograms

While they may possess visual similarities, bar plots and histograms offer distinct data views. Bar plots excel when displaying categorical data, whereas histograms provide insights into numerical data distributions.
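For contrast with binning numeric values, here is a small sketch of how bar heights arise for categorical data: each distinct label gets its own bar, and there are no ranges involved. The genre labels are made-up example data, and tallying with collections.Counter is just one possible approach.

```python
from collections import Counter

# Hypothetical categorical data: one genre label per sold book
genres = ['Fiction', 'Mystery', 'Fiction', 'Poetry', 'Mystery', 'Fiction']

# Tally the labels; each distinct label becomes one bar
counts = Counter(genres)
categories = list(counts.keys())
heights = list(counts.values())

print(categories)  # ['Fiction', 'Mystery', 'Poetry']
print(heights)     # [3, 2, 1]
```

These two lists could be passed straight to plt.bar. The order of the bars carries no numeric meaning here, whereas the bins of a histogram always follow the number line.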

Lesson Summary and Practice Announcement

Great job navigating through the basics of making sense of data using bar plots and histograms! Now, prepare for the practical exercises designed to give you hands-on experience. Let's get to work and practice these newfound skills! Remember, practice enhances understanding! Happy learning!
