Welcome to our class on line plots! Line plots illustrate how a variable changes over time by connecting points on a graph, much like forming a constellation by connecting stars. Our objective today is to demonstrate how to create meaningful patterns from data using the R programming language. By the end of this lesson, you will be able to tell stories with line plots, prepare the data they need, and customize your plots with ggplot2, a powerful plotting library in R.
A line plot is like tracking your progress in a video game over a week. Each day, you mark down your score and connect the scores with a line, thereby creating a plot. The resulting plot shows the rise and fall of your scores, a depiction of your journey through the game's levels.
Attributes of a well-crafted line plot include:
- An X-axis (a horizontal line) representing your timeline.
- A Y-axis (a vertical line) that indicates what you're measuring, such as your game score.
- Data points that register each day's score.
- Lines that connect these data points, forming a visual 'route' throughout your gaming week.
Line plots excel at revealing trends at a glance and are useful in many settings, from finance to fitness tracking.
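The four attributes listed above can be seen in a small sketch. The scores are made-up values for illustration, and the names game_scores, day, and score are assumptions rather than part of the lesson's data. The sketch uses ggplot2, which this lesson introduces in more detail further on.

```R
library(ggplot2)

# Hypothetical scores for one week of play (illustrative values only)
game_scores <- data.frame(
  day   = 1:7,
  score = c(120, 150, 140, 180, 210, 200, 250)
)

# x-axis: the timeline; y-axis: the quantity being measured
score_plot <- ggplot(game_scores, aes(x = day, y = score)) +
  geom_point() +  # one data point per day
  geom_line()     # lines connecting the points in day order

# Draw the plot
print(score_plot)
```

Drawing both layers makes each day's score visible as a point while the line shows the overall route through the week.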
Before plotting, we need to have our data ready, much like gathering ingredients before baking cookies. We need matched pairs: a time and the value of the characteristic we're tracking at that time (similar to pairing each cookie type with its quantity). To plot the growth of a garden plant, we would:
- Record the plant heights at regular intervals.
- Verify the absence of errors, such as missing dates or unrealistic growth spurts.
- Ensure the dates and heights are correctly ordered, similar to arranging your cookies before baking.
Organized data leads to insightful plots and an accurate reflection of your story.
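The preparation steps above could be sketched in base R as follows. This is only one possible approach: the raw data frame, its column names, and the 2-inch threshold for an "unrealistic" jump are all hypothetical choices made for illustration.

```R
# Hypothetical raw measurements: out of order, with one missing height
raw <- data.frame(
  day       = c(3, 1, 2, 5, 4),
  height_in = c(6.0, 5.0, 5.5, NA, 6.5)
)

# 1. Find rows with missing values
missing_rows <- raw[!complete.cases(raw), ]

# 2. Drop incomplete rows, then sort the rest by day
clean <- raw[complete.cases(raw), ]
clean <- clean[order(clean$day), ]

# 3. Flag implausible jumps between consecutive measurements
#    (the 2-inch threshold is an arbitrary, illustrative choice)
growth <- diff(clean$height_in)
suspicious <- which(abs(growth) > 2)

print(missing_rows)  # rows that need attention
print(clean)         # ordered, complete data ready for plotting
print(suspicious)    # positions of any suspicious jumps
```

Whether to drop a missing measurement or go back and re-measure depends on the data; the sketch simply drops it so that the remaining rows can be plotted.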
ggplot2 is our toolkit for creating plots using R. Typically, we provide ggplot2 with a dataframe and map the variables to plot using the aes() function. Then, we layer other functions such as geom_line() on top to specify the type of plot we want. Now, let's plot the heights of our garden plant over two weeks:
```R
# Load the ggplot2 library
library(ggplot2)

# Our garden plant's height data
days <- c(1:14)  # From Day 1 to Day 14
plant_heights <- c(5, 5.5, 6, 6.5, 7, 7.2, 7.5, 8, 8.3, 8.6, 9, 9.2, 9.5, 10)  # Heights in inches

# Construct a data frame
df <- data.frame(days, plant_heights)

# Plotting the data
plot <- ggplot(df, aes(x=days, y=plant_heights)) +
  geom_line()
```
As you can see, in the aes() function we assign x to our x-axis data and y to our y-axis data.
We store the plot in the plot variable. This is not necessary in R, but it is necessary in the CodeSignal environment!
Upon executing this code, we observe a graph depicting our plant's growth — a line ascending over two weeks.
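Outside an environment that renders the stored plot for you, a ggplot object assigned to a variable is not drawn until it is printed. Here is a minimal sketch of printing and saving the plot. It rebuilds the lesson's df so the snippet runs on its own, and the output file name is a placeholder, not something the lesson specifies.

```R
library(ggplot2)

# Rebuild the lesson's data so this snippet is self-contained
days <- 1:14
plant_heights <- c(5, 5.5, 6, 6.5, 7, 7.2, 7.5, 8, 8.3, 8.6, 9, 9.2, 9.5, 10)
df <- data.frame(days, plant_heights)

plot <- ggplot(df, aes(x = days, y = plant_heights)) +
  geom_line()

# Draw the plot on the active graphics device
print(plot)

# Optionally save it to a file (placeholder name, written to a temp directory)
out_file <- file.path(tempdir(), "plant_growth.png")
ggsave(out_file, plot = plot, width = 6, height = 4)
```

The width and height passed to ggsave() are in inches by default; the values here are arbitrary.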
Just as you would pick a frame for a picture, a plot needs customization to enhance its storytelling. Adding a title, labels, and style can transform raw data into an understandable and visually appealing narrative.
Now, let's label our plant growth plot to indicate days and heights:
```R
plot <- ggplot(df, aes(x=days, y=plant_heights)) +
  geom_line() +
  labs(title='Garden Plant Growth Over Two Weeks',
       x='Day',
       y='Height (inches)')
```
Adding these customizations guides viewers through the data story, much like chapter titles guide a reader through a book.
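Style can be layered on in the same way as the title and axis labels. The sketch below shows one possible styling rather than the lesson's own: the line color, the point size, and the use of theme_minimal() are illustrative choices, and the data is rebuilt so the snippet runs on its own.

```R
library(ggplot2)

# Rebuild the lesson's data so this snippet is self-contained
days <- 1:14
plant_heights <- c(5, 5.5, 6, 6.5, 7, 7.2, 7.5, 8, 8.3, 8.6, 9, 9.2, 9.5, 10)
df <- data.frame(days, plant_heights)

styled_plot <- ggplot(df, aes(x = days, y = plant_heights)) +
  geom_line(color = "darkgreen") +  # colored line (illustrative choice)
  geom_point(size = 2) +            # mark each day's measurement
  labs(title = "Garden Plant Growth Over Two Weeks",
       x = "Day",
       y = "Height (inches)") +
  theme_minimal()                   # lighter background and grid

# Draw the styled plot
print(styled_plot)
```

Because each customization is a separate layer joined with +, any of them can be removed or swapped without touching the rest of the plot.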
Fantastic! You've delved into the essentials of line plotting in R. You are now adept at preparing the correct data, building a basic line plot using ggplot2, and customizing it to tell the whole story. Remember that, as with painting or writing, the art of plotting improves with practice.
As you proceed to the practice exercises, visualize your data's story — a narrative interwoven with points and lines, which is poised to enlighten and inform. We wish you happy plotting on your journey towards becoming a master data storyteller!