Lesson 3
Mutating and Arranging Data

Welcome back! Up until now, we've covered some essential techniques for managing your datasets with the dplyr package in R. We've learned how to select specific columns, filter rows based on conditions, and summarize and group data. Now, it's time to take your data manipulation skills to the next level by learning how to mutate (or transform) data and arrange it in a specific order.

What You'll Learn

In this lesson, you'll explore two critical functionalities:

  1. Mutating Data: Adding or transforming columns in your data frame using the mutate function.
  2. Arranging Data: Sorting your data in a specific order using the arrange function.

We'll use straightforward examples to make these concepts easy to grasp. Let’s dive into each of these functionalities.

Example Data Frame

First, we'll set up an example data frame that we'll use throughout this lesson:

R
# Example data frame
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 78, 92)
)

# Print the example data frame
print(data)

# Output:
#      Name Score
# 1   Alice    85
# 2     Bob    95
# 3 Charlie    78
# 4   David    92

This data frame contains the names of four individuals along with their respective scores.

Adding a New Column with mutate

The mutate function allows us to add new columns or transform existing ones. For instance, suppose we want to add a new column, ScorePlus10, which is each person's score incremented by 10.

R
# Load dplyr, which provides mutate
library(dplyr)

# Example data frame
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 78, 92)
)

# Add a new column
mutated_data <- mutate(data, ScorePlus10 = Score + 10)

# Print the mutated data
print(mutated_data)

# Output:
#      Name Score ScorePlus10
# 1   Alice    85          95
# 2     Bob    95         105
# 3 Charlie    78          88
# 4   David    92         102

Here, mutate adds a new column called ScorePlus10 to the data frame, where each entry is the original Score plus 10.
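The same call can also transform an existing column: when the name on the left of = already exists in the data frame, mutate replaces that column instead of adding one. The following is a minimal sketch, assuming dplyr is installed; the rescaling of Score to a fraction and the name rescaled_data are illustrative choices, not part of the lesson's examples.

```r
library(dplyr)

data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 78, 92)
)

# Reusing the existing column name replaces Score rather than adding a column
# (rescaled_data is a hypothetical name used only for this sketch)
rescaled_data <- mutate(data, Score = Score / 100)

print(rescaled_data)
```

The result still has two columns, and data itself is unchanged, because mutate returns a new data frame rather than modifying its input.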

Arranging Data with arrange

The arrange function helps us sort the data in a specific order. For example, to sort the data by Score in descending order, we can do the following:

R
# Load dplyr, which provides mutate, arrange, and desc
library(dplyr)

# Example data frame
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 78, 92)
)

# Add a new column
mutated_data <- mutate(data, ScorePlus10 = Score + 10)

# Arrange data by Score in descending order
arranged_data <- arrange(mutated_data, desc(Score))

# Print the descending arranged data
print(arranged_data)

# Output:
#      Name Score ScorePlus10
# 1     Bob    95         105
# 2   David    92         102
# 3   Alice    85          95
# 4 Charlie    78          88

In this code snippet, arrange sorts the mutated_data data frame by the Score column in descending order.

To sort the data by Score in ascending order, we can do the following:

R
# Load dplyr, which provides mutate and arrange
library(dplyr)

# Example data frame
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 78, 92)
)

# Add a new column
mutated_data <- mutate(data, ScorePlus10 = Score + 10)

# Arrange data by Score in ascending order
arranged_data_asc <- arrange(mutated_data, Score)

# Print the ascending arranged data
print(arranged_data_asc)

# Output:
#      Name Score ScorePlus10
# 1 Charlie    78          88
# 2   Alice    85          95
# 3   David    92         102
# 4     Bob    95         105

Here, arrange sorts the mutated_data data frame by the Score column in ascending order. No helper function is needed, because ascending order is the default behavior of arrange; desc is only required to reverse it.
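arrange also accepts more than one sorting column, with later columns used to break ties in earlier ones. The sketch below assumes dplyr is installed and uses a hypothetical data frame, scores, in which two people share the same score; it is not the lesson's dataset.

```r
library(dplyr)

# Hypothetical data with a tied score (Alice and Charlie), to show tie-breaking
scores <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Score = c(85, 95, 85, 92)
)

# Sort by Score in descending order; break ties by Name in ascending order
ranked <- arrange(scores, desc(Score), Name)

print(ranked)
```

Bob and David come first because of their higher scores, and Alice is placed before Charlie because their scores tie and Name is compared next.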

Why It Matters

Mutating and arranging data are foundational skills in data wrangling.

  • Mutating Data: This technique allows you to create new variables or transform existing ones based on your needs. It's useful for tasks such as feature engineering in machine learning, where you may need to create new features from raw data.

  • Arranging Data: Sorting your data helps you see patterns more clearly and make your datasets more readable. For example, arranging sales data from the highest to the lowest can help you immediately spot your top-performing products.
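The two points above can be combined in a short sketch: derive a new variable with mutate, then sort by it with arrange. The sales data frame, its column names, and its values are made up for illustration, and dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical sales data (made-up values for illustration only)
sales <- data.frame(
  Product = c("Widget", "Gadget", "Gizmo"),
  Units = c(120, 45, 300),
  UnitPrice = c(2.5, 20, 1.5)
)

# Derive a new feature from the raw columns
with_revenue <- mutate(sales, Revenue = Units * UnitPrice)

# Sort so the highest-revenue product comes first
top_products <- arrange(with_revenue, desc(Revenue))

print(top_products)
```

With these values, Gadget ends up in the first row even though it sold the fewest units, which is the kind of pattern that sorting on a derived column makes easy to spot.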

By mastering these functions, you'll be better equipped to prepare your data for analysis and reporting, ensuring you draw more meaningful insights from your datasets.

Excited to start mutating and arranging data? Let's jump into the practice section and get hands-on with these powerful techniques.
