Great to see you back! You've previously learned how to create and manipulate data frames and perform essential calculations. This next step, data aggregation, will take your skills even further. You'll learn how to group data and perform operations on these groups, which is critical for summarizing and understanding large datasets.
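In this lesson, you'll discover how to sort values into groups with cut() and calculate a summary statistic for each group with aggregate().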
In the practice section, we'll also learn how to count the number of people in each group!
Here's a sneak peek of what we'll be working on:
```r
# Create a data frame
df <- data.frame(
  ID = 1:10,
  Name = c("John", "Jane", "Alex", "Emily", "David", "Eva", "Liam", "Noah", "Sophia", "Mason"),
  Age = c(28, 22, 35, 29, 40, 25, 34, 37, 28, 31),
  Salary = c(50000, 60000, 70000, 80000, 90000, 55000, 72000, 78000, 59000, 65000)
)

# Add an AgeGroup column
df$AgeGroup <- cut(df$Age, breaks = c(20, 30, 40, 50), right = FALSE)

# Aggregate data: calculate average salary by age group
avg_salary_by_age_group <- aggregate(Salary ~ AgeGroup, data = df, FUN = mean)
print(avg_salary_by_age_group)
```
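The grouping step depends on how cut() treats values that fall exactly on a break point. Here is a minimal sketch of that behavior, using a handful of hypothetical ages (not the lesson's data) picked to sit on and around the breaks:

```r
# Hypothetical sample ages chosen to sit on and around the break points
ages <- c(22, 29, 30, 39, 40)

# right = FALSE makes each interval closed on the left and open on the right,
# so 30 belongs to [30,40) rather than [20,30)
groups <- cut(ages, breaks = c(20, 30, 40, 50), right = FALSE)

# Show each age next to the group it was assigned to
print(data.frame(Age = ages, AgeGroup = groups))
```

With right = FALSE, an age of exactly 30 lands in [30,40) and an age of exactly 40 lands in [40,50). With the default right = TRUE, those boundary ages would instead fall into the lower interval.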
By the end of this lesson, you'll have a strong grasp of how to summarize and interpret your data using aggregation techniques. More examples and exercises will follow in the practice section to reinforce your learning.
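As one such example, counting the number of people in each age group can be sketched in two ways. This is only an illustration, not necessarily the approach the practice section takes. It rebuilds the ID and Age columns from the lesson's data frame so that it runs on its own, and the column name Count is chosen here purely for readability.

```r
# Rebuild the lesson's data frame (only the columns needed for counting)
df <- data.frame(
  ID = 1:10,
  Age = c(28, 22, 35, 29, 40, 25, 34, 37, 28, 31)
)
df$AgeGroup <- cut(df$Age, breaks = c(20, 30, 40, 50), right = FALSE)

# Option 1: table() tallies how many rows fall into each factor level
counts_table <- table(df$AgeGroup)
print(counts_table)

# Option 2: aggregate() with FUN = length counts the IDs in each group
counts_df <- aggregate(ID ~ AgeGroup, data = df, FUN = length)
names(counts_df)[2] <- "Count"  # "Count" is an illustrative column name
print(counts_df)
```

Both options report 5 people in [20,30), 4 in [30,40), and 1 in [40,50). table() is the shorter route for a quick tally, while aggregate() returns a data frame that is easier to combine with other summaries.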
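Data aggregation is crucial in data analysis because it condenses a large dataset into group-level summaries, such as the average salary for each age group, that are easier to interpret and compare.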
With these skills, you'll be able to handle more complex data analysis tasks and present well-structured insights. Excited to dive deeper? Let's proceed to the practice section together!