Lesson 5
Applying SQL Aggregate Functions to Soccer Data
Quick Recap

Great job on making it this far! Thus far, we've covered a great deal, from drilling into COUNT and DISTINCT to exploring SUM and GROUP BY. These are some of the key SQL functions required to dig deep into any dataset. In this unit, we're going to broaden our repertoire by applying these aggregate functions to analyze data related to Lionel Messi's career achievements.

As you may recall from our previous lessons, aggregate functions allow us to perform calculations on a set of values to return a single scalar value. We've already seen the COUNT and SUM functions in action, but have you ever wondered if we could derive other useful insights, such as averages? That’s where the SQL AVG function comes into play.

SUM and AVG Functions

At this juncture, the SUM function must seem pretty familiar to you. It does the heavy lifting when we need to find total values. For instance, it calculates total trophies won or total goals scored in our case.

On the other hand, the AVG function might be new to you. It's a classic SQL function utilized for calculating the arithmetic mean of a set of values. Simply put, AVG can help us determine an average value, such as the average trophies won per season in Lionel Messi's career.

Example 1: Utilizing The SUM Function
SQL
1-- Aggregate total trophies won per each season 2SELECT Seasons.season_id, SUM(Seasons.trophies_won) as TotalTrophiesWon 3FROM Seasons 4JOIN Matches ON Seasons.season_id = Matches.season_id 5GROUP BY Seasons.season_id; 6 7-- Sneak peek of the output: 8-- | season_id | TotalTrophiesWon | 9-- |-----------|------------------| 10-- | 1 | 1 | 11-- | 2 | 16 |

In the above example, we're using the SUM function to find the total trophies won per season in Lionel Messi's career. This is achieved by joining the Seasons and Matches tables on season_id, where Matches records the details of matches played in each season. The GROUP BY clause ensures we get a total trophies count for each season, providing a comprehensive view of Messi's career achievements.

Example 2: Leveraging The AVG Function
SQL
1-- Aggregate average trophies won per each season after 2010 2SELECT Seasons.season_id, AVG(Seasons.trophies_won) as AverageTrophiesWon 3FROM Seasons 4JOIN Matches ON Seasons.season_id = Matches.season_id 5WHERE YEAR(Matches.date) > 2010 6GROUP BY Seasons.season_id; 7 8-- Sneak peek of the output: 9-- | season_id | AverageTrophiesWon | 10-- |-----------|--------------------| 11-- | 7 | 3.0000 | 12-- | 8 | 4.0000 | 13

Here, we're introducing the AVG function to find the average trophies won per season in Lionel Messi's career after 2010. By filtering matches based on the date condition (YEAR(Matches.date) > 2010), we focus on more recent seasons. The AVG function calculates the arithmetic mean of trophies won across these seasons, offering insights into Messi's consistent performance over time.

Remembering the GROUP BY Clause

From our past lessons, you should recall that the GROUP BY clause groups a result into subsets that share the same attribute value. It’s a vital component when using aggregate functions like SUM, COUNT, AVG, and others because it enables us to apply these functions to each group of data independently, providing us with insightful segmented data.

As you've noticed in our examples, GROUP BY plays an essential role when using aggregate functions. We use GROUP BY to return a separate sum or average for each season, allowing us to analyze Messi's career achievements in a structured manner.

You are Almost There

Excellent work on learning how to use the SUM and AVG functions and mastering their symbiotic relationship with the GROUP BY clause. Using these functions isn't always straightforward, but with practice, it will become second nature.

Congratulations on completing this lesson of the course! Let's continue practicing to solidify this knowledge and enhance your SQL skills further.

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.