Lesson 5

Great job on making it this far! Thus far, we've covered a great deal, from drilling into `COUNT`

and `DISTINCT`

to exploring `SUM`

and `GROUP BY`

. These are some of the key **SQL functions** required to dig deep into any dataset. In this unit, we're going to broaden our repertoire by applying these aggregate functions to our beloved Taylor Swift's music data.

As you may recall from our previous lessons, aggregate functions allow us to perform calculations on a set of values to return a single scalar value. We've already seen the `COUNT`

and `SUM`

functions in action, but have you ever wondered if we could derive other useful insights, such as averages? That’s where the SQL `AVG`

function comes into play.

At this juncture, the `SUM`

function must seem pretty familiar to you. It does the heavy lifting when we need to find total values. For instance, it calculates total sales or total song durations in our case.

On the other hand, the `AVG`

function might be new to you. It's a classic SQL function utilized for calculating the arithmetic mean of a set of values. Simply put, `AVG`

can help us determine an average value, such as the average popularity of Taylor Swift's songs.

`SQL````
1-- Aggregate total duration of each album
2SELECT Albums.AlbumName, SUM(Songs.DurationMS) as TotalDuration
3FROM Albums
4INNER JOIN Songs ON Albums.AlbumID = Songs.AlbumID
5GROUP BY Albums.AlbumName;
```

In the above example, we're using the `SUM`

function to find the total duration of each album in our Taylor Swift dataset. This is accomplished by joining the `Albums`

and `Songs`

tables on `AlbumID`

. The `GROUP BY`

clause ensures we get a total duration for each album, rather than for the entire collection of songs.

`SQL````
1-- Aggregate popularity of songs and group by album
2SELECT Albums.AlbumName, AVG(Songs.Popularity) as AveragePopularity
3FROM Albums
4INNER JOIN Songs ON Albums.AlbumID = Songs.AlbumID
5GROUP BY Albums.AlbumName;
```

Here, we're introducing the `AVG`

function to find the average popularity of songs in each of Taylor Swift's albums. Much like the previous example, we join the `Albums`

and `Songs`

tables, but this time we use `AVG`

to calculate the average popularity. As always, our trusty `GROUP BY`

clause is ensuring we receive an average popularity score for each album.

From our past lessons, you should recall that the `GROUP BY`

clause groups a result into subsets that share the same attribute value. It’s a vital component when using aggregate functions like `SUM`

, `COUNT`

, `AVG`

, and others because it enables us to apply these functions to each group of data independently, providing us with insightful segmented data.

As you've noticed in our examples, `GROUP BY`

plays an essential role when using aggregate functions. We use `GROUP BY`

to return a separate sum for each album and an individual average for each one.

Excellent work on learning how to use the `SUM`

and `AVG`

functions and mastering their symbiotic relationship with the `GROUP BY`

clause. Using these functions isn't always straightforward, but with practice, it will become second nature.

Congratulations on completing the last lesson of the course! Let's do some practice to turn this knowledge into skills and you'll have yet another course under your belt.