Lesson 3
Mastering Subqueries for Data Analysis
Recap of Previous Lessons

Hello again! In our previous lessons, we explored the functionality of Logical AND/OR operations and delved into conditional operators in SQL such as LIKE, IN, and BETWEEN. We delved deeply into how these tools not only allow us to filter and retrieve data accurately but also provide more nuanced control over our queries. In this unit, we'll build on those skills and introduce a new SQL concept—subqueries—which will enable us to perform even more complex data analysis.

Introduction to Subqueries

So, what is a subquery? A subquery, also known as an "inner query" or "nested query," is a query nested within another SQL query. It allows us to solve convoluted problems that require multiple steps, making our SQL statements even more powerful. Essentially, a subquery can retrieve data for the primary or outer SQL query to utilize. Like other SQL queries you're now familiar with, they begin with a SELECT statement and conclude with an appropriate clause such as WHERE or FROM.

For instance, let's consider a straightforward subquery that calculates the average minute of events in the MatchEvents table:

SQL
1SELECT AVG(minute) FROM MatchEvents; 2 3-- Output: 4-- AVG(minute) 5-- ------------- 6-- 51.828125

This subquery, when used within a larger query, can help us compare each event's minute to the average and filter out events that meet certain conditions.

Nesting Subqueries

Subqueries can be nested within other subqueries or queries. Nesting is the act of placing one item inside another. In the case of subqueries, nesting results in an outer query, possibly containing one or more subqueries. These subqueries can, in turn, contain further subqueries, enabling SQL to solve exceptionally complex tasks.

While the concept might appear intimidating initially, don't worry! The beauty of nested subqueries is that they can always be broken down into smaller, easier-to-understand steps.

Subquery Use Case Example

Let's examine a straightforward example before we proceed to break down the final task. Suppose we want to find matches where the minute of an event is greater than the average minute of all events. Here's how we could accomplish this with a subquery:

SQL
1SELECT 2 m.match_id AS MatchID, 3 me.minute AS Minute, 4 me.event_type AS EventType 5FROM Matches m 6INNER JOIN MatchEvents me ON m.match_id = me.match_id 7WHERE me.minute > ( 8 SELECT AVG(minute) 9 FROM MatchEvents 10); 11 12-- Sneak peek of the output: 13-- | MatchID | Minute | EventType | 14-- |---------|--------|------------------------| 15-- | 1 | 90+1 | Left-footed shot | 16-- | 5 | 82 | Left-footed shot |

In this example:

  • We introduce aliases m for Matches and me for MatchEvents to simplify references within the query.
  • First, we perform an INNER JOIN on the Matches and MatchEvents tables on the match_id field.
  • Next, we filter using a WHERE clause that compares each event's minute to the average minute (SELECT AVG(minute) FROM MatchEvents) of all events. This is our subquery.
  • This query retrieves matches where at least one event occurred later than the average minute of events.

You have now seen how to use subqueries to tackle complex SQL problems!

Summary and Upcoming Practice

Great job making it this far! In this lesson, we learned about subqueries and their ability to break complex SQL tasks down into manageable parts. We firmly believe in learning by doing, so we have arranged a set of practice exercises where you can apply the concept of subqueries and become more comfortable with them.

In the next lesson, we'll continue to delve deeper into more advanced SQL query techniques. Keep up the great work! You're doing amazing!

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.