Hello again! In our previous lessons, we explored the functionality of Logical AND/OR operations and delved into conditional operators in SQL such as LIKE
, IN
, and BETWEEN
. We delved deeply into how these tools not only allow us to filter and retrieve data accurately but also provide more nuanced control over our queries. In this unit, we'll build on those skills and introduce a new SQL concept—subqueries—which will enable us to perform even more complex data analysis.
So, what is a subquery? A subquery, also known as an "inner query" or "nested query," is a query nested within another SQL query. It allows us to solve convoluted problems that require multiple steps, making our SQL statements even more powerful. Essentially, a subquery can retrieve data for the primary or outer SQL query to utilize. Like other SQL queries you're now familiar with, they begin with a SELECT
statement and conclude with an appropriate clause such as WHERE
or FROM
.
For instance, let's consider a straightforward subquery that calculates the average minute of events in the MatchEvents
table:
SQL1SELECT AVG(minute) FROM MatchEvents; 2 3-- Output: 4-- AVG(minute) 5-- ------------- 6-- 51.828125
This subquery, when used within a larger query, can help us compare each event's minute to the average and filter out events that meet certain conditions.
Subqueries can be nested within other subqueries or queries. Nesting is the act of placing one item inside another. In the case of subqueries, nesting results in an outer query, possibly containing one or more subqueries. These subqueries can, in turn, contain further subqueries, enabling SQL to solve exceptionally complex tasks.
While the concept might appear intimidating initially, don't worry! The beauty of nested subqueries is that they can always be broken down into smaller, easier-to-understand steps.
Let's examine a straightforward example before we proceed to break down the final task. Suppose we want to find matches where the minute of an event is greater than the average minute of all events. Here's how we could accomplish this with a subquery:
SQL1SELECT
2 m.match_id AS MatchID,
3 me.minute AS Minute,
4 me.event_type AS EventType
5FROM Matches m
6INNER JOIN MatchEvents me ON m.match_id = me.match_id
7WHERE me.minute > (
8 SELECT AVG(minute)
9 FROM MatchEvents
10);
11
12-- Sneak peek of the output:
13-- | MatchID | Minute | EventType |
14-- |---------|--------|------------------------|
15-- | 1 | 90+1 | Left-footed shot |
16-- | 5 | 82 | Left-footed shot |
In this example:
- We introduce aliases
m
forMatches
andme
forMatchEvents
to simplify references within the query. - First, we perform an
INNER JOIN
on theMatches
andMatchEvents
tables on thematch_id
field. - Next, we filter using a
WHERE
clause that compares each event's minute to the average minute (SELECT AVG(minute) FROM MatchEvents
) of all events. This is our subquery. - This query retrieves matches where at least one event occurred later than the average minute of events.
You have now seen how to use subqueries to tackle complex SQL problems!
Great job making it this far! In this lesson, we learned about subqueries and their ability to break complex SQL tasks down into manageable parts. We firmly believe in learning by doing, so we have arranged a set of practice exercises where you can apply the concept of subqueries and become more comfortable with them.
In the next lesson, we'll continue to delve deeper into more advanced SQL query techniques. Keep up the great work! You're doing amazing!