Hello again! I hope you enjoyed the previous unit on understanding and using the COUNT
function. Today, we're going to expand your SQL knowledge further by introducing a new concept: the DISTINCT
keyword in SQL. While DISTINCT
is not strictly speaking a function it's a powerful keyword/clause that will help you strengthen your data querying skills.
When handling data, especially large datasets, we often encounter duplicate values. The DISTINCT
keyword helps us remove these duplicates and present a clean, unique list of values. Isn't that handy?!
This concept is common in many areas of life. Imagine you're trying to create a list of all the cities from which your friends come. If three friends come from New York, should we count New York three times? Of course not! We use distinct or unique values, and SQL provides the DISTINCT
keyword to do just that with our data.
As we learn SQL using Messi's matches, we'll be using DISTINCT
to explore his distinct sport footprints.
Given Lionel Messi's rich game history, utilizing DISTINCT
is quite handy. We can use it to identify unique seasons, and much more. It will broaden our understanding of his matches data.
Let's first learn the format of a basic SQL query that uses DISTINCT
.
SQL1SELECT DISTINCT column_name FROM table_name;
It is time to apply this format to our Lionel Messi's matches dataset. Consider the following statement:
SQL1SELECT DISTINCT season_id FROM Matches; 2 3-- Sneak peek of the output: 4-- season_id 5-- ----------- 6-- 1 7-- 2
This query will fetch all the distinct or unique season_id
from the Matches
table. We use DISTINCT
to avoid getting repeated IDs in the output.
Let's look at a few more examples using the DISTINCT
keyword.
What if we want to know all the unique clubs Messi has played for? Simple, we would run:
SQL1SELECT DISTINCT club_id FROM Matches; 2 3-- Output: 4-- club_id 5-- --------- 6-- 1 7-- 2
When using DISTINCT
, it's important to remember that fetching unique values from large datasets can be time-consuming and slow down your queries. Therefore, always consider the performance implications and use DISTINCT
only when necessary.
That wraps up our introduction to the DISTINCT
keyword in SQL! Well done on expanding your SQL toolkit. Today, you learned to use DISTINCT
to fetch unique values from a database, which is handy in many data analysis scenarios. We also explored how distinct can be applied to Lionel Messi's data for deeper insights.
Now, it's time to apply your newfound knowledge in practice! Next up, you'll be tackling some hands-on exercises using DISTINCT
with the Lionel Messi dataset. Happy coding!