Lesson 2
Utilizing DISTINCT for Uniqueness Exploration
Introduction to DISTINCT

Hello again! I hope you enjoyed the previous unit on understanding and using the COUNT function. Today, we're going to expand your SQL knowledge further by introducing a new concept: the DISTINCT keyword. Strictly speaking, DISTINCT is not a function but a keyword used within a SELECT statement, and it's a powerful tool that will help you strengthen your data querying skills.

When handling data, especially large datasets, we often encounter duplicate values. The DISTINCT keyword helps us remove these duplicates from our query results and present a clean, unique list of values. The data stored in the table itself is left unchanged. Isn't that handy?!

This concept is common in many areas of life. Imagine you're trying to create a list of all the product categories offered in an online shop. If there are multiple products in the same category, should we list the category multiple times? Of course not! We use distinct or unique values, and SQL provides the DISTINCT keyword to do just that with our data.

As we learn SQL in the context of an online shopping dataset, we'll be using DISTINCT to explore unique product categories.

The Need for DISTINCT in Our Database

In an online shop, many rows share the same values: several products belong to one category, and the same product can show up in many orders. DISTINCT lets us list each of those values only once, which gives us a clearer picture of the products data.

Let's first learn the format of a basic SQL query that uses DISTINCT.

SQL
SELECT DISTINCT column_name FROM table_name;

It is time to apply this format to our online shopping dataset. Consider the following statement:

SQL
SELECT DISTINCT category_id FROM Products;

-- Output:
-- category_id
-- ------------
-- 1
-- 2
-- 3
-- 4
-- 5

This query fetches every distinct category_id value from the Products table. Because of DISTINCT, each category ID appears in the output only once, no matter how many products belong to that category.
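To see what DISTINCT changes, it can help to run the same query with and without it on a tiny table. The sketch below assumes a hypothetical table called DemoProducts with made-up rows; it is not part of the course dataset and only serves as an illustration.

```sql
-- Hypothetical demo table (an assumption, not part of the course dataset).
CREATE TABLE DemoProducts (
    product_id  INTEGER PRIMARY KEY,
    category_id INTEGER
);

INSERT INTO DemoProducts (product_id, category_id) VALUES
    (1, 1),
    (2, 1),
    (3, 2),
    (4, 2),
    (5, 3);

-- Without DISTINCT: one row per product, so category IDs repeat (5 rows).
SELECT category_id FROM DemoProducts;

-- With DISTINCT: each category ID appears once (3 rows: 1, 2 and 3).
SELECT DISTINCT category_id FROM DemoProducts;
```

Note that neither query guarantees the order of the returned rows, since there is no ORDER BY.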

More Applications of DISTINCT

Let's look at a few more examples using the DISTINCT keyword.

What if we want to know all the unique products that have been ordered from our online shop? Simple, we would run:

SQL
SELECT DISTINCT product_id FROM OrderItems;

-- Sneak peek of the output:
-- product_id
-- ------------
-- 1
-- 6
-- 7
-- 10
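Since the previous unit covered COUNT, one natural follow-up is counting the unique values instead of listing them. Here is a minimal sketch, assuming the same OrderItems table; the alias unique_products is just an illustrative name.

```sql
-- Count how many different products appear in OrderItems.
-- DISTINCT inside COUNT makes each product_id count only once.
SELECT COUNT(DISTINCT product_id) AS unique_products
FROM OrderItems;
```

This returns a single row with a single number rather than a list of IDs.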

The DISTINCT keyword can be applied to multiple columns to find unique combinations of values. For example, let's suppose you want to find all unique combinations of customer_id and order_status in the Orders table:

SQL
SELECT DISTINCT customer_id, order_status
FROM Orders;

-- Sneak peek of the output:
-- | customer_id | order_status |
-- |-------------|--------------|
-- | 41          | Delivered    |
-- | 16          | Processed    |
-- | 44          | Refunded     |
-- | 6           | Processed    |

This query returns unique pairs of customer_id and order_status. Rows with identical customer_id values but different order_status values will be treated as distinct.
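The pair-wise behavior is easier to see on a handful of rows. The following sketch assumes a hypothetical DemoOrders table with made-up data; it is not the course's Orders table.

```sql
-- Hypothetical demo table (an assumption, not part of the course dataset).
CREATE TABLE DemoOrders (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER,
    order_status VARCHAR(20)
);

INSERT INTO DemoOrders (order_id, customer_id, order_status) VALUES
    (1, 41, 'Delivered'),
    (2, 41, 'Delivered'),  -- same pair as order 1, so it collapses into one row
    (3, 41, 'Refunded'),   -- same customer, different status: a separate row
    (4, 16, 'Processed');

-- Returns 3 rows: (41, 'Delivered'), (41, 'Refunded') and (16, 'Processed').
-- The row order is not guaranteed without ORDER BY.
SELECT DISTINCT customer_id, order_status
FROM DemoOrders;
```

DISTINCT compares the whole selected row, so customer 41 still appears twice: once for each status.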

When using DISTINCT, remember that the database has to compare rows to find duplicates, typically by sorting or hashing them, and this extra work can slow down queries on large datasets. Always consider the performance implications and use DISTINCT only when you actually need unique values.
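One common case where DISTINCT is not needed is a column that is already unique. The sketch below assumes that product_id is the primary key of the Products table, which this lesson does not state explicitly.

```sql
-- Assumption: product_id is the primary key of Products, so its values
-- are already unique. DISTINCT cannot remove anything here and may
-- only add extra work:
SELECT DISTINCT product_id FROM Products;

-- Same set of rows, without asking for de-duplication:
SELECT product_id FROM Products;
```

If the assumption holds, both queries return the same product IDs, so the simpler one is the better choice.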

Lesson Wrap-Up and Practice Preview

That wraps up our introduction to the DISTINCT keyword in SQL! Well done on expanding your SQL toolkit. Today, you learned to use DISTINCT to fetch unique values from a database, which is handy in many data analysis scenarios. We also explored how DISTINCT can be applied to an online shopping dataset for deeper insights.

Now, it's time to apply your newfound knowledge in practice! Next up, you'll be tackling some hands-on exercises using DISTINCT with the online shopping dataset. Happy coding!
