Lesson 2

Greetings, programming enthusiast! In this unit, we're embarking on a thrilling numerical quest, where unidentified bridges connect the islands of data. On these bridges, we'll see hashes and bins, all converging into sets! Throughout our journey, we'll utilize the fundamental concepts of Python's built-in data type, the ** set**, to formulate an optimal solution. So, fasten your seatbelt and get ready to solve problems!

The task for this unit is to devise a Python function that accepts two lists containing unique integers and returns another list containing the elements common to both input lists. This task provides an intriguing perspective on deriving similarities between two data sequences, a scenario commonly encountered in data comparisons and analytics.

For illustration, suppose we're given two lists:

Python`1list1 = [1, 2, 3, 5, 8, 13, 21, 34] 2list2 = [2, 3, 5, 7, 13, 21, 31]`

The `common_elements(list1, list2)`

function should comb through these arrays of integers and extract the common elements between them.

The expected outcome in this case should be:

Python`1[2, 3, 5, 13, 21]`

Before we delve into the optimized solution, it is instrumental to consider a basic or naïve approach to this problem and analyze its complexity. Often, our first intuitive approach is to iterate over both arrays in nested loops and find common elements. This way, for each element in the first list, we check for its presence in the second list. If it's found, it is added to our result set. Let's see how such a solution would look like:

Python`1def common_elements_slow(list1, list2): 2 common = [] # list to store common elements 3 for num1 in list1: 4 for num2 in list2: 5 if num1 == num2: 6 common.append(num1) 7 break # break inner loop as we found the number in list2 8 return common`

However, the problem with this approach lies in its efficiency. Given that the worst-case scenario has us traversing through every element in both lists, we refer to this as an $O(n*m)$ solution, where n and m represent the number of elements in `list1`

and `list2`

respectively. For large lists, this iterative approach tends to be inefficient and slow, making it a less desirable solution for this problem.

The solution we aim to implement in the following section utilizes a set data structure to optimize our algorithm and reach a solution in a markedly less computational time.

Since efficiency is a concern in our previous approach, we want to reduce the time complexity by minimizing the number of operations we perform to reach a solution. The data structure `set`

provided by Python comes to our rescue here.

Sets internally use a hash table data structure which allows operations like `addition`

, `removal`

, and `search`

to be performed in constant time (on average), i.e., $O(1)$. They also provide a direct method, 'intersection', to find common elements which are executed in linear time, i.e., $O(min(n, m))$. Thus, opting for our approach using sets can significantly reduce our time complexity as compared to our iterative approach. Since we need to sort the result in the end, it makes the total complexity of the algorithm $O(min(n, m)log(min(n, m)))$. However, it is still better than $O(n*m)$.

Let's proceed to building this optimized solution in the next section.

The initial step in crafting our solution is to transform these lists into Python's built-in datatype: sets. The computation of operations, like intersection, union, and difference, is highly optimized in sets. We'll leverage this optimization to our advantage.

Python`1def common_elements(list1, list2): 2 set1 = set(list1) 3 set2 = set(list2)`

Having converted our data structure from a list to a set, we're now ready to identify the common elements between the two datasets. The Python set operation `&`

allows us to perform the intersection operation seamlessly and swiftly.

Python`1def common_elements(list1, list2): 2 set1 = set(list1) 3 set2 = set(list2) 4 common = set1 & set2`

In this final step, we convert our set of common elements back into a list and sort the integers in ascending order. The `sorted`

function in Python sorts a sequence and returns a new sorted list, leaving the original sequence untouched.

Python`1def common_elements(list1, list2): 2 set1 = set(list1) 3 set2 = set(list2) 4 common = set1 & set2 5 return sorted(list(common))`

And there you have it, the solution is duly wrapped and ready!

The `set`

data type in Python is not only efficient but also provides a plethora of useful methods for performing various operations. Let's explore some of these operations:

**Union (**The union is a method or operation that combines all unique elements from two sets. Essentially, the resultant set will contain all elements present in both the sets. The`|`

):`|`

operator or the`union()`

function can be used to perform this operation. Time complexity -`O(len(set1)+len(set2))`

.

Python`1set1 = {1, 2, 3, 4} 2set2 = {3, 4, 5, 6} 3union_set = set1 | set2 4print(union_set) # prints: {1, 2, 3, 4, 5, 6}`

**Difference (**The difference operation removes the elements present in the second set from the first set. The`-`

):`-`

operator or the`difference()`

function performs this operation. Time complexity -`O(len(set1))`

.

Python`1set1 = {1, 2, 3, 4} 2set2 = {3, 4, 5, 6} 3diff_set = set1 - set2 4print(diff_set) # prints: {1, 2}`

**Symmetric Difference (**It returns a set containing elements which are in either of the sets but not in both. In other words, it outputs elements that only reside in one of the sets. The`^`

):`^`

operator or the`symmetric_difference()`

function executes this operation. Time complexity -`O(len(set1))`

.

Python`1set1 = {1, 2, 3, 4} 2set2 = {3, 4, 5, 6} 3sym_diff_set = set1 ^ set2 4print(sym_diff_set) # prints: {1, 2, 5, 6}`

**Subset (**This operation checks if the first set is a subset of the second set. It returns`<=`

):`True`

if all elements of the first set are present in the second set, otherwise`False`

. The`<=`

operator or the`issubset()`

function achieves this.ime complexity -`O(len(set2))`

.

Python`1set1 = {1, 2, 3} 2set2 = {1, 2, 3, 4, 5} 3print(set1 <= set2) # prints: True`

Knowing these operations will allow you to use Python sets to their full potential and help you devise efficient solutions for a variety of problems.

Python's `set`

datatype hands us a powerful capability for checking membership using the `in`

operator. Performances of membership operations with sets in Python are light years ahead when compared to lists.

Let's understand this with a simple code snippet:

Python`1my_set = {1, 2, 3, 4, 5} 2print(3 in my_set) # prints: True 3print(6 in my_set) # prints: False`

In this code, `3 in my_set`

checks if 3 is present in `my_set`

and returns `True`

if it exists, `False`

otherwise. Similarly, `6 in my_set`

checks for 6 in `my_set`

and as it doesn't exist in the set, it returns `False`

.

The magic lies in the time complexity of this process. The `in`

operator works in constant time, O(1), when used with sets. This is contrasted with its counterpart in lists that works in linear time, O(n).

Well done! You've demonstrated a commendable understanding of lists and sets, along with their operations in Python. It is rare to come across solutions that marry elegance with performance efficiency, but today's task offered us precisely that opportunity, and you've seized it superbly.

Of course, the journey doesn't end here. Now, it's time for you to explore similar challenges in the following practice session. Don't be afraid to experiment with various data sequences and maximize your learning venture. Happy Coding!