Complexity Analysis and Optimization with C++ Unordered Sets

Lesson 2

Introduction

Greetings, programming enthusiast! In this unit, we're embarking on a thrilling numerical quest, where unidentified bridges connect the islands of data. On these bridges, we'll see hashes and bins, all converging into sets! Throughout our journey, we'll utilize the fundamental concepts of the C++ Standard Template Library (STL) std::unordered_set to formulate an optimal solution. So, fasten your seatbelt and get ready to solve problems!

Task Statement

The task for this unit is to devise a C++ function that accepts two vectors containing unique integers and returns another vector containing the elements common to both input vectors. This task provides an intriguing perspective on deriving similarities between two data sequences, a scenario commonly encountered in data comparisons and analytics.

For illustration, suppose we're given two vectors:

C++
1std::vector<int> list1 = {1, 2, 3, 5, 8, 13, 21, 34};
2std::vector<int> list2 = {2, 3, 5, 7, 13, 21, 31};

The common_elements(list1, list2) function should comb through these arrays of integers and extract the common elements between them.

The expected outcome in this case should be:

C++
1std::vector<int> result = {2, 3, 5, 13, 21};

Brute Force Solution and Complexity Analysis

Before we delve into the optimized solution, it is instrumental to consider a basic or naïve approach to this problem and analyze its complexity. Often, our first intuitive approach is to iterate over both arrays in nested loops and find common elements. This way, for each element in the first list, we check for its presence in the second list. If it's found, it is added to our result vector. Let's see how such a solution would look:

C++
1#include <vector>
2
3std::vector<int> common_elements_slow(const std::vector<int>& list1, const std::vector<int>& list2) {
4    std::vector<int> common; // vector to store common elements
5    for (int num1 : list1) {
6        for (int num2 : list2) {
7            if (num1 == num2) {
8                common.push_back(num1);
9                break;  // break inner loop as we found the number in list2
10            }
11        }
12    }
13    return common;
14}

However, the problem with this approach lies in its efficiency. Given that the worst-case scenario has us traversing through every element in both lists, we refer to this as an $O(n \cdot m)$ solution, where n and m represent the number of elements in list1 and list2 respectively. For large lists, this iterative approach tends to be inefficient and slow, making it a less desirable solution for this problem.

The solution we aim to implement in the following section utilizes the std::unordered_set data structure to optimize our algorithm and reach a solution in markedly less computational time.

Introduction to Unordered Set Solution

Since efficiency is a concern in our previous approach, we want to reduce the time complexity by minimizing the number of operations we perform to reach a solution. The C++ STL std::unordered_set comes to our rescue here.

Unordered sets internally use hash tables, which allows operations like insertion, removal, and search to be performed in average constant time, i.e., $O(1)$ . This provides a significant performance boost compared to the nested loop approach. Our overall time complexity will be reduced to $O(n + m)$ for traversing both lists and storing their elements into unordered sets.

Let's proceed to build this optimized solution in the next section.

Solution Building: Step 1

The initial step in crafting our solution is to transform one of these vectors into a C++ std::unordered_set. The computation of operations, like finding intersections, is highly optimized in unordered sets. We'll leverage this optimization to our advantage.

C++
1#include <unordered_set>
2#include <vector>
3
4std::vector<int> common_elements(const std::vector<int>& list1, const std::vector<int>& list2) {
5    std::unordered_set<int> set1(list1.begin(), list1.end());
6}

Solution Building: Step 2

Having converted one of our vectors to an unordered set, we're now ready to identify the common elements between the two datasets. We'll iterate through the other vector and check for each element's presence in the unordered set.

C++
1#include <vector>
2#include <unordered_set>
3#include <iostream>
4
5std::vector<int> common_elements(const std::vector<int>& list1, const std::vector<int>& list2) {
6    std::unordered_set<int> set1(list1.begin(), list1.end());
7
8    std::vector<int> common;
9    for (const int& num : list2) {
10        if (set1.find(num) != set1.end()) {
11            common.push_back(num);
12        }
13    }
14    return common;
15}
16
17int main() {
18    std::vector<int> list1 = {1, 2, 3, 5, 8, 13, 21, 34};
19    std::vector<int> list2 = {2, 3, 5, 7, 13, 21, 31};
20    std::vector<int> result = common_elements(list1, list2);
21    for (int elm : result) {
22        std::cout << elm << ' ';
23    }
24    std::cout << std::endl;
25    return 0;
26}

Additional Unordered Set Methods

The std::unordered_set in C++ is not only efficient but also provides a plethora of useful methods for performing various operations. Let's explore some of these practical methods:

1. Insertion

The insert method allows you to add elements to an unordered set. If the element already exists, the insertion does nothing. The average time complexity for insertion is $O(1)$ .

C++
1#include <iostream>
2#include <unordered_set>
3
4int main() {
5    std::unordered_set<int> my_set = {1, 2, 3};
6    my_set.insert(4);  // O(1)
7    my_set.insert(2);  // O(1) but this will have no effect since 2 is already in the set
8
9    for (int elem : my_set) {
10        std::cout << elem << " ";
11    } // prints: 1 2 3 4 (order may vary)
12}

2. Removal

The erase method removes elements from an unordered set. If the element does not exist, the erase operation does nothing. The average time complexity for removal is $O(1)$ .

C++
1#include <iostream>
2#include <unordered_set>
3
4int main() {
5    std::unordered_set<int> my_set = {1, 2, 3, 4};
6    my_set.erase(3);  // O(1)
7    my_set.erase(5);  // O(1) but this will have no effect since 5 is not in the set
8
9    for (int elem : my_set) {
10        std::cout << elem << " ";
11    } // prints: 1 2 4 (order may vary)
12}

3. Checking Membership

The find method checks for the existence of an element in an unordered set. It returns an iterator to the element if found, otherwise, it returns end(). The average time complexity for this operation is $O(1)$ .

C++
1#include <iostream>
2#include <unordered_set>
3
4int main() {
5    std::unordered_set<int> my_set = {1, 2, 3, 4, 5};
6    std::cout << (my_set.find(3) != my_set.end()) << std::endl;  // O(1), prints: 1 (true)
7    std::cout << (my_set.find(6) != my_set.end()) << std::endl;  // O(1), prints: 0 (false)
8}

4. Size

The size method returns the number of elements in an unordered set. The time complexity for this operation is $O(1)$ .

C++
1#include <iostream>
2#include <unordered_set>
3
4int main() {
5    std::unordered_set<int> my_set = {1, 2, 3, 4, 5};
6    std::cout << my_set.size() << std::endl;  // O(1), prints: 5
7}

5. Clear

The clear method removes all elements from an unordered set. The average time complexity for this operation is $O(n)$ , where n is the number of elements in the set.

C++
1#include <iostream>
2#include <unordered_set>
3
4int main() {
5    std::unordered_set<int> my_set = {1, 2, 3, 4, 5};
6    my_set.clear();  // O(n)
7    std::cout << my_set.size() << std::endl;  // O(1), prints: 0
8}

Knowing these operations will allow you to use C++ unordered sets to their full potential and help you devise efficient solutions for a variety of problems.

Lesson Summary

Well done! You've demonstrated a commendable understanding of vectors and unordered sets, along with their operations in C++. It is rare to come across solutions that marry elegance with performance efficiency, but today's task offered us precisely that opportunity, and you've seized it superbly.

Of course, the journey doesn't end here. Now, it's time for you to explore similar challenges in the following practice session. Don't be afraid to experiment with various data sequences and maximize your learning venture. Happy coding!

By following these detailed instructions, you should now have a solid basis for understanding how to operate with std::unordered_set in C++ and how to develop efficient algorithms using them.

Enjoy this lesson? Now it's time to practice with Cosmo!

Practice is how you turn knowledge into actual skills.