Set Operations using Maps in Go

Lesson 5

Introduction to Operating Sets in Go

Welcome back! Building on our previous unit, today we're diving into Go's approach to set operations using maps. Similar to how a club assigns unique membership IDs, maps ensure each key is unique. Throughout the session, you'll see how maps can simplify tasks involving ensuring uniqueness and checking set intersections. Let's explore how maps can transform lengthy, cumbersome operations into efficient, elegant code.

Problem 1: Check if Two Sets are Disjoint

Imagine you're developing a feature for a social media platform that requires user groups to be exclusive — you need to ensure that users can't belong to more than one group at a time. It's like organizing events where a guest should not appear on the lists for two different parties at the same venue — an overlap would be a significant issue.

Naive Approach

Initially, you might consider checking for overlap by comparing each member of one group with every member of the other — a somewhat cumbersome $O(n \cdot m)$ operation. If you have hundreds or thousands of users in each group, the time it would take to compare them all grows exponentially. This approach is impractical and resource-intensive, especially on the scale of a social media platform with potentially millions of users.

Go
1func AreDisjoint(arr1, arr2 []int) bool {
2    for _, num1 := range arr1 {
3        for _, num2 := range arr2 {
4            if num1 == num2 {
5                return false // An overlap is found.
6            }
7        }
8    }
9    return true // No overlaps found, sets are disjoint.
10}

Efficient Approach

Instead, maps in Go provide a swift and efficient method for achieving the same result. Here's how you can implement it:

Go
1func AreDisjoint(arr1, arr2 []int) bool {
2    set1 := make(map[int]struct{})
3    for _, num := range arr1 {
4        set1[num] = struct{}{} // Populate the map using struct{} to save space
5    }
6  
7    for _, num := range arr2 {
8        if _, exists := set1[num]; exists {
9            return false // If found, the sets are not disjoint.
10        }
11    }
12    return true
13}

Notice the _, exists := set1[num]; exists part of the if statement. It checks if num is a key in set1. It assigns the value associated with num and a boolean exists indicating whether the key exists. The underscore _ ignores the actual value retrieved, focusing instead on the existence of the key for control flow.

Using maps with struct{} provides significant speed advantages while minimizing memory usage. First, we iterate over arr1 to populate set1, marking each element. This operation takes $O(n)$ time, where n is the length of arr1. Then, it iterates over arr2, checking if any element exists in set1. The existence check in a map is a constant-time operation, $O(1)$ , resulting in an overall time complexity of $O(m)$ for the second loop, where m is the length of arr2. Thus, the overall time complexity of the function is $O(n + m)$ .

Problem 2: Remove Duplicates from an Array

Consider a scenario where you have a list of email addresses but must ensure each customer receives only one newsletter — duplicates must go. This scenario is akin to managing invitations to an exclusive gala, where each person should receive only one invite, meaning the invitation list must be free of repeats.

Naive Approach

The naive approach to this problem would be to create a new list and check every incoming address against all previously added ones — resulting in an inefficient $O(n^2)$ operation. Such an approach would not scale well with larger datasets and could lead to significant delays.

Go
1func RemoveDuplicates(arr []int) []int {
2    uniqueList := []int{}
3    for _, num := range arr {
4        contains := false
5        for _, uniqueNum := range uniqueList {
6            if num == uniqueNum {
7                contains = true
8                break
9            }
10        }
11        if !contains {
12            uniqueList = append(uniqueList, num) // Add number if it's not already in the list
13        }
14    }
15    return uniqueList
16}

Efficient Approach

By using a map's inherent capability to prevent duplicates, we can effectively streamline the process:

Go
1func RemoveDuplicates(arr []int) []int {
2    nums := make(map[int]struct{})
3    for _, num := range arr {
4        nums[num] = struct{}{} // Add the number if it's not already present, ignoring duplicates
5    }
6
7    result := make([]int, 0, len(nums))
8    for num := range nums {
9        result = append(result, num) // Each unique element is added to the result slice
10    }
11    return result
12}

RemoveDuplicates first iterates through the array, adding each element to nums, which takes $O(n)$ . Since map insertion is an average $O(1)$ operation, duplicates are inherently ignored. Then, we construct the result slice, iterating over the keys of nums, which also takes $O(n)$ time in the worst case. Consequently, the overall time complexity of the function is $O(n)$ .

Now we have a clean list ready for our exclusive newsletter send-out. The map optimizes our process, scaling efficiently for larger datasets.

Lesson Summary

Reflecting on today's lesson, we've explored the practical utility of Go's maps to achieve set operations, transitioning a conversation about uniqueness into user-friendly and optimal code. We delved into two practical examples, evaluating the pitfalls of naive implementations and recognizing the benefits of using maps to overcome them efficiently and gracefully. The key takeaway is the importance of optimizing operations for large datasets with maps’ efficient membership checks. With this newfound appreciation for maps, it's time for practice!

Enjoy this lesson? Now it's time to practice with Cosmo!

Practice is how you turn knowledge into actual skills.