Lesson 2
Understanding Sets in Java
Understanding Sets in Java

Welcome to our Java Sets lesson! In Java, sets are represented by the HashSet<E> collection, which can only hold unique elements. They are particularly useful when you need to ensure that all elements in a collection are distinct.

In this lesson, you'll learn how to create and operate on sets using HashSet<E>. You'll explore the advantages of using sets and how they can optimize performance. Let's get started!

Creating Sets

In Java, you can create a set using the HashSet<E> class. Here is an example:

Java
1import java.util.HashSet; 2 3public class Solution { 4 public static void main(String[] args) { 5 HashSet<Integer> mySet = new HashSet<>(); 6 mySet.add(1); 7 mySet.add(2); 8 mySet.add(3); 9 mySet.add(4); 10 mySet.add(5); 11 mySet.add(5); // Adding duplicate 12 13 System.out.println(mySet); // Output: [1, 2, 3, 4, 5] 14 } 15}

You can use the add method to add elements to the set. Note that duplicates will be omitted, as sets can only contain unique elements.

Manipulating Sets

Java provides methods to manipulate sets, such as add, contains, remove, and clear.

Java
1import java.util.HashSet; 2 3public class Solution { 4 public static void main(String[] args) { 5 HashSet<Integer> mySet = new HashSet<>(); 6 mySet.add(1); 7 mySet.add(2); 8 mySet.add(3); 9 mySet.add(4); 10 mySet.add(5); 11 12 // Adding an element 13 mySet.add(6); // `mySet` is now [1, 2, 3, 4, 5, 6] 14 System.out.println(mySet.contains(1)); // Output: true, as `mySet` includes element 1 15 16 // Removing an element 17 mySet.remove(1); // `mySet` becomes [2, 3, 4, 5, 6] 18 System.out.println(mySet.contains(1)); // Output: false, as `mySet` doesn't include 1 anymore 19 20 // Clearing the set 21 mySet.clear(); // `mySet` becomes an empty set 22 System.out.println(mySet.size()); // Output: 0 23 } 24}
  • add: Adds a specified element to the set.
  • contains: Checks if the specified element exists in the set.
  • remove: Removes a specified element from the set.
  • clear: Removes all elements from the set.
Set Operations

Java provides built-in methods for operations such as union, intersection, and difference using addAll, retainAll, and removeAll.

Java
1import java.util.HashSet; 2 3public class Solution { 4 public static void main(String[] args) { 5 HashSet<Integer> set1 = new HashSet<>(); 6 HashSet<Integer> set2 = new HashSet<>(); 7 8 // Initializing set1 9 set1.add(1); 10 set1.add(2); 11 set1.add(3); 12 set1.add(4); 13 14 // Initializing set2 15 set2.add(3); 16 set2.add(4); 17 set2.add(5); 18 set2.add(6); 19 20 // Set union 21 HashSet<Integer> union = new HashSet<>(set1); 22 union.addAll(set2); 23 System.out.println(union); // Output: [1, 2, 3, 4, 5, 6] 24 25 // Set intersection 26 HashSet<Integer> intersection = new HashSet<>(set1); 27 intersection.retainAll(set2); 28 System.out.println(intersection); // Output: [3, 4] 29 30 // Set difference 31 HashSet<Integer> difference = new HashSet<>(set1); 32 difference.removeAll(set2); 33 System.out.println(difference); // Output: [1, 2] 34 } 35}
  • addAll: Combines elements from both sets, excluding any duplicates. This results in a set containing [1, 2, 3, 4, 5, 6].
  • retainAll: Returns a set with only the elements common to both sets. For these sets, the intersection is [3, 4].
  • removeAll: Returns a set containing elements that are in the first set but not in the second set. Here, the result is [1, 2] for set1.
Performance Benefits of Sets

One of the key advantages of sets is their faster performance in membership tests, thanks to their use of hashing.

Java
1import java.util.HashSet; 2import java.util.List; 3import java.util.ArrayList; 4import java.util.Random; 5 6public class Solution { 7 public static void main(String[] args) { 8 Random random = new Random(); 9 HashSet<Integer> mySet = new HashSet<>(); 10 List<Integer> myList = new ArrayList<>(); 11 12 for (int i = 0; i < 10000000; i++) { 13 mySet.add(i); 14 myList.add(i); 15 } 16 17 // Warm-up 18 mySet.contains(-1); 19 myList.contains(-1); 20 21 long startTime, endTime; 22 23 // Measure HashSet performance 24 startTime = System.nanoTime(); 25 for (int i = 0; i < 1000; i++) { 26 boolean result = mySet.contains(random.nextInt(10000000)); 27 } 28 endTime = System.nanoTime(); 29 System.out.println("Set: " + ((endTime - startTime) / 1_000_000) + "ms"); // Outputs: Set: ~2ms on average 30 31 // Measure List performance 32 startTime = System.nanoTime(); 33 for (int i = 0; i < 1000; i++) { 34 boolean result = myList.contains(random.nextInt(10000000)); 35 } 36 endTime = System.nanoTime(); 37 System.out.println("List: " + ((endTime - startTime) / 1_000_000) + "ms"); // Outputs: List: ~18223ms on average 38 } 39}
  • Membership Test with HashSet<E>: Thanks to hash tables, sets can check for membership in constant time, leading to quick lookup times. The membership checking in the set is remarkably fast.
  • Membership Test with List: Lists require a linear search to check for membership, resulting in longer lookup times as the list grows. The membership checking in the list is noticeably slower.

Hashing is the key to HashSet's efficiency. When you attempt to add a duplicate, Java checks if the hashCode() is already in the set. If so, it compares the new element with the existing one. Each element is placed into a “bucket” based on its hashCode, so finding an element can often be done in constant time. In contrast, lists require a linear search, where each item must be checked, resulting in slower performance as the list size grows.

Lesson Summary

Congratulations! You've explored creating and manipulating sets, performing set operations, and understanding the performance benefits of sets in Java.

Remember, practice is key to solidifying your understanding. Happy coding!

Enjoy this lesson? Now it's time to practice with Cosmo!
Practice is how you turn knowledge into actual skills.