Hello there! In this unit, we're offering an engaging coding lesson that highlights the performance efficiencies offered by the utilization of std::unordered_map
in C++. We'll address a vector-based problem that requires us to make an optimal choice to minimize the size of our vector. Excited? So am I! Let's get started.
In this unit's task, we'll manipulate a vector of integers. You are required to construct a C++ function titled minimal_max_block()
. This function should accept a vector as an input and compute an intriguing property related to contiguous blocks within the vector.
More specifically, you must select a particular integer, k
, from the vector. Once you've selected k
, the function should remove all occurrences of k
from the vector, thereby splitting it into several contiguous blocks or remaining sub-vectors. A unique feature of k
is that it is chosen such that the maximum length among these blocks is minimized.
For instance, consider the vector {1, 2, 2, 3, 1, 4, 4, 4, 1, 2, 5}
. If we eliminate all instances of 2 (our k
), the remaining blocks would be {1}
, {3, 1, 4, 4, 4, 1}
, {5}
, with the longest containing 6 elements. Now, if we instead remove all instances of 1, the new remaining blocks would be {2, 2, 3}
, {4, 4, 4}
, {2, 5}
, the longest of which contains 3 elements. As such, the function should return 1 in this case, as it leads to a minimally maximal block length.
An initial way to approach this problem can be through a brute force method. Each possible value in the vector could be tested in turn by removing it from the vector and then checking the resulting sub-vector sizes. This approach entails iteratively stepping through the vector for each possible value in the vector.
C++1#include <vector> 2#include <unordered_set> 3#include <limits> 4#include <algorithm> 5 6int minimal_max_block_bruteforce(const std::vector<int>& vec) { 7 int min_max_block_size = std::numeric_limits<int>::max(); 8 int min_num = -1; 9 10 std::unordered_set<int> unique_elements(vec.begin(), vec.end()); 11 12 for (int num : unique_elements) { // Avoid duplicates. 13 std::vector<int> indices; 14 for (size_t i = 0; i < vec.size(); ++i) { 15 if (vec[i] == num) { 16 indices.push_back(i); 17 } 18 } 19 indices.insert(indices.begin(), -1); 20 indices.push_back(vec.size()); 21 22 int max_block_size = 0; 23 for (size_t i = 1; i < indices.size(); ++i) { 24 max_block_size = std::max(max_block_size, indices[i] - indices[i - 1] - 1); 25 } 26 27 if (max_block_size < min_max_block_size) { 28 min_max_block_size = max_block_size; 29 min_num = num; 30 } 31 } 32 33 return min_num; 34}
This method has a time complexity of O(n^2)
, as it involves two nested loops: the outer loop cycling through each potential k
value and the inner loop sweeping through the vector for each of these k
values.
However, this approach becomes increasingly inefficient as the size n
of the vector grows, due to its quadratic time complexity. For larger vectors or multiple invocations, the computation time can noticeably increase, demonstrating the need for a more efficient solution.
Before we delve into the steps of the optimized unordered_map
solution, let's understand why the proposed method is superior in terms of time and space complexity when compared to the brute force approach.
-
Time Complexity: The
unordered_map
solution only requires a single traversal of the vector. This results in a linear time complexity,O(n)
, wheren
is the number of elements in the vector. This is significantly more efficient than theO(n^2)
complexity of the brute force approach, which needs to traverse the vector for every distinct number. -
Space Complexity: The
unordered_map
solution maintains two maps for storing the last occurrence and maximum block size for each number. In a worst-case scenario, every element in the vector is unique, leading to anO(n)
space complexity, wheren
is the number of elements in the vector.
Now let's jump into the step-by-step implementation of the solution.
To find the number that, when removed from the vector, would split the vector into several contiguous blocks such that the length of the longest block is minimized, we need to track every position of the same elements, the block size when removing each of the encountered elements, and store the maximum of these blocks.
But, in this case, we don't need to store all positions. We only need the last occurrence position since the blocks we are interested in are between two adjacent same elements, the beginning of the vector and the first occurrence of an element, and between the last occurrence of an element and the end of the vector. For the first two cases, we will keep the maximum block size for each element and update it whenever we get a bigger one, and for the last case, we will process it separately after the vector traversal.
To do this, we need to:
-
Initialize two
unordered_map
s. Thelast_occurrence
map will store the last occurrence index of each number, whilemax_block_sizes
will map each number to the maximum size of the blocks formed when the number is removed. -
Traverse the vector from left to right, and for each number:
- If it's the first time we encounter the number, we regard the block from the start of the vector up to its current position as a block formed when this number is removed, and store the size of this block in
max_block_sizes
for this number. - If the number has appeared before, we calculate the size of the block it forms (excluding the number itself) by subtracting the last occurrence index from the current index and subtracting 1. If the size is larger than the current maximum stored in
max_block_sizes
for this number, we update it. - Store the current index as the last occurrence of the number.
- If it's the first time we encounter the number, we regard the block from the start of the vector up to its current position as a block formed when this number is removed, and store the size of this block in
-
After finishing the vector traversal, we need to calculate the size of the "tail block" (i.e., the block between the last occurrence of a number and the end of the vector) for each number, and update its maximum block size in
max_block_sizes
if necessary. -
Find the number that gives the smallest maximum block size and return it as the result.
First, we initialize two unordered_map
s. The last_occurrence
map stores the last occurrence index of each number, while max_block_sizes
maps each number to the maximum size of the blocks formed when the number is removed.
C++1#include <unordered_map> 2#include <vector> 3 4int minimal_max_block(const std::vector<int>& vec) { 5 std::unordered_map<int, int> last_occurrence; 6 std::unordered_map<int, int> max_block_sizes;
Next, we iterate over the vector. For each number:
- If it's the first time the number is encountered, we regard the block from the start of the vector up to its current position as a block formed when this number is removed, and store the size of this block in
max_block_sizes
for this number. - If it has appeared before, we calculate the size of the block it forms by subtracting the last occurrence index from the current index, and subtracting 1 (since block length doesn't include the number itself). We update
max_block_sizes
for this number if necessary. - Store the current index as the last occurrence of this number.
C++1 for (size_t i = 0; i < vec.size(); ++i) { 2 int num = vec[i]; 3 if (last_occurrence.find(num) == last_occurrence.end()) { 4 max_block_sizes[num] = i; 5 } else { 6 int block_size = i - last_occurrence[num] - 1; 7 max_block_sizes[num] = std::max(max_block_sizes[num], block_size); 8 } 9 last_occurrence[num] = i; 10 }
Tail blocks are defined as blocks formed from the last occurrence of a number to the end of the vector. For each number, we calculate the size of its tail block and update max_block_sizes
if necessary.
C++1 for (const auto& entry : last_occurrence) { 2 int num = entry.first; 3 int pos = entry.second; 4 int block_size = vec.size() - pos - 1; 5 max_block_sizes[num] = std::max(max_block_sizes[num], block_size); 6 }
Finally, we find the number associated with the smallest maximum block size in max_block_sizes
, and return it.
C++1 int min_num = -1; 2 int min_block_size = std::numeric_limits<int>::max(); 3 for (const auto& entry : max_block_sizes) { 4 if (entry.second < min_block_size) { 5 min_block_size = entry.second; 6 min_num = entry.first; 7 } 8 } 9 10 return min_num; 11}
Putting it all together, here's the full code:
C++1#include <iostream> 2#include <vector> 3#include <unordered_map> 4#include <unordered_set> 5#include <limits> 6#include <algorithm> 7 8int minimal_max_block_bruteforce(const std::vector<int>& vec) { 9 int min_max_block_size = std::numeric_limits<int>::max(); 10 int min_num = -1; 11 12 std::unordered_set<int> unique_elements(vec.begin(), vec.end()); 13 14 for (int num : unique_elements) { 15 std::vector<int> indices; 16 for (size_t i = 0; i < vec.size(); ++i) { 17 if (vec[i] == num) { 18 indices.push_back(i); 19 } 20 } 21 indices.insert(indices.begin(), -1); 22 indices.push_back(vec.size()); 23 24 int max_block_size = 0; 25 for (size_t i = 1; i < indices.size(); ++i) { 26 max_block_size = std::max(max_block_size, indices[i] - indices[i - 1] - 1); 27 } 28 29 if (max_block_size < min_max_block_size) { 30 min_max_block_size = max_block_size; 31 min_num = num; 32 } 33 } 34 35 return min_num; 36} 37 38int minimal_max_block(const std::vector<int>& vec) { 39 std::unordered_map<int, int> last_occurrence; 40 std::unordered_map<int, int> max_block_sizes; 41 42 for (size_t i = 0; i < vec.size(); ++i) { 43 int num = vec[i]; 44 if (last_occurrence.find(num) == last_occurrence.end()) { 45 max_block_sizes[num] = i; 46 } else { 47 int block_size = i - last_occurrence[num] - 1; 48 max_block_sizes[num] = std::max(max_block_sizes[num], block_size); 49 } 50 last_occurrence[num] = i; 51 } 52 53 for (const auto& entry : last_occurrence) { 54 int num = entry.first; 55 int pos = entry.second; 56 int block_size = vec.size() - pos - 1; 57 max_block_sizes[num] = std::max(max_block_sizes[num], block_size); 58 } 59 60 int min_num = -1; 61 int min_block_size = std::numeric_limits<int>::max(); 62 for (const auto& entry : max_block_sizes) { 63 if (entry.second < min_block_size) { 64 min_block_size = entry.second; 65 min_num = entry.first; 66 } 67 } 68 69 return min_num; 70} 71 72int main() { 73 std::vector<int> vec = {1, 2, 2, 3, 1, 4, 4, 4, 1, 2, 5}; 74 std::cout << "Brute Force Result: " << minimal_max_block_bruteforce(vec) << std::endl; 75 std::cout << "Optimized Result: " << minimal_max_block(vec) << std::endl; 76 return 0; 77}
Excellent work! The lesson of this unit was quite comprehensive — we revisited the concept of std::unordered_map
, learned how they enhance the performance of code, and even constructed a function to locate a specific number in a vector that minimizes the maximum chunk size upon removal.
Now that we've mastered the basics, the logical next step is to apply your newfound knowledge. In the upcoming practice session, a variety of intriguing challenges await you that delve further into std::unordered_map
, vector manipulation, and innovative optimizations. So brace yourself, and let's dive in!