Lesson 3
Data Aggregation Using HashMaps in C++
Topic Overview

Greetings, learners! Today's focus is data aggregation, a practical technique, with hash maps (std::unordered_map in C++) as our principal tool.

Data aggregation means gathering raw data and presenting it in a summarized, analysis-friendly form. A helpful analogy is viewing a cityscape from an airplane: you get an informative aerial overview rather than the specifics of individual buildings. We'll work through the Sum, Average, Count, Maximum, and Minimum operations for practical, hands-on experience.

Let's dive in!

Understand Aggregation

Data aggregation is a cornerstone of data analysis: it condenses data into a more manageable, summarized format. Imagine knowing the total number of apples in a basket at a glance instead of counting each apple individually. In C++, we can do this by grouping values under keys in an unordered_map and then summarizing them with a loop or a standard algorithm.
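
To make the grouping step concrete, here is a minimal sketch, assuming the raw data arrives as a list of individual fruit names. The helper name count_by_name is hypothetical and not part of the lesson; it relies on operator[] inserting a missing key with the value 0 before the increment.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: groups raw items by name and counts each group.
std::unordered_map<std::string, int> count_by_name(const std::vector<std::string>& items) {
    std::unordered_map<std::string, int> counts;
    for (const auto& item : items) {
        // operator[] inserts the key with value 0 if it is missing, then we increment it.
        ++counts[item];
    }
    return counts;
}
```

Calling count_by_name on a list such as apple, banana, apple would yield a map with two keys, where apple maps to 2 and banana maps to 1.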

Data Aggregation Using HashMaps

Let's see how unordered_map assists us in data aggregation. Picture a C++ unordered_map in which the keys are fruit types and the values are their quantities. By iterating over its key-value pairs, we can compute the Sum, Count, Max, Min, and Average of those quantities.

Practice: Summing Values in a HashMap

Let's delve into a hands-on example using a fruit basket represented as an unordered_map:

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Summing the values in the unordered_map
    int total_fruits = 0;
    for (const auto& pair : fruit_basket) {
        total_fruits += pair.second;  // pair.second is the quantity
    }

    std::cout << "The total number of fruits in the basket is: " << total_fruits << std::endl;
    // It outputs: "The total number of fruits in the basket is: 17"

    return 0;
}
```
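
As an alternative to the explicit loop, the same sum can be sketched with std::accumulate from the <numeric> header. The helper name total_quantity is an assumption made for illustration; the generic lambda requires C++14 or later.

```cpp
#include <numeric>
#include <string>
#include <unordered_map>

// Hypothetical helper: sums the mapped values with std::accumulate.
int total_quantity(const std::unordered_map<std::string, int>& basket) {
    return std::accumulate(basket.begin(), basket.end(), 0,
        [](int running_total, const auto& entry) {
            // entry is a key-value pair; only the value contributes to the sum.
            return running_total + entry.second;
        });
}
```

The initial value 0 also makes the result well defined for an empty map, where the function simply returns 0.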
Practice: Counting Elements in a HashMap

Just as easily, we can count the number of fruit types in our basket, which corresponds to the number of keys in our unordered_map.

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Counting the elements in the unordered_map; size() returns an unsigned count of keys
    const auto count_fruits = fruit_basket.size();
    std::cout << "The number of fruit types in the basket is: " << count_fruits << std::endl;
    // It outputs: "The number of fruit types in the basket is: 3"

    return 0;
}
```
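
While size() counts every key, a conditional count needs a scan over the entries. The sketch below uses std::count_if for that; the helper name count_at_least and its threshold parameter are hypothetical, introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: counts the fruit types whose quantity is at least `threshold`.
std::size_t count_at_least(const std::unordered_map<std::string, int>& basket, int threshold) {
    const auto matches = std::count_if(basket.begin(), basket.end(),
        [threshold](const auto& entry) { return entry.second >= threshold; });
    // count_if returns a signed difference type; it is never negative here.
    return static_cast<std::size_t>(matches);
}
```

With the lesson's basket and a threshold of 5, this would count apples and oranges but not bananas.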
Practice: Maximum and Minimum Values in a HashMap

C++ has no single built-in function that returns the highest or lowest value stored in an unordered_map: std::max and std::min compare individual values, not the entries of a container. Instead, we use the standard library algorithms max_element and min_element with a lambda that defines the comparison.

max_element and min_element

The max_element and min_element functions are declared in the <algorithm> header. They return an iterator to the largest and smallest element in a range, respectively. Applied to an unordered_map, they can locate the key-value pair with the highest or lowest value. For an empty map they return end(), which must not be dereferenced.

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket (non-empty, so dereferencing below is safe)
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Finding the key with the maximum value
    auto max_fruit = std::max_element(fruit_basket.begin(), fruit_basket.end(),
        [](const auto& a, const auto& b) {
            return a.second < b.second;
        })->first;

    std::cout << "The fruit with the most quantity is: " << max_fruit << std::endl;
    // It outputs: "The fruit with the most quantity is: oranges"

    // Finding the key with the minimum value
    auto min_fruit = std::min_element(fruit_basket.begin(), fruit_basket.end(),
        [](const auto& a, const auto& b) {
            return a.second < b.second;
        })->first;

    std::cout << "The fruit with the least quantity is: " << min_fruit << std::endl;
    // It outputs: "The fruit with the least quantity is: bananas"

    return 0;
}
```
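
The example above dereferences the returned iterator directly, which is only safe because the basket is not empty. A minimal sketch of a guarded version follows; the helper name max_fruit_or and its fallback parameter are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the fruit with the largest quantity,
// or `fallback` when the basket is empty.
std::string max_fruit_or(const std::unordered_map<std::string, int>& basket,
                         const std::string& fallback) {
    const auto it = std::max_element(basket.begin(), basket.end(),
        [](const auto& a, const auto& b) { return a.second < b.second; });
    // For an empty range max_element returns end(); dereferencing it is undefined behavior.
    return it == basket.end() ? fallback : it->first;
}
```

Comparing the iterator with end() before using it is the usual way to handle an empty range with these algorithms.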
Lambda Functions

In the examples above, a lambda is passed as the third argument to both max_element and min_element. A lambda is an anonymous function written with the syntax [](parameters) { body }. The auto parameters used here make it a generic lambda, which requires C++14 or later.

Here is a breakdown of the lambda function used:

```cpp
[](const auto& a, const auto& b) {
    return a.second < b.second;
}
```
  • [](const auto& a, const auto& b) { ... }: This part declares a lambda function that takes two parameters, a and b, both representing key-value pairs from the unordered_map.
  • return a.second < b.second;: The lambda function compares the second element (the value) of the two key-value pairs. It returns true if the value of a is less than the value of b. Both algorithms use this same "less than" ordering: max_element picks the element that ranks highest under it, and min_element the one that ranks lowest.
Accessing Elements
  • ->first: max_element and min_element return an iterator to a key-value pair. Appending ->first dereferences that iterator and selects the first member of the pair (the key).
  • ->second: Similarly, ->second would access the value part of the pair. Inside our lambda, a and b are the pairs themselves rather than iterators, so a.second and b.second read their values with the dot operator.

By using max_element and min_element with the lambda function, we find the key associated with the maximum or minimum value, each in one linear pass over the unordered_map.
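
Because both calls share the same comparison, the lambda can be stored in a variable and reused. The sketch below goes one step further and uses std::minmax_element, which finds both extremes in a single traversal. The helper name min_max_fruit is hypothetical, and the function assumes a non-empty basket.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: returns {fruit with the smallest quantity, fruit with the largest}.
// Precondition (assumed): basket is not empty.
std::pair<std::string, std::string> min_max_fruit(
        const std::unordered_map<std::string, int>& basket) {
    // Named comparator: orders key-value pairs by their value.
    const auto by_quantity = [](const auto& a, const auto& b) {
        return a.second < b.second;
    };
    // minmax_element returns a pair of iterators: first -> minimum, second -> maximum.
    const auto extremes = std::minmax_element(basket.begin(), basket.end(), by_quantity);
    return {extremes.first->first, extremes.second->first};
}
```

For the lesson's basket, the returned pair would hold bananas first and oranges second.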

Practice: Averaging Values in a HashMap

Similar to finding the total quantity of fruits, we can calculate the average quantity per fruit type by summing the values in the unordered_map and dividing by size(), the number of fruit types. The static_cast<double> in the code forces floating-point division; without it, integer division would truncate the result.

```cpp
#include <iomanip>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Summing the values
    int total_fruits = 0;
    for (const auto& pair : fruit_basket) {
        total_fruits += pair.second;
    }

    // Calculating the average; the cast forces floating-point division
    double average_fruits = static_cast<double>(total_fruits) / fruit_basket.size();

    // std::fixed with std::setprecision(2) prints two decimal places
    // (the default formatting would print 5.66667)
    std::cout << std::fixed << std::setprecision(2)
              << "The average number of each type of fruit in the basket is: " << average_fruits << std::endl;
    // It outputs: "The average number of each type of fruit in the basket is: 5.67"

    return 0;
}
```
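
The five operations from this lesson can also be gathered in one pass. The following is a sketch under stated assumptions, not a definitive implementation: the BasketSummary type, its field names, and the summarize function are all hypothetical, and an empty basket is assumed to produce an all-zero summary.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical summary type; the field names are illustrative.
struct BasketSummary {
    int sum = 0;
    std::size_t count = 0;
    int min = 0;
    int max = 0;
    double average = 0.0;
};

// Hypothetical helper: computes Sum, Count, Min, Max, and Average in a single loop.
BasketSummary summarize(const std::unordered_map<std::string, int>& basket) {
    BasketSummary summary;
    if (basket.empty()) {
        return summary;  // all fields stay at zero; this also avoids dividing by zero
    }
    summary.count = basket.size();
    // Seed min and max with the first stored value so the comparisons start from real data.
    summary.min = summary.max = basket.begin()->second;
    for (const auto& entry : basket) {
        summary.sum += entry.second;
        summary.min = std::min(summary.min, entry.second);
        summary.max = std::max(summary.max, entry.second);
    }
    summary.average = static_cast<double>(summary.sum) / static_cast<double>(summary.count);
    return summary;
}
```

For the lesson's basket this would give a sum of 17, a count of 3, a minimum of 4, a maximum of 8, and an average of roughly 5.67.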
Lesson Summary and Practice

Congratulations on learning about data aggregation! You've practiced the Sum, Count, Max, Min, and Average operations on an unordered_map, which you can now apply to real-world problems.

These aggregation skills carry over to many data analysis tasks, such as generating reports or supporting decisions with summary figures. Up next are practice exercises that will solidify today's material. See you then! Happy coding!
