Lesson 3

Greetings, learners! Today's focus is **data aggregation**, a practical concept, with hash maps (`std::unordered_map` in C++) as our principal tool.

Data aggregation refers to gathering “raw” data and presenting it in an analysis-friendly format. A helpful analogy is viewing a cityscape from an airplane: you get an informative aerial overview rather than the specifics of individual buildings. We'll introduce you to the `Sum`, `Average`, `Count`, `Maximum`, and `Minimum` operations for practical, hands-on experience.

Let's dive in!

**Data aggregation** is a cornerstone of data analysis: it condenses data into a more manageable, summarized format. Imagine identifying the total number of apples in a basket at a glance instead of counting each apple individually. In C++, we can do this by grouping and summarizing values, with `unordered_map` being instrumental in this process.
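To make the grouping step concrete, here is a minimal sketch of how raw, ungrouped data could be condensed into an `unordered_map`. The helper name `count_by_name` and its parameter are assumptions made for this illustration, not part of the lesson.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the lesson): tallies how often each
// name appears in a raw, ungrouped list.
std::unordered_map<std::string, int> count_by_name(const std::vector<std::string>& raw_items) {
    std::unordered_map<std::string, int> counts;
    for (const auto& item : raw_items) {
        // operator[] inserts a missing key with the value 0 before the increment
        ++counts[item];
    }
    return counts;
}
```

Calling `count_by_name` with a list that names `"apples"` three times yields a map in which `"apples"` maps to 3, which is the "at a glance" summary described above.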

Let's see how `unordered_map` assists us in data aggregation. Picture a C++ `unordered_map` whose keys are different fruit types and whose values are their respective quantities. By iterating over the map, we can total all the quantities and, in the same way, compute the `Sum`, `Count`, `Max`, `Min`, and `Average` operations.

Let's delve into a hands-on example using a fruit basket represented as an `unordered_map`:

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Summing the values in the unordered_map
    int total_fruits = 0;
    for (const auto& pair : fruit_basket) {
        total_fruits += pair.second;
    }

    std::cout << "The total number of fruits in the basket is: " << total_fruits << std::endl;
    // It outputs: "The total number of fruits in the basket is: 17"

    return 0;
}
```
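As an alternative to the hand-written loop, the same sum can be sketched with `std::accumulate` from the `<numeric>` header. The helper name `total_quantity` is an assumption made for this illustration, and the generic lambda requires C++14 or later.

```cpp
#include <numeric>
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the lesson): sums every value in the map.
int total_quantity(const std::unordered_map<std::string, int>& basket) {
    return std::accumulate(
        basket.begin(), basket.end(), 0,
        // The first argument is the running total, the second is one key-value pair
        [](int running_total, const auto& entry) {
            return running_total + entry.second;
        });
}
```

The loop version is arguably easier to read for beginners; `std::accumulate` mainly helps when the summing step is passed around or combined with other algorithms.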

Just as easily, we can count the number of fruit types in our basket, which corresponds to the number of keys in our `unordered_map`.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Counting the elements in the unordered_map
    // size() returns an unsigned size type, so we store it in std::size_t
    std::size_t count_fruits = fruit_basket.size();
    std::cout << "The number of fruit types in the basket is: " << count_fruits << std::endl;
    // It outputs: "The number of fruit types in the basket is: 3"

    return 0;
}
```
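Counting can also be conditional. As a rough sketch, `std::count_if` from `<algorithm>` counts only the entries that satisfy a predicate. The helper name `count_types_with_at_least` and its `threshold` parameter are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the lesson): counts the fruit types whose
// quantity is at least `threshold`.
std::size_t count_types_with_at_least(const std::unordered_map<std::string, int>& basket,
                                      int threshold) {
    // count_if returns a signed difference type, so we convert it explicitly
    return static_cast<std::size_t>(std::count_if(
        basket.begin(), basket.end(),
        [threshold](const auto& entry) { return entry.second >= threshold; }));
}
```

For the lesson's basket and a threshold of 5, this counts apples (5) and oranges (8) but not bananas (4).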

C++ does provide `std::max` and `std::min`, but they compare the values passed to them directly rather than searching an `unordered_map` for its highest or lowest value. Instead, we use the standard library functions `max_element` and `min_element` alongside lambda functions to define custom comparison logic.

The `max_element` and `min_element` functions are declared in the `<algorithm>` header. They return an iterator to the largest and smallest element in a range, respectively. In the context of an `unordered_map`, these functions can help us find the key-value pairs with the highest and lowest values.

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Finding the key with the maximum value
    auto max_fruit = std::max_element(fruit_basket.begin(), fruit_basket.end(),
                                      [](const auto& a, const auto& b) {
                                          return a.second < b.second;
                                      })->first;

    std::cout << "The fruit with the most quantity is: " << max_fruit << std::endl;
    // It outputs: "The fruit with the most quantity is: oranges"

    // Finding the key with the minimum value
    auto min_fruit = std::min_element(fruit_basket.begin(), fruit_basket.end(),
                                      [](const auto& a, const auto& b) {
                                          return a.second < b.second;
                                      })->first;

    std::cout << "The fruit with the least quantity is: " << min_fruit << std::endl;
    // It outputs: "The fruit with the least quantity is: bananas"

    return 0;
}
```

In the example above, a lambda function is passed as the third argument to both `max_element` and `min_element`. A lambda function is an anonymous function defined with the syntax `[]() {}`. Using `auto` for its parameters, as we do here, requires C++14 or later.

Here is a breakdown of the lambda function used:

```cpp
[](const auto& a, const auto& b) {
    return a.second < b.second;
}
```

- `[](const auto& a, const auto& b) { ... }`: This part declares a lambda function that takes two parameters, `a` and `b`, both representing key-value pairs from the `unordered_map`.
- `return a.second < b.second;`: The lambda function compares the `second` member (the value) of the two key-value pairs. It returns `true` if the value of `a` is less than the value of `b`, which tells `max_element` and `min_element` how to order the elements when looking for the largest or smallest one.
- `->first`: The `->first` at the end of the `max_element` and `min_element` calls indicates that we are interested in the `first` member (the key) of the key-value pair that the returned iterator points to.
- `->second`: Similarly, `->second` would be used if we needed to access the value part of the key-value pair. In our lambda function, `a.second` and `b.second` refer to the values stored in the key-value pairs `a` and `b`.
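The breakdown above can be sketched with the pair type written out explicitly and with `->second` in place of `->first`. The aliases `Basket` and `Entry` and the helper `largest_quantity` are names assumed for this illustration, and the sketch assumes the basket is not empty.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>

using Basket = std::unordered_map<std::string, int>;

// Each element of the map is a std::pair<const std::string, int>:
// .first is the key (fruit name), .second is the value (quantity).
using Entry = Basket::value_type;

// Hypothetical helper (not from the lesson).
// Assumed precondition: basket is not empty. For an empty map, max_element
// returns end(), and dereferencing it is undefined behavior.
int largest_quantity(const Basket& basket) {
    // The same comparison as in the lesson, stored in a named variable
    auto by_quantity = [](const Entry& a, const Entry& b) {
        return a.second < b.second;
    };
    // ->second reads the value of the winning pair instead of its key
    return std::max_element(basket.begin(), basket.end(), by_quantity)->second;
}
```

Spelling out `Entry` instead of `auto` makes it visible that `a` and `b` are whole key-value pairs, not keys.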

By using `max_element` and `min_element` with the lambda function, we find the key associated with the maximum or minimum value in the `unordered_map`, with each call making a single pass over the elements.
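If both extremes are needed, one possible variation is `std::minmax_element`, which finds them in a single traversal. The sketch below also guards against an empty map. The `Extremes` struct and the `find_extremes` helper are names assumed for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>

// Hypothetical result type for this sketch
struct Extremes {
    std::string least;  // key with the smallest value
    std::string most;   // key with the largest value
};

// Hypothetical helper (not from the lesson). Returns false for an empty
// basket, where there is no minimum or maximum to report.
bool find_extremes(const std::unordered_map<std::string, int>& basket, Extremes& out) {
    if (basket.empty()) {
        return false;
    }
    auto by_quantity = [](const auto& a, const auto& b) {
        return a.second < b.second;
    };
    // minmax_element returns a pair of iterators: {smallest, largest}
    auto result = std::minmax_element(basket.begin(), basket.end(), by_quantity);
    out.least = result.first->first;
    out.most = result.second->first;
    return true;
}
```

Note that when several fruits tie for the smallest or largest quantity, which one is reported depends on the map's iteration order, which `unordered_map` does not specify.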

Similar to finding the total quantity of fruits, we can calculate the average quantity per fruit type by summing the values in the `unordered_map` and calling `size()`. Here, we divide the total quantity of fruits by the number of fruit types to determine the average.

```cpp
#include <iomanip>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // An unordered_map representing our fruit basket
    std::unordered_map<std::string, int> fruit_basket = {{"apples", 5}, {"bananas", 4}, {"oranges", 8}};

    // Summing the values
    int total_fruits = 0;
    for (const auto& pair : fruit_basket) {
        total_fruits += pair.second;
    }

    // Calculating the average; the cast avoids integer division
    double average_fruits = static_cast<double>(total_fruits) / fruit_basket.size();

    // Print with two decimal places (the default formatting would show 5.66667)
    std::cout << "The average number of each type of fruit in the basket is: "
              << std::fixed << std::setprecision(2) << average_fruits << std::endl;
    // It outputs: "The average number of each type of fruit in the basket is: 5.67"

    return 0;
}
```
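The division above assumes the basket contains at least one fruit type. A minimal sketch of a guarded version follows. The helper name `average_quantity` and the choice to return `0.0` for an empty basket are assumptions made for this illustration.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the lesson): average value per key.
// Returning 0.0 for an empty basket is a choice made for this sketch;
// it avoids dividing by zero.
double average_quantity(const std::unordered_map<std::string, int>& basket) {
    if (basket.empty()) {
        return 0.0;
    }
    int total = 0;
    for (const auto& entry : basket) {
        total += entry.second;
    }
    // Cast before dividing so the result keeps its fractional part
    return static_cast<double>(total) / basket.size();
}
```

Depending on the application, signaling the empty case differently (for example, with an error) may be more appropriate than returning `0.0`.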

Congratulations on learning about data aggregation! You've mastered `Sum`, `Count`, `Max`, `Min`, and `Average` operations, thus enhancing your knowledge base for real-world applications.

The skills you've acquired in data aggregation using `unordered_map` are invaluable across a vast array of data analysis tasks, such as report generation or decision-making processes. Up next are practice exercises that will solidify what you learned today. See you then! Happy coding!