Lesson 4

Welcome to today's lesson! Our topic for the day is **data aggregation**, a crucial aspect of data analysis. Like summarizing a massive book into key points, **data aggregation** summarizes large amounts of data into important highlights.

By the end of today, you'll be equipped with several aggregation methods to summarize data streams in `C++`. Let's get started!

Let's say we have a `std::vector` of integers holding the ages of a group of people. We will demonstrate several basic aggregation methods using functions from the C++ standard library:

```cpp
#include <iostream>
#include <vector>
#include <numeric>   // For std::accumulate
#include <algorithm> // For std::min_element, std::max_element

int main() {
    std::vector<int> ages = {21, 23, 20, 25, 22, 27, 24, 22, 25, 22, 23, 22};

    // Number of people
    int num_people = ages.size();
    std::cout << "Number of people: " << num_people << std::endl;

    // Total age
    int total_ages = std::accumulate(ages.begin(), ages.end(), 0);
    std::cout << "Total age: " << total_ages << std::endl;

    // Youngest age
    int youngest_age = *std::min_element(ages.begin(), ages.end());
    std::cout << "Youngest age: " << youngest_age << std::endl;

    // Oldest age
    int oldest_age = *std::max_element(ages.begin(), ages.end());
    std::cout << "Oldest age: " << oldest_age << std::endl;

    // Average age
    double average_age = static_cast<double>(total_ages) / num_people;
    std::cout << "Average age: " << average_age << std::endl;

    // Age range
    int age_range = oldest_age - youngest_age;
    std::cout << "Age range: " << age_range << std::endl;

    return 0;
}
```

Here's a brief overview of the above code snippet:

- **Number of people**: Uses `ages.size()` to get the number of elements in the vector.
- **Total age**: Uses `std::accumulate` from the `<numeric>` header to sum all elements in the vector.
- **Youngest age**: Uses `std::min_element` from the `<algorithm>` header to find the smallest element.
- **Oldest age**: Uses `std::max_element` from the `<algorithm>` header to find the largest element.
- **Average age**: Calculates the average age by dividing the total age by the number of people.
- **Age range**: Computes the range of ages by subtracting the youngest age from the oldest age.

These functions provide essential aggregation operations and are widely used with data streams.
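One detail the snippet above leaves implicit is what happens when the vector is empty: dereferencing the iterator returned by `std::min_element` or `std::max_element` on an empty range is undefined behavior, and the average would divide by zero. The following is a minimal sketch of one way to guard against that. The `AgeSummary` struct and the `summarize` function are hypothetical names introduced here for illustration, and `std::minmax_element` is swapped in to find both extremes in a single pass.

```cpp
#include <algorithm>  // std::minmax_element
#include <cstddef>    // std::size_t
#include <numeric>    // std::accumulate
#include <vector>

// Hypothetical helper type (not part of the lesson's code) that bundles
// the aggregates computed in the lesson.
struct AgeSummary {
    std::size_t count;
    long long total;
    int youngest;
    int oldest;
    double average;
    int range;
};

// Fills `out` and returns true when `ages` is non-empty.
// Returns false and leaves `out` untouched when `ages` is empty,
// so no end iterator is dereferenced and no division by zero occurs.
bool summarize(const std::vector<int>& ages, AgeSummary& out) {
    if (ages.empty()) {
        return false;
    }

    // One pass that yields iterators to the smallest and largest elements.
    const auto extremes = std::minmax_element(ages.begin(), ages.end());

    out.count = ages.size();
    // The initial value 0LL makes the sum accumulate as long long.
    out.total = std::accumulate(ages.begin(), ages.end(), 0LL);
    out.youngest = *extremes.first;
    out.oldest = *extremes.second;
    out.average = static_cast<double>(out.total) / static_cast<double>(out.count);
    out.range = out.oldest - out.youngest;
    return true;
}
```

A caller would check the returned `bool` before reading any field, for example printing the summary only when `summarize(ages, summary)` returns `true`.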

For deeper analysis, such as calculating the mode or most frequent age, we can use `for` and `while` loops.

For example, using `for` loops, we can find the mode or most frequent age:

```cpp
#include <iostream>
#include <vector>
#include <map>

int main() {
    std::vector<int> ages = {21, 23, 20, 25, 22, 27, 24, 22, 25, 22, 23, 22};

    // Initialize a map to store the frequency of each age
    std::map<int, int> frequencies;

    // Use a for loop to populate frequencies
    for (int age : ages) {
        frequencies[age]++;
    }

    // Find the age with max frequency
    int max_freq = 0;
    int mode_age = -1;
    for (const auto& pair : frequencies) {
        if (pair.second > max_freq) {
            max_freq = pair.second;
            mode_age = pair.first;
        }
    }
    std::cout << "Max frequency: " << max_freq << std::endl; // Max frequency: 4
    std::cout << "Mode age: " << mode_age << std::endl; // Mode age: 22

    return 0;
}
```
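The loop above reports a single mode. Because `std::map` iterates its keys in ascending order and the comparison is a strict `>`, a tie between several ages resolves to the smallest one. If every tied age matters, one possible variation is sketched below. The function name `all_modes` is hypothetical and chosen only for this example.

```cpp
#include <map>
#include <vector>

// Hypothetical helper: returns every age that shares the highest frequency,
// in ascending order. Returns an empty vector when `ages` is empty.
std::vector<int> all_modes(const std::vector<int>& ages) {
    // Count how often each age occurs.
    std::map<int, int> frequencies;
    for (int age : ages) {
        ++frequencies[age];
    }

    // First pass over the map: find the highest frequency.
    int max_freq = 0;
    for (const auto& entry : frequencies) {
        if (entry.second > max_freq) {
            max_freq = entry.second;
        }
    }

    // Second pass: collect every age that reaches that frequency.
    std::vector<int> modes;
    for (const auto& entry : frequencies) {
        if (entry.second == max_freq) {
            modes.push_back(entry.first);
        }
    }
    return modes;
}
```

For the lesson's `ages` vector this returns a vector containing only `22`, while an input such as `{21, 22, 22, 21, 23}` yields both `21` and `22`.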

`while` loops can also be used similarly for complex tasks.
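As a rough illustration of the `while` approach, the sketch below finds the mode without a map: it sorts a copy of the data so equal ages sit next to each other, then uses nested `while` loops to measure each run of equal values. The function name `mode_with_while` is hypothetical, and returning `-1` for empty input is an assumption that mirrors the `mode_age = -1` starting value in the example above.

```cpp
#include <algorithm>  // std::sort
#include <cstddef>    // std::size_t
#include <vector>

// Hypothetical helper: returns the most frequent age, or -1 for empty input.
// The vector is taken by value so the caller's data is not reordered.
int mode_with_while(std::vector<int> ages) {
    std::sort(ages.begin(), ages.end());

    int mode_age = -1;
    std::size_t max_freq = 0;
    std::size_t i = 0;

    // Outer loop: `i` marks the start of a run of equal ages.
    while (i < ages.size()) {
        std::size_t j = i;
        // Inner loop: advance `j` to one past the end of the current run.
        while (j < ages.size() && ages[j] == ages[i]) {
            ++j;
        }
        // The run length is j - i; keep it only if it is strictly longer,
        // so ties resolve to the smallest age.
        if (j - i > max_freq) {
            max_freq = j - i;
            mode_age = ages[i];
        }
        i = j;  // Jump to the start of the next run.
    }
    return mode_age;
}
```

Called on the lesson's `ages` vector, this returns `22`, matching the map-based version. Sorting costs O(n log n), so this variant trades the extra map for a sorted copy of the input.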

Fantastic! You've just learned how to use basic and advanced data aggregation methods in `C++`. These techniques are pivotal in data analysis and understanding. Now, get ready for the practical tasks lined up next. They'll reinforce the skills you've just gained. Remember, the more you practice, the better you become. Good luck with your practice!