Lesson 2
Applying Data Filtering and Aggregation in C++ with a User Management System
Introduction

Welcome to today's lesson on applying data filtering and aggregation in a real-world scenario using a user management system. We'll start by building a foundational structure that can handle basic user operations. Then, we'll expand it by introducing more advanced functionalities that allow filtering and aggregating user data.

Starter Task Methods

In our starter task, we will implement a class that manages basic operations on a collection of user data, specifically handling adding new users, retrieving user profiles, and updating user profiles.

Here are the starter task methods:

  • bool add_user(std::string user_id, int age, std::string country, bool subscribed) - adds a new user with the specified attributes. Returns true if the user was added successfully and false if a user with the same user_id already exists.
  • std::optional<UserProfile> get_user(std::string user_id) - returns the user's profile if the user exists; otherwise, returns std::nullopt.
  • bool update_user(std::string user_id, std::optional<int> age, std::optional<std::string> country, std::optional<bool> subscribed) - updates only the fields whose arguments hold a value; an argument of std::nullopt leaves that field unchanged. Returns true if the user exists and was updated, false otherwise.

To store the user data, we will define a UserProfile custom struct.
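The "update only what was supplied" behavior described for update_user can be sketched in isolation. This is a minimal illustration rather than the lesson's implementation: the helper name apply_update is hypothetical, and UserProfile simply mirrors the struct described above.

```cpp
#include <optional>
#include <string>

// Mirrors the UserProfile struct used in this lesson.
struct UserProfile {
    int age;
    std::string country;
    bool subscribed;
};

// Hypothetical helper: overwrite only the fields whose optional holds a value.
// A std::nullopt argument leaves the corresponding field untouched.
void apply_update(UserProfile& profile,
                  const std::optional<int>& age,
                  const std::optional<std::string>& country,
                  const std::optional<bool>& subscribed) {
    if (age.has_value()) {
        profile.age = *age;
    }
    if (country.has_value()) {
        profile.country = *country;
    }
    // has_value() checks presence, not the stored bool:
    // an optional<bool> holding false still counts as "set".
    if (subscribed.has_value()) {
        profile.subscribed = *subscribed;
    }
}
```

Calling apply_update(profile, 26, std::nullopt, false) on a profile would change its age and subscription flag while keeping its country. The std::optional<bool> case is the one to watch: testing the optional itself asks whether a value was supplied, not whether that value is true.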

Starter Task Implementation

Here is the implementation of our starter task in C++:

C++
#include <iostream>
#include <string>
#include <map>
#include <optional>

struct UserProfile {
    int age;
    std::string country;
    bool subscribed;
};

class UserManager {
public:
    bool add_user(const std::string& user_id, int age, const std::string& country, bool subscribed) {
        if (users.find(user_id) != users.end()) {
            return false;
        }
        users[user_id] = UserProfile{age, country, subscribed};
        return true;
    }

    std::optional<UserProfile> get_user(const std::string& user_id) {
        if (users.find(user_id) != users.end()) {
            return users[user_id];
        }
        return std::nullopt;
    }

    bool update_user(const std::string& user_id, std::optional<int> age, std::optional<std::string> country, std::optional<bool> subscribed) {
        if (users.find(user_id) == users.end()) {
            return false;
        }
        if (age) {
            users[user_id].age = age.value();
        }
        if (country) {
            users[user_id].country = country.value();
        }
        if (subscribed) {
            users[user_id].subscribed = subscribed.value();
        }
        return true;
    }

private:
    std::map<std::string, UserProfile> users;
};

// Example usage
int main() {
    UserManager um;
    std::cout << std::boolalpha;
    std::cout << um.add_user("u1", 25, "USA", true) << std::endl;      // true
    std::cout << um.add_user("u2", 30, "Canada", false) << std::endl;  // true
    std::cout << um.add_user("u1", 22, "Mexico", true) << std::endl;   // false

    if (auto user = um.get_user("u1")) {
        std::cout << user->age << std::endl;  // 25
    }

    std::cout << um.update_user("u1", 26, std::nullopt, std::nullopt) << std::endl;  // true
    std::cout << um.update_user("u3", 19, "UK", false) << std::endl;                 // false

    return 0;
}

This implementation covers all our starter methods. Let's move forward and introduce more complex functionalities.

Introducing New Methods for Data Filtering and Aggregation

With our foundational structure in place, it's time to add functionalities for filtering user data and aggregating statistics.

Here are the new methods to implement:

  • std::vector<std::string> filter_users(std::optional<int> min_age, std::optional<int> max_age, std::optional<std::string> country, std::optional<bool> subscribed):
    • Returns the list of user IDs that match the specified criteria. Any criterion can be std::nullopt, meaning that it is not applied during filtering.
  • std::map<std::string, float> aggregate_stats() - returns statistics in the form of a map:
    • total_users: Total number of users (as a float)
    • average_age: Average age of all users (rounded down to the nearest integer)
    • subscribed_ratio: Ratio of subscribed users to total users (as a float rounded to two decimal places)
Step 1: Adding `filter_users` Method

This method filters users based on the criteria provided. Let's see how it works:

C++
#include <vector>
#include <algorithm>

class UserManager {
public:
    // Existing methods...

    std::vector<std::string> filter_users(std::optional<int> min_age, std::optional<int> max_age, std::optional<std::string> country, std::optional<bool> subscribed) {
        std::vector<std::string> filtered_users;
        for (const auto& [user_id, profile] : users) {
            // Each criterion is checked only when it holds a value.
            if (min_age && profile.age < *min_age) {
                continue;
            }
            if (max_age && profile.age > *max_age) {
                continue;
            }
            if (country && profile.country != *country) {
                continue;
            }
            if (subscribed && profile.subscribed != *subscribed) {
                continue;
            }
            filtered_users.push_back(user_id);
        }
        return filtered_users;
    }

private:
    std::map<std::string, UserProfile> users;
};

// Example usage of the new method
int main() {
    UserManager um;
    um.add_user("u1", 25, "USA", true);
    um.add_user("u2", 30, "Canada", false);
    um.add_user("u3", 22, "USA", true);

    auto result1 = um.filter_users(20, 30, "USA", true);
    for (const auto& user_id : result1) {
        std::cout << user_id << " ";  // u1 u3
    }
    std::cout << std::endl;

    auto result2 = um.filter_users(std::nullopt, 28, std::nullopt, std::nullopt);
    for (const auto& user_id : result2) {
        std::cout << user_id << " ";  // u1 u3
    }
    std::cout << std::endl;

    auto result3 = um.filter_users(std::nullopt, std::nullopt, "Canada", false);
    for (const auto& user_id : result3) {
        std::cout << user_id << " ";  // u2
    }
    std::cout << std::endl;

    return 0;
}

  • The filter_users method filters users based on min_age, max_age, country, and subscribed status criteria.
  • It iterates over the users map and checks each user's profile against the provided criteria.
  • Users who meet all the criteria are added to the filtered_users list, which is then returned.
  • The example usage shows how to add users and filter them based on different criteria.
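The per-user check described in the points above can also be factored into a standalone predicate, which some readers may find easier to test on its own. The following is a sketch under assumptions, not part of the lesson's UserManager: the FilterCriteria struct and the matches and filter_ids functions are hypothetical names introduced only for illustration.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Mirrors the UserProfile struct used in this lesson.
struct UserProfile {
    int age;
    std::string country;
    bool subscribed;
};

// Hypothetical bundle of the four optional criteria.
// A default-constructed FilterCriteria applies no filtering at all.
struct FilterCriteria {
    std::optional<int> min_age;
    std::optional<int> max_age;
    std::optional<std::string> country;
    std::optional<bool> subscribed;
};

// Returns true when the profile satisfies every criterion that is set.
bool matches(const UserProfile& profile, const FilterCriteria& c) {
    if (c.min_age && profile.age < *c.min_age) return false;
    if (c.max_age && profile.age > *c.max_age) return false;
    if (c.country && profile.country != *c.country) return false;
    if (c.subscribed && profile.subscribed != *c.subscribed) return false;
    return true;
}

// Collects the IDs of matching users. std::map iterates in ascending
// key order, so the returned IDs are sorted.
std::vector<std::string> filter_ids(const std::map<std::string, UserProfile>& users,
                                    const FilterCriteria& c) {
    std::vector<std::string> ids;
    for (const auto& [user_id, profile] : users) {
        if (matches(profile, c)) {
            ids.push_back(user_id);
        }
    }
    return ids;
}
```

Both age bounds are inclusive here, matching the comparisons in filter_users: a user whose age equals min_age or max_age is kept.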
Step 2: Adding `aggregate_stats` Method

This method aggregates statistics from the user profiles. Let's implement it:

C++
#include <numeric>
#include <cmath>

class UserManager {
public:
    // Existing methods...

    std::map<std::string, float> aggregate_stats() {
        float total_users = static_cast<float>(users.size());
        // Return zeroed statistics early to avoid dividing by zero.
        if (total_users == 0) {
            return {{"total_users", 0.0f}, {"average_age", 0.0f}, {"subscribed_ratio", 0.00f}};
        }

        float total_age = std::accumulate(users.begin(), users.end(), 0.0f, [](float sum, const auto& pair) {
            return sum + pair.second.age;
        });
        float subscribed_users = static_cast<float>(std::count_if(users.begin(), users.end(), [](const auto& pair) {
            return pair.second.subscribed;
        }));

        // Round the average down; round the ratio to two decimal places.
        float average_age = std::floor(total_age / total_users);
        float subscribed_ratio = std::round((subscribed_users / total_users) * 100) / 100;

        return {{"total_users", total_users}, {"average_age", average_age}, {"subscribed_ratio", subscribed_ratio}};
    }
};

// Example usage with the same sample users as in the previous section
int main() {
    UserManager um;
    um.add_user("u1", 25, "USA", true);
    um.add_user("u2", 30, "Canada", false);
    um.add_user("u3", 22, "USA", true);

    auto stats = um.aggregate_stats();
    std::cout << "Total users: " << stats["total_users"] << std::endl;            // 3
    std::cout << "Average age: " << stats["average_age"] << std::endl;            // 25
    std::cout << "Subscribed ratio: " << stats["subscribed_ratio"] << std::endl;  // 0.67

    return 0;
}

The aggregate_stats method calculates and returns aggregate statistics about the users as a map. It first determines total_users, the total number of users. If there are no users, it returns a map with all statistics set to zero, which also avoids dividing by zero. Otherwise, it calculates total_age by summing the ages of all users with std::accumulate and counts the subscribed users with std::count_if. It then computes average_age by dividing total_age by total_users and rounding the result down with std::floor, and calculates subscribed_ratio by dividing subscribed_users by total_users and rounding to two decimal places. The returned map contains total_users, average_age, and subscribed_ratio.
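The two rounding steps can be checked by hand against the sample data: the ages 25, 30, and 22 sum to 77, and 77 / 3 ≈ 25.67, which rounds down to 25; two of three users are subscribed, and 2 / 3 ≈ 0.6667, which becomes 66.67 after multiplying by 100, rounds to 67, and yields 0.67. The sketch below isolates that arithmetic. The helper names floored_average and rounded_ratio are hypothetical and are not part of the lesson's class.

```cpp
#include <cmath>

// Hypothetical helper: average rounded down to a whole number.
// Assumes count is non-zero; the caller handles the empty case.
float floored_average(float total, float count) {
    return std::floor(total / count);
}

// Hypothetical helper: ratio rounded to two decimal places by
// scaling to hundredths, rounding, and scaling back.
// Assumes whole is non-zero.
float rounded_ratio(float part, float whole) {
    return std::round((part / whole) * 100.0f) / 100.0f;
}
```

With the sample data, floored_average(77.0f, 3.0f) gives 25 and rounded_ratio(2.0f, 3.0f) gives the float closest to 0.67. Because a float cannot represent most decimal fractions exactly, the rounding controls the stored value only approximately; to guarantee two digits when printing, std::fixed and std::setprecision(2) from <iomanip> can be applied to the output stream.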

The Final Solution

Here's the complete UserManager class with all methods, including the new ones for filtering and aggregation:

C++
#include <iostream>
#include <string>
#include <map>
#include <optional>
#include <vector>
#include <algorithm>
#include <numeric>
#include <cmath>

struct UserProfile {
    int age;
    std::string country;
    bool subscribed;
};

class UserManager {
public:
    bool add_user(const std::string& user_id, int age, const std::string& country, bool subscribed) {
        if (users.find(user_id) != users.end()) {
            return false;
        }
        users[user_id] = UserProfile{age, country, subscribed};
        return true;
    }

    std::optional<UserProfile> get_user(const std::string& user_id) {
        if (users.find(user_id) != users.end()) {
            return users[user_id];
        }
        return std::nullopt;
    }

    bool update_user(const std::string& user_id, std::optional<int> age, std::optional<std::string> country, std::optional<bool> subscribed) {
        if (users.find(user_id) == users.end()) {
            return false;
        }
        if (age) {
            users[user_id].age = age.value();
        }
        if (country) {
            users[user_id].country = country.value();
        }
        if (subscribed) {
            users[user_id].subscribed = subscribed.value();
        }
        return true;
    }

    std::vector<std::string> filter_users(std::optional<int> min_age, std::optional<int> max_age, std::optional<std::string> country, std::optional<bool> subscribed) {
        std::vector<std::string> filtered_users;
        for (const auto& [user_id, profile] : users) {
            // Each criterion is checked only when it holds a value.
            if (min_age && profile.age < *min_age) {
                continue;
            }
            if (max_age && profile.age > *max_age) {
                continue;
            }
            if (country && profile.country != *country) {
                continue;
            }
            if (subscribed && profile.subscribed != *subscribed) {
                continue;
            }
            filtered_users.push_back(user_id);
        }
        return filtered_users;
    }

    std::map<std::string, float> aggregate_stats() {
        float total_users = static_cast<float>(users.size());
        // Return zeroed statistics early to avoid dividing by zero.
        if (total_users == 0) {
            return {{"total_users", 0.0f}, {"average_age", 0.0f}, {"subscribed_ratio", 0.00f}};
        }

        float total_age = std::accumulate(users.begin(), users.end(), 0.0f, [](float sum, const auto& pair) {
            return sum + pair.second.age;
        });
        float subscribed_users = static_cast<float>(std::count_if(users.begin(), users.end(), [](const auto& pair) {
            return pair.second.subscribed;
        }));

        // Round the average down; round the ratio to two decimal places.
        float average_age = std::floor(total_age / total_users);
        float subscribed_ratio = std::round((subscribed_users / total_users) * 100) / 100;

        return {{"total_users", total_users}, {"average_age", average_age}, {"subscribed_ratio", subscribed_ratio}};
    }

private:
    std::map<std::string, UserProfile> users;
};

// Example usage
int main() {
    UserManager um;
    um.add_user("u1", 25, "USA", true);
    um.add_user("u2", 30, "Canada", false);
    um.add_user("u3", 22, "USA", true);

    auto result1 = um.filter_users(20, 30, "USA", true);
    for (const auto& user_id : result1) {
        std::cout << user_id << " ";  // u1 u3
    }
    std::cout << std::endl;

    auto result2 = um.filter_users(std::nullopt, 28, std::nullopt, std::nullopt);
    for (const auto& user_id : result2) {
        std::cout << user_id << " ";  // u1 u3
    }
    std::cout << std::endl;

    auto result3 = um.filter_users(std::nullopt, std::nullopt, "Canada", false);
    for (const auto& user_id : result3) {
        std::cout << user_id << " ";  // u2
    }
    std::cout << std::endl;

    auto stats = um.aggregate_stats();
    std::cout << "Total users: " << stats["total_users"] << std::endl;            // 3
    std::cout << "Average age: " << stats["average_age"] << std::endl;            // 25
    std::cout << "Subscribed ratio: " << stats["subscribed_ratio"] << std::endl;  // 0.67

    return 0;
}

Lesson Summary

Great job! Today, you've learned how to effectively handle user data by implementing advanced functionalities like filtering and aggregation on top of a basic system. This is a critical skill in real-life software development, where you often need to extend existing systems to meet new requirements.

I encourage you to practice solving similar challenges to solidify your understanding of data filtering and aggregation. Happy coding, and see you in the next lesson!
