Lesson 2
Diving Into Filtering Data Streams in C++

Welcome to our hands-on tutorial on data filtering in C++. In this session, we spotlight data filtering, a simple yet powerful aspect of programming and data manipulation. By learning to filter data, we can extract only the elements that meet specific criteria and discard everything else.

Grasping the Concept of Filtering

In the real world, data filtering mirrors the process of sieving. Let's visualize this. Imagine you're shopping online for a shirt. You can filter clothes by color, size, brand, and so on. Translating this to programming, the clothing items are our data, and the sieve is the Boolean logic and algorithms we use to decide which items to keep.

Discovering Data Filtering using Loops

In programming, loops enable coders to execute a block of code repetitively, making them handy tools in data filtering. C++ uses for and while loops that iterate through data structures like arrays or vectors, checking each data element against specific criteria.

For instance, let's build a class, DataFilter, that keeps only the numbers less than ten in a vector:

C++
#include <vector>
#include <iostream>

class DataFilter {
public:
    std::vector<int> filterWithLoops(const std::vector<int> &dataStream) {
        std::vector<int> filteredData;
        for (int item : dataStream) {
            if (item < 10) {
                filteredData.push_back(item);
            }
        }
        return filteredData;
    }
};

int main() {
    std::vector<int> dataStream = {23, 5, 7, 12, 19, 2};
    DataFilter df;

    std::vector<int> filteredData = df.filterWithLoops(dataStream);
    std::cout << "Filtered data by loops:";
    for (int item : filteredData) {
        std::cout << " " << item;
    }
    std::cout << std::endl;
    // Output: Filtered data by loops: 5 7 2

    return 0;
}

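Notice how the for loop and the if statement work together: the loop visits every element, and the condition decides whether that element is pushed into filteredData. Only the numbers less than ten are kept, in their original order.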

Decoding Data Filtering with the `std::copy_if` Function

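The C++ standard library, often still called the Standard Template Library (STL), provides ready-made algorithms for common operations on ranges of elements. One such algorithm is std::copy_if, declared in the <algorithm> header and available since C++11. It copies to a destination only those elements for which a given condition (a predicate) returns true.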

Let’s use std::copy_if to refactor our filter function:

C++
#include <vector>
#include <iostream>
#include <algorithm>
#include <iterator>  // std::back_inserter

class DataFilter {
public:
    std::vector<int> filterWithCopyIf(const std::vector<int> &dataStream) {
        std::vector<int> filteredData;
        // Append every element for which the lambda returns true
        std::copy_if(dataStream.begin(), dataStream.end(), std::back_inserter(filteredData), [](int item) {
            return item < 10;
        });
        return filteredData;
    }
};

int main() {
    std::vector<int> dataStream = {23, 5, 7, 12, 19, 2};
    DataFilter df;

    std::vector<int> filteredData = df.filterWithCopyIf(dataStream);
    std::cout << "Filtered data by std::copy_if:";
    for (int item : filteredData) {
        std::cout << " " << item;
    }
    std::cout << std::endl;
    // Output: Filtered data by std::copy_if: 5 7 2
    return 0;
}

In the above example, std::copy_if replaces the hand-written loop and if statement with a single call. The lambda function is the predicate that checks whether an item is less than ten, and std::back_inserter appends each matching item to filteredData.

Bundling Data Filtering Methods into a Class and Extending Functionality

We have showcased C++ techniques for data filtering in the DataFilter class. Now, let's extend the functionality of our DataFilter class to handle different filtering criteria. We'll add more methods to demonstrate its versatility and reusability:

C++
#include <vector>
#include <iostream>
#include <algorithm>
#include <functional>  // std::function
#include <iterator>    // std::back_inserter

class DataFilter {
public:
    std::vector<int> filterWithLoops(const std::vector<int> &dataStream) {
        std::vector<int> filteredData;
        for (int item : dataStream) {
            if (item < 10) {
                filteredData.push_back(item);
            }
        }
        return filteredData;
    }

    std::vector<int> filterWithCopyIf(const std::vector<int> &dataStream) {
        std::vector<int> filteredData;
        std::copy_if(dataStream.begin(), dataStream.end(), std::back_inserter(filteredData), [](int item) {
            return item < 10;
        });
        return filteredData;
    }

    // Filter based on a custom predicate
    std::vector<int> filterByPredicate(const std::vector<int> &dataStream, std::function<bool(int)> predicate) {
        std::vector<int> filteredData;
        std::copy_if(dataStream.begin(), dataStream.end(), std::back_inserter(filteredData), predicate);
        return filteredData;
    }
};

int main() {
    std::vector<int> dataStream = {23, 5, 7, 12, 19, 2};
    DataFilter df;

    // Filtering using loops
    std::vector<int> filteredData = df.filterWithLoops(dataStream);
    std::cout << "Filtered data by loops:";
    for (int item : filteredData) {
        std::cout << " " << item;
    }
    std::cout << std::endl;
    // Output: Filtered data by loops: 5 7 2

    // Filtering using std::copy_if
    filteredData = df.filterWithCopyIf(dataStream);
    std::cout << "Filtered data by std::copy_if:";
    for (int item : filteredData) {
        std::cout << " " << item;
    }
    std::cout << std::endl;
    // Output: Filtered data by std::copy_if: 5 7 2

    // Filtering using a custom predicate (e.g., numbers greater than 10)
    filteredData = df.filterByPredicate(dataStream, [](int item) {
        return item > 10;
    });
    std::cout << "Filtered data by custom predicate (greater than 10):";
    for (int item : filteredData) {
        std::cout << " " << item;
    }
    std::cout << std::endl;
    // Output: Filtered data by custom predicate (greater than 10): 23 12 19

    return 0;
}

By adding new methods such as filterByPredicate, we've shown how to extend the DataFilter class to handle different types of filtering criteria. This enhances the usability and flexibility of the class, making it a more valuable tool for various data filtering scenarios.

Lesson Summary

Bravo! Today, we have ventured through the ins and outs of data filtering, spanning loops and the std::copy_if algorithm from the C++ Standard Template Library. Now gear up for some exciting practice sessions, the key to honing your new skills in C++. Happy coding!
