Lesson 1
Understanding Data Streams: Basics and Operations
Introduction: Understanding Data Streams

Warm greetings! This lesson introduces data streams, which are essentially continuous datasets. Think of a weather station or a gaming application recording data every second: both generate data streams. We will practice handling these streams in C++, learning to access elements, slice out segments, and convert streams into strings for easier inspection.

Representing Data Streams in C++

In C++, data streams can be represented using vectors and maps from the Standard Template Library (STL): each record is a std::map<std::string, int>, and the stream is a std::vector of those records. Let's consider a straightforward C++ class named DataStream that encapsulates the operations on data streams in our program:

C++
#include <iostream>
#include <vector>
#include <map>
#include <string>

class DataStream {
public:
    DataStream(const std::vector<std::map<std::string, int>>& data) : data(data) {}

private:
    std::vector<std::map<std::string, int>> data;
};

To use it, we create a sample data stream as an instance of our DataStream class, where each element is a map:

C++
std::vector<std::map<std::string, int>> data = {
    {{"id", 1}, {"value", 100}},
    {{"id", 2}, {"value", 200}},
    {{"id", 3}, {"value", 300}},
    {{"id", 4}, {"value", 400}},
};
DataStream stream(data);

Accessing Elements - Key Operation

To inspect individual elements of a data stream, we use indexing. The get() method introduced below takes a zero-based index i and returns the (i+1)-th element of the stream. It throws std::out_of_range if the index falls outside the stream:

C++
#include <stdexcept> // For std::out_of_range

class DataStream {
public:
    DataStream(const std::vector<std::map<std::string, int>>& data) : data(data) {}

    // Returns a copy of the element at index i; negative i counts from the end.
    std::map<std::string, int> get(int i) const {
        if (i < 0) {
            i += data.size();
        }
        // Cast the size so the comparison stays between signed integers.
        if (i >= 0 && i < static_cast<int>(data.size())) {
            return data[i];
        } else {
            throw std::out_of_range("Index out of range");
        }
    }

private:
    std::vector<std::map<std::string, int>> data;
};

Here, we can see the get() method in action:

C++
int main() {
    std::vector<std::map<std::string, int>> data = {
        {{"id", 1}, {"value", 100}},
        {{"id", 2}, {"value", 200}},
        {{"id", 3}, {"value", 300}},
        {{"id", 4}, {"value", 400}},
    };
    DataStream stream(data);

    try {
        std::map<std::string, int> elem = stream.get(2);
        std::cout << "id: " << elem["id"] << ", value: " << elem["value"] << std::endl;

        elem = stream.get(-1);
        std::cout << "id: " << elem["id"] << ", value: " << elem["value"] << std::endl;
    } catch (const std::out_of_range& e) {
        std::cerr << e.what() << std::endl;
    }
}

Here, stream.get(2) fetches {"id": 3, "value": 300}, the third element (since indexing starts from 0). Likewise, stream.get(-1) fetches the last element, which is {"id": 4, "value": 400}.

The code snippet if (i < 0) { i += data.size(); } is what makes the second call work. It supports negative indexing by adding the stream's size to a negative index, so -1 refers to the last element, -2 to the second-to-last, and so on. This is a convenient way to access elements relative to the end of the data stream.

Slicing - A Useful Technique

Slicing fetches a range of elements rather than a single one. We introduce a slice() method that returns the elements from index i up to, but not including, index j. Negative indices are handled the same way as in get():

C++
class DataStream {
public:
    DataStream(const std::vector<std::map<std::string, int>>& data) : data(data) {}

    std::map<std::string, int> get(int i) const {
        if (i < 0) {
            i += data.size();
        }
        if (i >= 0 && i < static_cast<int>(data.size())) {
            return data[i];
        } else {
            throw std::out_of_range("Index out of range");
        }
    }

    // Returns a copy of the elements in the half-open range [i, j).
    std::vector<std::map<std::string, int>> slice(int i, int j) const {
        if (i < 0) {
            i += data.size();
        }
        if (j < 0) {
            j += data.size();
        }
        // Cast the size so the comparison stays between signed integers.
        if (i >= 0 && j <= static_cast<int>(data.size()) && i < j) {
            return std::vector<std::map<std::string, int>>(data.begin() + i, data.begin() + j);
        } else {
            throw std::out_of_range("Slice indices out of range");
        }
    }

private:
    std::vector<std::map<std::string, int>> data;
};

Here's a quick usage example:

C++
int main() {
    std::vector<std::map<std::string, int>> data = {
        {{"id", 1}, {"value", 100}},
        {{"id", 2}, {"value", 200}},
        {{"id", 3}, {"value", 300}},
        {{"id", 4}, {"value", 400}},
    };
    DataStream stream(data);

    try {
        std::vector<std::map<std::string, int>> slice = stream.slice(1, 3);
        for (const auto& elem : slice) {
            std::cout << "id: " << elem.at("id") << ", value: " << elem.at("value") << std::endl;
        }
    } catch (const std::out_of_range& e) {
        std::cerr << e.what() << std::endl;
    }
}

This prints id: 2, value: 200 followed by id: 3, value: 300, the elements at indices 1 and 2. The element at index 3 is excluded.

Transforming Data Streams to Strings - Another Key Operation

For better readability, we may wish to convert our data streams into strings. To keep the output consistent, we define a custom string representation for our data elements. Have a look at the to_string() method:

C++
#include <iterator> // For std::next
#include <sstream> // For std::stringstream

class DataStream {
public:
    DataStream(const std::vector<std::map<std::string, int>>& data) : data(data) {}

    std::map<std::string, int> get(int i) const {
        if (i < 0) {
            i += data.size();
        }
        if (i >= 0 && i < static_cast<int>(data.size())) {
            return data[i];
        } else {
            throw std::out_of_range("Index out of range");
        }
    }

    std::vector<std::map<std::string, int>> slice(int i, int j) const {
        if (i < 0) {
            i += data.size();
        }
        if (j < 0) {
            j += data.size();
        }
        if (i >= 0 && j <= static_cast<int>(data.size()) && i < j) {
            return std::vector<std::map<std::string, int>>(data.begin() + i, data.begin() + j);
        } else {
            throw std::out_of_range("Slice indices out of range");
        }
    }

    // Formats the stream as [{"key": value, ...}, ...].
    std::string to_string() const {
        std::stringstream ss;
        ss << '[';
        for (size_t i = 0; i < data.size(); ++i) {
            ss << "{";
            for (auto it = data[i].begin(); it != data[i].end(); ++it) {
                ss << "\"" << it->first << "\": " << it->second;
                if (std::next(it) != data[i].end()) {
                    ss << ", "; // Separator between key-value pairs
                }
            }
            ss << "}";
            if (i + 1 < data.size()) {
                ss << ", "; // Separator between elements
            }
        }
        ss << ']';
        return ss.str();
    }

private:
    std::vector<std::map<std::string, int>> data;
};

To see it in action:

C++
int main() {
    std::vector<std::map<std::string, int>> data = {
        {{"id", 1}, {"value", 100}},
        {{"id", 2}, {"value", 200}},
        {{"id", 3}, {"value", 300}},
        {{"id", 4}, {"value", 400}},
    };
    DataStream stream(data);

    std::cout << stream.to_string() << std::endl;
}

It prints: [{"id": 1, "value": 100}, {"id": 2, "value": 200}, {"id": 3, "value": 300}, {"id": 4, "value": 400}].

Lesson Summary

In this lesson, we've explored data streams, seen how to represent and manipulate them using standard C++ containers, namely vectors and maps, and encapsulated the stream operations (element access, slicing, and string conversion) in a C++ class.

Now it's time to apply your newfound knowledge in the practice exercises that follow!
