Welcome back! Now that you have a good grasp of thread lifecycles and basic operations, let’s move forward to a critical and exciting part of concurrent programming: data sharing between threads. In this lesson, we will explore how threads can share data using primitive approaches and understand the importance of synchronizing this access.
Data sharing between threads, while powerful, can lead to unpredictable behavior if not managed correctly. We will cover:
- Shared Variables and Risks of Unsynchronized Access:
  - Learn how threads can share data through shared variables.
  - Understand the risks of unsynchronized access, such as race conditions.
- Introduction to Synchronization Primitives:
  - Explore the basic synchronization primitives like `std::mutex` and `std::lock_guard`.
  - Understand how these tools prevent race conditions by ensuring that only one thread can access the shared resource at a time.
- Code Example: Observing Race Conditions and Fixing Them:
  - We'll demonstrate a race condition and then fix it using `std::mutex` and `std::lock_guard`.
Let's start with an example that demonstrates the risks of unsynchronized access to shared variables:
```cpp
#include <iostream>
#include <thread>

int counter = 0;

void increment() {
    for (int i = 0; i < 10000; ++i) {
        counter++;
    }
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    std::cout << "Final counter (without synchronization): " << counter << std::endl;
    return 0;
}
```
This code might produce different results each time it's run due to race conditions, because both threads access the shared variable `counter` without any synchronization. Here is a quick scenario that explains the issue:

- Thread 1 reads the value of `counter` (let's say it's 15).
- Thread 2 reads the value of `counter` (also 15).
- Thread 1 increments `counter` by 1 and writes the new value (16).
- Thread 2 increments its copy by 1 and also writes 16 (instead of 17), since it read the value before Thread 1 updated it.
- The final value of `counter` is 16, instead of the expected 17.
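To see why this interleaving is possible, remember that `counter++` is not a single indivisible operation: it involves a read, an addition, and a write. The sketch below is a simplified illustration (not what the compiler literally emits) that makes those three steps explicit; a thread can be preempted between any of them.

```cpp
// Conceptual expansion of counter++ (illustrative only; real compiler output differs).
int counter = 0;

void increment() {
    for (int i = 0; i < 10000; ++i) {
        int temp = counter;  // 1. read the shared value into a local copy
        temp = temp + 1;     // 2. increment the local copy
        counter = temp;      // 3. write the local copy back to the shared variable
        // A context switch between any of these steps lets another thread read
        // the same stale value, which is exactly the scenario described above.
    }
}
```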
Now that you understand the risks of unsynchronized access, let's explore how to prevent such issues using synchronization primitives.
Let's start with the most basic synchronization primitive: `std::mutex`. A mutex is a lock that allows only one thread to access a shared resource at a time. Here's how you can use it to fix problems like the one we just discussed:
```cpp
#include <iostream>
#include <thread>
#include <mutex>

class SynchronizedCounter {
public:
    void increment() {
        // Acquire the lock; it is released when the function ends.
        // While the lock is held, no other thread can access the shared resource.
        std::lock_guard<std::mutex> lock(mutex_);
        count_++;
    }

    int getCount() const {
        // Acquire the lock; it is released when the function ends.
        std::lock_guard<std::mutex> lock(mutex_);
        return count_;
    }

private:
    mutable std::mutex mutex_;
    int count_ = 0;
};

int main() {
    SynchronizedCounter counter;
    std::thread t1([&counter]() { for (int i = 0; i < 10000; ++i) counter.increment(); });
    std::thread t2([&counter]() { for (int i = 0; i < 10000; ++i) counter.increment(); });
    t1.join();
    t2.join();
    std::cout << "Final count with synchronization: " << counter.getCount() << std::endl;
    return 0;
}
```
Let’s break down the code:
- We introduced a `SynchronizedCounter` class that contains a private `std::mutex` member variable, which is the first difference from the previous example.
- We added a `lock_guard` object in the `increment` and `getCount` methods. This ensures that only one thread can access the shared resource at a time: when a thread acquires the lock, no other thread can access the shared resource until the lock is released.
- We created two threads that increment the counter 10,000 times each. Since the `increment` method is synchronized, the final count will be 20,000 as expected, no matter how many times you run the program.
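One detail worth keeping in mind is that a `lock_guard` holds the mutex for its entire scope. If a method also does work that does not touch shared state, you can keep the critical section short by introducing an inner scope. The sketch below is a hypothetical variation on `SynchronizedCounter::increment` (the `computeDelta` helper is made up for illustration), not part of the lesson's class:

```cpp
// Hypothetical extra method on SynchronizedCounter, shown only to illustrate
// that the lock_guard's enclosing scope defines the critical section.
void incrementWithExtraWork() {
    // Work that does not touch shared state can run without holding the lock.
    int delta = computeDelta();  // hypothetical helper, assumed thread-safe

    {
        // The mutex is held only inside this inner scope, keeping the
        // critical section as short as possible.
        std::lock_guard<std::mutex> lock(mutex_);
        count_ += delta;
    }  // lock released here

    // More unrelated work can continue without blocking other threads.
}
```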
Let's understand how `std::mutex` and `std::lock_guard` work:

- `std::mutex` is a synchronization primitive that provides exclusive access to shared resources. Under the hood, it uses the operating system's native locking mechanism to ensure that only one thread can access the shared resource at a time.
- `std::lock_guard` is a lock wrapper that provides a convenient way to acquire and release a lock. It automatically releases the lock when the `lock_guard` object goes out of scope, ensuring that the lock is released even if an exception is thrown. Under the hood, `std::lock_guard` calls `mutex.lock()` when it's constructed and `mutex.unlock()` when it's destructed.
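To make that exception-safety point concrete, here is a minimal sketch contrasting manual locking with `std::lock_guard`. The function names and the throwing helper are made up for illustration; the point is that the manual version never reaches `unlock()` if an exception is thrown, while the `lock_guard` version releases the mutex no matter how the scope is exited.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex m;
int shared_value = 0;

// Illustrative helper that may throw while the mutex is held.
void updateThatMayThrow() {
    if (shared_value > 100) throw std::runtime_error("limit reached");
    ++shared_value;
}

// Manual locking: if updateThatMayThrow() throws, unlock() is skipped and
// the mutex stays locked, blocking every other thread from then on.
void riskyUpdate() {
    m.lock();
    updateThatMayThrow();
    m.unlock();
}

// RAII locking: the lock_guard's destructor runs during stack unwinding,
// so the mutex is released even if the exception propagates out.
void safeUpdate() {
    std::lock_guard<std::mutex> lock(m);
    updateThatMayThrow();
}

int main() {
    safeUpdate();
    return 0;
}
```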
Understanding how to manage data sharing between threads is paramount for writing reliable and efficient concurrent programs. Here’s why:
- Avoiding Race Conditions: Race conditions can lead to unpredictable and erroneous behavior in your application. Synchronization helps to prevent such issues.
- Data Integrity: By ensuring that shared data is accessed in a controlled manner, you can maintain the integrity of your program’s state.
- Enhancing Robustness: Synchronization primitives make your concurrent code more robust and easier to debug, as they eliminate many common concurrency-related bugs.
Excited to dive deeper into this crucial aspect of concurrency? Let’s move on to the practice section and solidify your understanding through hands-on coding.