Lesson 4
Using Heaps in C++ to Calculate Prefix Medians
Introduction

Hello there, budding programmer! I hope you're ready because today, we're going to dive deep into high-level data manipulation and increase our understanding of heaps. Heaps are fundamental data structures commonly used in algorithms. We're going to leverage their potential today in an interesting algorithmic problem. Are you ready for the challenge? Let's get started!

Task Statement

We have a task at hand related to array manipulation and the use of heaps. The task is as follows: Given a vector of unique integers with elements ranging from 1 to 10^6 and a length between 1 and 1000, we need to create a C++ function prefixMedian(). This function will take the vector as input and return a corresponding vector, which consists of the medians of all the prefixes of the input vector.

Remember that a prefix of a vector is a contiguous subsequence that starts from the first element. The median of a sequence of numbers is the middle number when the sequence is sorted. If the length of the sequence is even, we take the median to be the element at 0-based position length / 2 - 1 of the sorted sequence, that is, the lower of the two middle elements.

For example, consider an input vector {1, 9, 2, 8, 3}. The output of your function should be {1, 1, 2, 2, 3}.
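
To make the definition concrete, here is a minimal reference sketch that sorts every prefix from scratch and picks the median by index. It is not the heap-based solution this lesson builds, and the name prefixMedianBruteForce is an assumption introduced only for this illustration. It is handy for checking the faster version on small inputs.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reference sketch only: re-sorts each prefix, so it is much slower than
// the two-heap approach. prefixMedianBruteForce is a hypothetical name.
std::vector<int> prefixMedianBruteForce(const std::vector<int>& arr) {
    std::vector<int> medians;
    std::vector<int> prefix;
    for (int num : arr) {
        prefix.push_back(num);

        // Sort a copy of the current prefix.
        std::vector<int> sorted = prefix;
        std::sort(sorted.begin(), sorted.end());

        // Odd length n: middle index is n / 2.
        // Even length n: the lesson uses index n / 2 - 1.
        // With integer division, both cases equal (n - 1) / 2.
        std::size_t n = sorted.size();
        medians.push_back(sorted[(n - 1) / 2]);
    }
    return medians;
}
```

Calling prefixMedianBruteForce on {1, 9, 2, 8, 3} gives {1, 1, 2, 2, 3}, matching the example above.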

Heap and Its Operations

A heap is a tree-based data structure that always keeps its smallest (or largest) element at the root, which makes it efficient to organize and retrieve data by value.

In our context, we use two specific types of heaps: a Min Heap and a Max Heap. The Min Heap is used to store the larger half of the numbers seen so far, while the Max Heap stores the smaller half. In C++, the standard library provides heaps through the std::priority_queue container adaptor, declared in the <queue> header.

For our task, we'll be using these principal operations:

  • Adding Elements: You can add a new element to a heap using the push() method. By default, C++'s priority_queue is a Max Heap. To create a Min Heap, you supply std::greater<int> (from the <functional> header) as the comparator. This ensures that the smallest element is always at the top.

    C++
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
  • Removing Elements: The pop() method removes the top element from the heap, which is the smallest element in the Min Heap or the largest element in the Max Heap.

  • Accessing the Top Element: The top() method is used to inspect the root element of the heap without removing it. For the Min Heap with a std::greater<int> comparator, top() will return the smallest element; for the default Max Heap, it returns the largest.

These operations ensure that the root element can always be read in constant time, O(1), while adding an element with push() or removing the root with pop() maintains the heap structure in logarithmic time, O(log n).
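
The three operations above can be sketched as follows. This is a small illustration, not part of the lesson's solution; the helper names drain, maxHeapOrder, and minHeapOrder are assumptions made up for this example.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: repeatedly reads and removes the root of a heap,
// returning the elements in the order they were popped.
template <typename Heap>
std::vector<int> drain(Heap heap) {
    std::vector<int> order;
    while (!heap.empty()) {
        order.push_back(heap.top());  // inspect the root, O(1)
        heap.pop();                   // remove the root, O(log n)
    }
    return order;
}

// Pushes the values into a default priority_queue (a Max Heap) and drains it.
std::vector<int> maxHeapOrder(const std::vector<int>& values) {
    std::priority_queue<int> maxHeap;
    for (int v : values) {
        maxHeap.push(v);  // insert, O(log n)
    }
    return drain(maxHeap);
}

// Same idea with std::greater<int>, which turns the container into a Min Heap.
std::vector<int> minHeapOrder(const std::vector<int>& values) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    for (int v : values) {
        minHeap.push(v);
    }
    return drain(minHeap);
}
```

For the values {5, 1, 4}, the Max Heap yields 5, 4, 1 and the Min Heap yields 1, 4, 5, because each pop() removes whatever top() currently reports.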

Solution Building: Step 1

Alright, let's break our approach down into manageable steps. To begin with, we're going to need two heaps: a Min Heap to store the larger half of the numbers seen so far and a Max Heap to store the smaller half. We'll also need a vector to store the median for each prefix. Now, let's initialize these.

C++
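#include <functional>
#include <queue>
#include <vector>

std::vector<int> prefixMedian(const std::vector<int>& arr) {
    // Min Heap: holds the larger half of the numbers seen so far.
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    // Max Heap: holds the smaller half of the numbers seen so far.
    std::priority_queue<int> maxHeap;
    // One median per prefix of arr.
    std::vector<int> medians;

    return medians;  // still empty at this stage
}
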
Solution Building: Step 2

As the next step, we will sequentially take each number from the vector and, depending on its value, push it into the minHeap or the maxHeap. If it is smaller than the maximum of the lower half, it will go into the maxHeap. Otherwise, including the case where the maxHeap is still empty, it will go into the minHeap.

C++
#include <functional>
#include <queue>
#include <vector>

std::vector<int> prefixMedian(const std::vector<int>& arr) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    std::priority_queue<int> maxHeap;
    std::vector<int> medians;

    for (int num : arr) {
        // Numbers below the largest element of the lower half join the lower half.
        if (!maxHeap.empty() && num < maxHeap.top()) {
            maxHeap.push(num);
        } else {
            minHeap.push(num);
        }
    }

    return medians;  // still empty at this stage
}

Solution Building: Step 3

Next, we need to balance the two heaps to ensure that the difference between their sizes is never more than one. This way, we can always have quick access to the median. If the maxHeap size becomes larger than the minHeap, we pop the maxHeap's top element and push it to the minHeap. If the minHeap becomes more than one element larger than the maxHeap, we do the reverse.

C++
#include <functional>
#include <queue>
#include <vector>

std::vector<int> prefixMedian(const std::vector<int>& arr) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    std::priority_queue<int> maxHeap;
    std::vector<int> medians;

    for (int num : arr) {
        if (!maxHeap.empty() && num < maxHeap.top()) {
            maxHeap.push(num);
        } else {
            minHeap.push(num);
        }

        // Rebalance: minHeap may hold at most one element more than maxHeap.
        if (maxHeap.size() > minHeap.size()) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size() + 1) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    return medians;  // still empty at this stage
}

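
To see the balancing rule at work, the following sketch records the size of each heap after every insertion. The function heapSizesPerStep is a hypothetical helper written for this illustration; it repeats the insertion and rebalancing logic described above and only adds the bookkeeping.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: returns (maxHeap size, minHeap size) after each step.
std::vector<std::pair<std::size_t, std::size_t>> heapSizesPerStep(
        const std::vector<int>& arr) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    std::priority_queue<int> maxHeap;
    std::vector<std::pair<std::size_t, std::size_t>> sizes;

    for (int num : arr) {
        // Insertion rule from Step 2.
        if (!maxHeap.empty() && num < maxHeap.top()) {
            maxHeap.push(num);
        } else {
            minHeap.push(num);
        }

        // Balancing rule from Step 3.
        if (maxHeap.size() > minHeap.size()) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size() + 1) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }

        sizes.emplace_back(maxHeap.size(), minHeap.size());
    }
    return sizes;
}
```

For the input {1, 9, 2, 8, 3}, the recorded sizes are (0, 1), (1, 1), (1, 2), (2, 2), (2, 3): the minHeap is always the same size as the maxHeap or exactly one element larger.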
Solution Building: Step 4

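Having balanced the heaps, we can read the median directly. When the two heaps hold the same number of elements, the prefix length is even and the median is the top of the maxHeap (the lower of the two middle elements). Otherwise, the minHeap holds one extra element, and its top is the median. We append that value to our vector of medians.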

C++
#include <functional>
#include <queue>
#include <vector>

std::vector<int> prefixMedian(const std::vector<int>& arr) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    std::priority_queue<int> maxHeap;
    std::vector<int> medians;

    for (int num : arr) {
        if (!maxHeap.empty() && num < maxHeap.top()) {
            maxHeap.push(num);
        } else {
            minHeap.push(num);
        }

        if (maxHeap.size() > minHeap.size()) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size() + 1) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }

        // Even prefix length: lower middle element. Odd: the single middle element.
        if (minHeap.size() == maxHeap.size()) {
            medians.push_back(maxHeap.top());
        } else {
            medians.push_back(minHeap.top());
        }
    }

    return medians;
}

Lesson Summary

Congratulations! You've successfully tackled an interesting algorithmic problem that required the use of heaps for vector manipulation in C++. The solution you've created keeps two balanced heaps so that the median of every prefix is always available at the top of one of them. Each new number costs only O(log n) heap work, which avoids re-sorting every prefix.

In the next session, you'll be given more similar problems to solve. This will encourage you to use heaps and array manipulations fluently, helping you consolidate your understanding of today's lesson. Keep practicing, and remember – practice makes perfect. Happy coding!
