Today, we're delving into the important topic of advanced interview problems revolving around Binary Search
. You're likely familiar with the concept of Binary Search – it's an efficient algorithm for finding a specific target in a sorted list by repetitively dividing the search interval in half. Today, we are going to reinforce our understanding by tackling complex data science interview problems using Binary Search.
Imagine a sorted array of integers that has been rotated at an unknown pivot point. This list maintains its sorted order but now starts from a random position. Your task is to find a specific target value within this array and return its index. If the target isn't present, return -1
.
For example, the initial sorted array could be [1, 2, 4, 5, 8, 9, 11, 15]
, but after a single rotation it'd become [8, 9, 11, 15, 1, 2, 4, 5]
.
Example Application: Picture a server system where processes are listed in ascending order based on their IDs. Suppose a disruption rotates this list. Now, the system needs to find a process using a specific ID. A standard binary search isn't sufficient as the list, though sorted, starts at an arbitrary point.
Naive Approach: A straightforward solution involves scanning each element in the array until we find a match or exhaust the list. This linear search approach is simple but computationally expensive for large lists - its time complexity is .
Instead of linear search, binary search can provide a faster solution with a logarithmic time complexity of . This approach narrows down the search space by half at each step. The challenge in this case, caused by the array rotation, is determining which half of the list to contract at each step.
Defining the search area borders with left
and right
pointers (both inclusive, i.e. the [left, right]
interval), we calculate and examine the midpoint. If the midpoint equals our target - success! However, if not, we have four scenarios for updating left
and right
pointers to narrow down the search space:
left
to mid - 1
).
nums[left] <= nums[mid]
and nums[left] <= target < nums[mid]
.mid + 1
to right
).
nums[mid] <= nums[right]
and nums[mid] < target <= nums[right]
.nums[mid] > nums[right]
(the target can't fall into the first half anymore, as this scenario is covered in case 3).Let's translate this into Python:
Python1def search_rotated(nums, target): 2 left, right = 0, len(nums) - 1 3 while left <= right: 4 mid = (left + right) // 2 5 if nums[mid] == target: 6 return mid 7 if nums[left] <= nums[mid] and nums[left] <= target < nums[mid]: 8 right = mid - 1 9 elif nums[mid] <= nums[right] and nums[mid] < target <= nums[right]: 10 left = mid + 1 11 elif nums[mid] > nums[right]: 12 left = mid + 1 13 else: 14 right = mid - 1 15 return -1
This function will successfully handle all the scenarios and provide an efficient way to perform binary search in a rotated sorted array.
In this problem, you are tasked with finding both the first and last positions of a certain target value in a sorted array. If the target is not found within the array, your function should return [-1, -1]
.
Example Application: To make this problem more relatable, picture a situation involving time-series analysis. For instance, you have a sorted array filled with timestamps of user activities. A user could perform the same activity multiple times, and your task is to determine the first and last instance that a particular activity was performed.
Simplistic Approach: An immediate solution could involve scanning the entire array while taking note of the first and last appearances of the target. Although this method is sound and would yield the correct result, it is far from efficient. This linear search approach could result in a worst-case time complexity of .
A more efficient technique utilizes Binary Search to find both ends of the target presence. It essentially involves two separate runs of Binary Search — the first one locates the initial occurrence of the target, and the second finds the final occurrence. Each of these binary searches operates in time, delivering a significantly more efficient solution.
To instantiate this solution, we create a helper function that carries out a specific type of binary search. When searching for the first instance, if our midpoint is the same as or higher than our target, we focus on the array's left half. Contrastingly, for the last instance, if the midpoint is lower than our target, our attention turns to the right half.
Here's how to accomplish this in Python:
Python1def get_first_last_pos(nums, target): 2 def binary_search(left, right, find_first): 3 if left <= right: 4 mid = (left + right) // 2 5 if nums[mid] > target or (find_first and target == nums[mid]): 6 return binary_search(left, mid - 1, find_first) 7 else: 8 return binary_search(mid + 1, right, find_first) 9 return left 10 11 first = binary_search(0, len(nums) - 1, True) 12 last = binary_search(0, len(nums) - 1, False) - 1 13 if first <= last: 14 return [first, last] 15 else: 16 return [-1, -1]
This function optimally utilizes binary search to locate the first and last appearance of a value in a sorted array.
Note: The implementation could be a little bit easier if we make it in two steps:
left
such that nums[left] < target
- that would be the index before the first occurrence of target
.right
such that nums[left] <= target
- that would be the last occurrence of target
.The leftmost occurrence of target
in that case will be left + 1
, and the rightmost - right
. Can you manage to implement this alternative approach yourself?
Our task is to find or determine the index where a target should be inserted in a sorted integer list. Visualize a librarian's task of adding a new book to a properly arranged shelf. We'll do a similar thing but with numbers in a sorted list.
Example Application: To give this problem real-world relevance, picture a document management system where reports are sorted based on their IDs. Suppose a new report comes in, and it has to be placed in the correct position based on its ID. Here, our task mirrors the system's behavior - placing a number correctly in a sorted list.
Simplistic Approach: A basic solution might involve a left-to-right scan of the array, comparing each element with the target until we encounter an element that matches the target (where we return the index), or one that's larger (where we return the current index as that's the insertion point of our target). This approach, however, is inefficient as it demands a full scan of the array in worst-case scenarios (when the target is larger than every existing item). This results in a linear time complexity of , undesirable for gargantuan arrays.
The optimal solution to this problem lies in a modified binary search, which is beneficial due to the sorted nature of our array. Binary search operates by continuously dividing our search space until the target is located or the optimal insertion point is determined.
We use the interval [left, right)
with right
being excluded. Within this interval, we repeatedly calculate the mid (mid
) index. If the target is equal to the mid element, we have our position, and we return mid
. If the target is larger, it is destined for the right half, so we update left
to mid
, switching to the interval [mid, right)
. Conversely, if the target is smaller, it is destined for the left half, so we update right
to mid
(switching to the range [left, mid)
).
The process repeats as long as right - left
is greater than 1, indicating the position where our target should be inserted.
Here's a Python implementation using the interval [left, right)
:
Python1def search_insert(nums, target): 2 nums.append(float('inf')) # append an infinite element to handle edge case 3 left, right = 0, len(nums) 4 while right - left > 1: 5 mid = (left + right) // 2 6 if nums[mid] <= target: 7 left = mid 8 else: 9 right = mid 10 return left
This modified binary search function returns our desired result in a fast and efficient manner. It determines the optimal position for a value in a sorted array, whether or not the value is present in the array.
We've covered two intricate applications of Binary Search, learning how to tailor Binary Search to address complex interview problems. By utilizing an efficient searching algorithm, we've reduced the time complexity from to , making our code faster and more efficient. That's all for now, let's move on to practice!