Binary Search
Popcorn Hack
Popcorn Hack 1 (from CB MC 2020)
The procedure BinarySearch(numList, target) correctly implements a binary search algorithm on the list of numbers numList. The procedure returns an index here target occurs in numList, or -1 if target does not occur in numList.
Which of the following conditions must be met in order for the procedure to work as intended? Explain why.
a) The length of numList must be even
b) The list numList must not contain any duplicate values
c) The values in numList must be in sorted order
d) The value of target must not be equal to -1
Option C must be met because binary search only works on a list that is sorted in order. The algorithm relies on comparing the target value to the middle element and deciding whether to search the left or right half. If the list is not sorted, the search will not work correctly.
Popcorn Hack 2
Which of the following statements correctly describes a disadvantage of binary search compared to linear search? Explain why your answer is correct and why the others are wrong.
a) Binary search takes more time on average than linear search
b) Binary search cannot be used on unsorted lists without modifications
c) Binary search always returns the first occurrence of the target
d) Binary search can only be used on lists with unique values
Option B is correct because the list has to be sorted in order to function while a linear seach doesn’t need sorting. Option A is wrong because a binary search tends to be faster than linear. Option C is wrong because the search starts in the middle of the array. Option D is wrong because it would not matter if the array has non-unique values in a binary search.
Popcorn Hack 3
Given the sorted list:
['a', 'b', 'c', 'd', 'e', 'f', 'g']
Code a binary search algorithm in your notebook that returns the index when given an element of the array (eg. an input of ‘c’ should return 2).
import math
list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
def binary_search(list, value):
start = 0
end = len(list)-1
def search(start, end):
index = math.floor((end+start)/2)
if list[index] == value:
return index
if list[index] > value:
return search(start, index-1)
else:
return search(index+1, end)
return search(start, end)
found_index = binary_search(list, "c")
print(found_index)
2
Homework Hack
Goal:
Use Pandas to load and sort product prices, then write a binary search function to find specific price values. —
Instructions
Load the dataset using Pandas.
Drop any rows with missing data.
Sort the data by the Price column.
Extract the sorted Price column as a list.
Implement a binary search function that searches for a price in the list.
Use your function to search for these 3 specific prices:
1.25
6.49
10.00
Print a message that clearly shows if each price was found or not found.
Write a short explanation on how your code works.
import pandas as pd
import math
# Load the dataset
data = pd.read_csv("school_supplies.csv")
# Drop rows with missing values
data_cleaned = data.dropna()
# Sort the data by 'Price'
data_sorted = data_cleaned.sort_values(by="Price")
# Extract sorted prices as a list
price_list = data_sorted["Price"].tolist()
# Preview the sorted data
print("First few rows of sorted data:")
print(data_sorted.head())
print("Original row count:", len(data))
print("Cleaned row count:", len(data_cleaned))
def binary_search(list, value):
start = 0
end = len(list)-1
def search(start, end):
index = math.floor((end+start)/2)
if list[index] == value:
return index
if list[index] > value:
return search(start, index-1)
if index == end and list[index] != value:
return -1
else:
return search(index+1, end)
return search(start, end)
prices_to_find = [1.25, 6.49, 10.00]
for price in prices_to_find:
price_index = binary_search(price_list, price)
if price_index == -1:
print("Price not found: $" + str(price))
else:
print("Price $" + str(price) + " found at index:", price_index)
First few rows of sorted data:
Product Price
5 Eraser 0.50
14 Paper Clips 0.89
2 Pencil 0.99
9 Glue Stick 1.25
1 Pen 1.50
Original row count: 15
Cleaned row count: 15
Price $1.25 found at index: 3
Price $6.49 found at index: 12
Price not found: $10.0