Problem Solving: Advanced
Amazon's advertising auction system uses a greedy algorithm to allocate billions of ad impressions per day, and Amazon's warehouse robots use dynamic programming to find optimal pick paths through facilities with thousands of possible routes. These are not theoretical exercises - dynamic programming and greedy algorithms power some of the highest-value software systems ever built, processing decisions worth trillions of dollars annually. The gap between a developer who knows these patterns and one who does not is the gap between writing O(n) solutions and O(2ⁿ) solutions that time out on real data. This lesson closes that gap.
Two-Pointer Technique
Solve pair problems in one pass
Opposite Ends Pattern
Two Sum with Sorted Array
When an array is sorted, two pointers can find a pair that sums to a target in O(n) time instead of O(n²).
Same Direction Pattern
> A sorted list [1, 3, 5, 7, 9] needs to find a pair that sums to 8. Two pointers start at opposite ends. Pick how each pointer moves when the current sum is too small or too large.
nums = [1, 3, 5, 7, 9] target = 8 left, right = 0, len(nums) - 1 while left < right: s = nums[left] + nums[right] if s == target: print(f"Found: {nums[left]} + {nums[right]}") break elif s < target: else: else: print("No pair found")
The two-pointer technique achieves O(n) time by eliminating one element from the search space with every step. Either the left pointer advances or the right pointer retreats, so the total number of moves is at most n.
Hash Map O(1) Lookups
Replace nested loops with hash maps
Hash maps (Python dictionaries) provide O(1) average-case lookup. This single property is the key to optimizing countless algorithms.
Classic Two Sum Problem
Given an array and a target, find two numbers that sum to the target. The brute force O(n²) solution checks every pair. The hash map solution is O(n).
The key insight: instead of searching for each complement with complement in nums (O(n)), we store seen values in a dictionary for O(1) lookup.
Value-to-Index Mapping
- Search each time: O(n)
- m searches: O(m × n)
- 1000 searches in 1000 items: 1M operations
- Build map once: O(n)
- Each lookup: O(1)
- 1000 searches in 1000 items: 2000 operations
Choosing when to use a hash map versus a brute-force scan is one of the most impactful optimization decisions you can make. Build the lookup once and query many times, check for the complement before storing the current value, and use enumerate when you need both index and value.
> This two-sum function stores each number in the hash map before checking for its complement, so a number can match itself when its double equals the target.
Returns [0, 0] because 3 matches itself (3 + 3 = 6) since it was added before checking
Hash map lookups run in O(1) average time. This single property lets you replace an O(n²) double loop with an O(n) single loop: process each element once, look up its complement in constant time.
dict and look up what I need instead? Hash maps are the most common way to turn O(n²) into O(n).Frequency Counting
Count and rank items in one pass
Building Frequency Maps
Find Most Common Element
Check Anagrams
- Finding the most or least common element in a sequence
- Checking if two collections are anagrams or permutations of each other
- Detecting duplicates without sorting the data first
- Grouping items by category and counting each group in a single pass
collections.Counter class does frequency counting automatically. In interviews, show you understand the concept first, then mention Counter as the Pythonic way.> The word "banana" needs to be analyzed to find which character appears most often. Pick the right dictionary method and aggregation function to build and query a frequency map.
word = "banana" freq = {} for ch in word: freq[ch] = freq.(ch, 0) + 1 most = (freq, key=freq.get) print(f"{most}: {freq[most]}")
Python's collections.Counter builds frequency maps in a single call and adds useful methods like most_common(). Use it when writing production code, but understand the manual version for interviews and learning.
Prefix Sum Concept
Answer range sum queries instantly
A prefix sum array stores running totals. Once built, you can calculate the sum of ANY subarray in O(1) time instead of O(n).
Building Prefix Sums
The prefix array has one more element than the original. Each prefix[i] is the sum of all elements before index i.
Range Sum Queries
To get the sum from index i to j (inclusive), compute prefix[j+1] - prefix[i].
> This prefix sum range query uses prefix[j] - prefix[i], but the formula is off by one. It excludes the element at index j from the sum.
Returns 5 instead of 6 for sum of indices 1 to 3 (1 + 4 + 1 = 6)
The prefix array is one index longer than the original so that prefix[0] = 0 acts as a base. This design makes the range sum formula prefix[j+1] - prefix[i] work cleanly for all subarrays, including those starting at index 0.
Prefix sums excel when many range queries run on static data. Build the prefix array once in O(n), then answer each query in O(1). For m queries that is O(n + m) total, far better than O(n × m) with brute force.
Sort as Preprocessing
Use sorting to unlock fast algorithms
Sorting is O(n log n), but it unlocks many efficient algorithms. If your problem allows reordering, sorting is often the key to an elegant solution.
Find Duplicates
Merge Intervals
K Closest Points
Find the k smallest or largest elements by sorting, then taking a slice.
> Given the list [10, 3, 7, 4, 20], you need to check whether any two elements are within 1 of each other. Pick a preprocessing step and a comparison operator to solve it efficiently.
def has_close_pair(nums): for i in range(len(nums) - 1): if abs(nums[i] - nums[i + 1]) 1: return True return False print(has_close_pair([10, 3, 7, 4, 20]))
Sorting as preprocessing is a powerful general strategy. The O(n log n) cost of sorting is often dominated by the O(n) or O(n²) work you save afterward. When an unsorted problem seems to require checking all pairs, sort first and look only at adjacent elements.
- Two pointers: sorted array, use opposing or same-direction pointers for O(n)
- Hash map: store seen values for O(1) complement or membership lookup
- Frequency count:
dict.get(k, 0) + 1to tally any sequence of items - Prefix sum: precompute running totals for O(1) range sum queries
- Sort first: bring related elements adjacent before a single linear scan
You are the lead engineer at a health-tech startup that aggregates patient records from 30 clinics. Each clinic submits CSV files nightly, but naming conventions differ: "Robert Smith" vs "Rob Smith" vs "R. Smith" at different addresses. The analytics team reports that 18% of patient IDs appear to be duplicates, inflating treatment counts and skewing billing. The CEO wants a deduplication system ready within two months.
| record_id | name | dob | address | clinic_id |
|---|---|---|---|---|
| R001 | Robert Smith | 1985-03-15 | 123 Oak St | C01 |
| R002 | Rob Smith | 1985-03-15 | 123 Oak Street | C02 |
| R003 | R. Smith | 1985-03-15 | 123 Oak St Apt 2 | C03 |
With 200,000 records from 30 clinics, how should the system identify which records refer to the same patient?
The deduplication scenario above illustrates a recurring theme: real systems combine multiple algorithmic techniques. Hash-based grouping (O(n)) reduces the comparison space before fuzzy matching, which is the same idea behind choosing the right data structure first.
> You are a senior data engineer at Databricks optimizing a Python pipeline that deduplicates billions of user events per day. Applying the right algorithmic patterns reduces the nightly job from 6 hours to 40 minutes without increasing cluster cost.
user_id and timestamp once at O(n log n) so all subsequent grouping passes run in linear time.Algorithms that unlock efficiency
- Category
- Python
- Difficulty
- advanced
- Duration
- 22 minutes
- Challenges
- 0 hands-on challenges
Topics covered: Two-Pointer Technique, Hash Map O(1) Lookups, Frequency Counting, Prefix Sum Concept, Sort as Preprocessing
Lesson Sections
- Two-Pointer Technique
The two-pointer technique uses two indices that move through an array in a coordinated way. It's powerful for problems involving pairs, palindromes, or partitioning. Opposite Ends Pattern Start one pointer at the beginning and one at the end. Move them toward each other based on some condition. Two Sum with Sorted Array Same Direction Pattern Both pointers start at the beginning. A "fast" pointer explores while a "slow" pointer marks a position. This pattern applies whenever sorted data lets you
- Hash Map O(1) Lookups
Classic Two Sum Problem Value-to-Index Mapping Store values as keys and their indices as values. This pattern solves many "find where" problems. The ordering of operations inside a loop matters enormously. Checking before storing ensures you only match pairs of distinct elements, not a single element matched against itself.
- Frequency Counting
Many problems become trivial once you count occurrences. Frequency maps answer questions like "how many of each?" and "what appears most/least?" Building Frequency Maps The basic pattern for counting involves iterating through items and incrementing a counter for each unique value. Find Most Common Element Once you have frequencies, finding the most common element is straightforward. Check Anagrams Two strings are anagrams if they have exactly the same character frequencies. Frequency maps unloc
- Prefix Sum Concept (concepts: pyPrefixSum)
Building Prefix Sums Range Sum Queries
- Sort as Preprocessing
Find Duplicates In a sorted array, duplicates are always adjacent. One pass finds them all. Merge Intervals A classic problem: merge overlapping intervals. Sorting by start time makes overlaps easy to detect. K Closest Points The algorithms covered in this lesson all follow the same principle: choose a data structure or preprocessing step that reduces the number of comparisons you need to make at each stage. Advanced problem-solving strategies prepare you for the kind of complex, multi-step chal