Problem Solving: Advanced

Amazon's advertising auction system uses a greedy algorithm to allocate billions of ad impressions per day, and Amazon's warehouse robots use dynamic programming to find optimal pick paths through facilities with thousands of possible routes. These are not theoretical exercises - dynamic programming and greedy algorithms power some of the highest-value software systems ever built, processing decisions worth trillions of dollars annually. The gap between a developer who knows these patterns and one who does not is the gap between writing O(n) solutions and O(2ⁿ) solutions that time out on real data. This lesson closes that gap.

Two-Pointer Technique

Daily Life
Interviews

Solve pair problems in one pass

The two-pointer technique uses two indices that move through an array in a coordinated way. It's powerful for problems involving pairs, palindromes, or partitioning.

Opposite Ends Pattern

Start one pointer at the beginning and one at the end. Move them toward each other based on some condition.
1def is_palindrome(s):
2 left = 0
3 right = len(s) - 1
4
5 while left < right:
6 if s[left] != s[right]:
7 return False
8 left += 1
9 right -= 1
10
11 return True
12
13print(is_palindrome("racecar"))
14print(is_palindrome("hello"))
>>>Output
True
False

Two Sum with Sorted Array

When an array is sorted, two pointers can find a pair that sums to a target in O(n) time instead of O(n²).

1def two_sum_sorted(numbers, target):
2 left = 0
3 right = len(numbers) - 1
4
5 while left < right:
6 current_sum = numbers[left] + numbers[right]
7
8 if current_sum == target:
9 return [left, right]
10 elif current_sum < target:
11 # Need bigger sum
12 left += 1
13 else:
14 # Need smaller sum
15 right -= 1
16
17 return None
18
19nums = [1, 3, 5, 7, 9, 11]
20print(two_sum_sorted(nums, 12))
>>>Output
[2, 3]
01
Sum too small?
Left pointer moves right toward bigger values
02
Sum too big?
Right pointer moves left toward smaller values
03
Each step counts
Every move eliminates one possibility from the search
04
O(n) total
At most n steps since pointers only move inward

Same Direction Pattern

Both pointers start at the beginning. A "fast" pointer explores while a "slow" pointer marks a position.
1def remove_duplicates_sorted(nums):
2 """Remove duplicates in-place."""
3 if not nums:
4 return 0
5
6 slow = 0
7
8 for fast in range(1, len(nums)):
9 if nums[fast] != nums[slow]:
10 slow += 1
11 nums[slow] = nums[fast]
12
13 return slow + 1
14
15nums = [1, 1, 2, 2, 2, 3, 4, 4]
16length = remove_duplicates_sorted(nums)
17print(f"Unique elements: {nums[:length]}")
>>>Output
Unique elements: [1, 2, 3, 4]
Fill in the Blank

> A sorted list [1, 3, 5, 7, 9] needs to find a pair that sums to 8. Two pointers start at opposite ends. Pick how each pointer moves when the current sum is too small or too large.

nums = [1, 3, 5, 7, 9]
target = 8
left, right = 0, len(nums) - 1
while left < right:
    s = nums[left] + nums[right]
    if s == target:
        print(f"Found: {nums[left]} + {nums[right]}")
        break
    elif s < target:
        
    else:
        
else:
    print("No pair found")

The two-pointer technique achieves O(n) time by eliminating one element from the search space with every step. Either the left pointer advances or the right pointer retreats, so the total number of moves is at most n.

This pattern applies whenever sorted data lets you make a decision that rules out one side. Palindrome checking, three-sum problems, and container-with-most-water all follow the same logic.
TIP
Two-pointer problems are almost always on sorted input. If the problem does not mention sorting, check whether sorting first (O(n log n)) makes a two-pointer pass possible at O(n), giving a net O(n log n) solution instead of O(n²).

Hash Map O(1) Lookups

Daily Life
Interviews

Replace nested loops with hash maps

Hash maps (Python dictionaries) provide O(1) average-case lookup. This single property is the key to optimizing countless algorithms.

Classic Two Sum Problem

Given an array and a target, find two numbers that sum to the target. The brute force O(n²) solution checks every pair. The hash map solution is O(n).

1def two_sum(nums, target):
2 seen = {}
3
4 for i, num in enumerate(nums):
5 complement = target - num
6
7 if complement in seen:
8 return [seen[complement], i]
9
10 seen[num] = i
11
12 return None
13
14nums = [2, 7, 11, 15]
15print(two_sum(nums, 9))
>>>Output
[0, 1]

The key insight: instead of searching for each complement with complement in nums (O(n)), we store seen values in a dictionary for O(1) lookup.

Value-to-Index Mapping

Store values as keys and their indices as values. This pattern solves many "find where" problems.
1def first_occurrence(items, target):
2 """Build lookup once, query many times."""
3 index_of = {item: i for i, item in enumerate(items)}
4
5 if target in index_of:
6 return index_of[target]
7 return -1
8
9fruits = ["apple", "banana", "cherry", "date"]
10print(first_occurrence(fruits, "cherry"))
11print(first_occurrence(fruits, "grape"))
>>>Output
2
-1
Without Hash Map
  • Search each time: O(n)
  • m searches: O(m × n)
  • 1000 searches in 1000 items: 1M operations
With Hash Map
  • Build map once: O(n)
  • Each lookup: O(1)
  • 1000 searches in 1000 items: 2000 operations

Choosing when to use a hash map versus a brute-force scan is one of the most impactful optimization decisions you can make. Build the lookup once and query many times, check for the complement before storing the current value, and use enumerate when you need both index and value.

TIP
Avoid nested loops to check all pairs. Instead, store seen values in a dict and look up complements in O(1). Never rebuild the lookup inside a loop, and always handle missing keys with .get() or in checks.
Debug Challenge

> This two-sum function stores each number in the hash map before checking for its complement, so a number can match itself when its double equals the target.

Returns [0, 0] because 3 matches itself (3 + 3 = 6) since it was added before checking

The ordering of operations inside a loop matters enormously. Checking before storing ensures you only match pairs of distinct elements, not a single element matched against itself.

Hash map lookups run in O(1) average time. This single property lets you replace an O(n²) double loop with an O(n) single loop: process each element once, look up its complement in constant time.

TIP
Whenever you find yourself writing a nested loop to compare all pairs, ask: can I store values in a dict and look up what I need instead? Hash maps are the most common way to turn O(n²) into O(n).

Frequency Counting

Daily Life
Interviews

Count and rank items in one pass

Many problems become trivial once you count occurrences. Frequency maps answer questions like "how many of each?" and "what appears most/least?"

Building Frequency Maps

The basic pattern for counting involves iterating through items and incrementing a counter for each unique value.
1def count_frequencies(items):
2 freq = {}
3 for item in items:
4 if item in freq:
5 freq[item] += 1
6 else:
7 freq[item] = 1
8 return freq
9
10# Shorter version using .get()
11def count_frequencies_v2(items):
12 freq = {}
13 for item in items:
14 freq[item] = freq.get(item, 0) + 1
15 return freq
16
17letters = "mississippi"
18print(count_frequencies(letters))
>>>Output
{'m': 1, 'i': 4, 's': 4, 'p': 2}

Find Most Common Element

Once you have frequencies, finding the most common element is straightforward.
1def most_common(items):
2 freq = {}
3 for item in items:
4 freq[item] = freq.get(item, 0) + 1
5
6 max_count = 0
7 max_item = None
8
9 for item, count in freq.items():
10 if count > max_count:
11 max_count = count
12 max_item = item
13
14 return max_item
15
16votes = ["alice", "bob", "alice", "alice", "bob"]
17print(f"Winner: {most_common(votes)}")
>>>Output
Winner: alice

Check Anagrams

Two strings are anagrams if they have exactly the same character frequencies.
1def are_anagrams(s1, s2):
2 if len(s1) != len(s2):
3 return False
4
5 freq1 = {}
6 freq2 = {}
7
8 for c in s1:
9 freq1[c] = freq1.get(c, 0) + 1
10 for c in s2:
11 freq2[c] = freq2.get(c, 0) + 1
12
13 return freq1 == freq2
14
15print(are_anagrams("listen", "silent"))
16print(are_anagrams("hello", "world"))
>>>Output
True
False
Frequency maps unlock a surprising number of problems beyond simple counting. Once you build the habit of reaching for a dictionary whenever you see "how many" or "which is most common," many problems become almost trivial.
When to Reach for a Frequency Map
  • Finding the most or least common element in a sequence
  • Checking if two collections are anagrams or permutations of each other
  • Detecting duplicates without sorting the data first
  • Grouping items by category and counting each group in a single pass
In practice, Python provides a built-in shortcut for frequency maps.
TIP
Python's collections.Counter class does frequency counting automatically. In interviews, show you understand the concept first, then mention Counter as the Pythonic way.
Fill in the Blank

> The word "banana" needs to be analyzed to find which character appears most often. Pick the right dictionary method and aggregation function to build and query a frequency map.

word = "banana"
freq = {}
for ch in word:
    freq[ch] = freq.(ch, 0) + 1
most = (freq, key=freq.get)
print(f"{most}: {freq[most]}")
Frequency maps are one of the most versatile tools in a programmer's toolkit. Any time a problem asks about counts, rankings, or repetition, a dictionary of counts is likely the right starting structure.

Python's collections.Counter builds frequency maps in a single call and adds useful methods like most_common(). Use it when writing production code, but understand the manual version for interviews and learning.

TIP
Recognizing frequency-counting problems gets easier with practice. Key signals: "most common," "appears more than k times," "check if two strings are anagrams," or "find a majority element."

Prefix Sum Concept

Daily Life
Interviews

Answer range sum queries instantly

A prefix sum array stores running totals. Once built, you can calculate the sum of ANY subarray in O(1) time instead of O(n).

Building Prefix Sums

1def build_prefix_sum(nums):
2 prefix = [0]
3 total = 0
4
5 for num in nums:
6 total += num
7 prefix.append(total)
8
9 return prefix
10
11nums = [3, 1, 4, 1, 5]
12prefix = build_prefix_sum(nums)
13print(f"Original: {nums}")
14print(f"Prefix: {prefix}")
>>>Output
Original: [3, 1, 4, 1, 5]
Prefix: [0, 3, 4, 8, 9, 14]

The prefix array has one more element than the original. Each prefix[i] is the sum of all elements before index i.

Range Sum Queries

To get the sum from index i to j (inclusive), compute prefix[j+1] - prefix[i].

1def range_sum(prefix, i, j):
2 """Sum from index i to j."""
3 return prefix[j + 1] - prefix[i]
4
5nums = [3, 1, 4, 1, 5]
6prefix = [0, 3, 4, 8, 9, 14]
7
8# Sum from index 1 to 3 (1 + 4 + 1 = 6)
9print(f"Sum [1:3] = {range_sum(prefix, 1, 3)}")
10
11# Sum from index 0 to 4 (entire array = 14)
12print(f"Sum [0:4] = {range_sum(prefix, 0, 4)}")
13
14# Sum from index 2 to 2 (just element 4)
15print(f"Sum [2:2] = {range_sum(prefix, 2, 2)}")
>>>Output
Sum [1:3] = 6
Sum [0:4] = 14
Sum [2:2] = 4
Many range sum queries
Many range sum queries
When you need repeated sums over different slices of static data
Finding target subarrays
Finding target subarrays
Locate subarrays whose elements sum to a specific target value
Running averages
Running averages
Calculate the average of any window without re-summing each time
Repeated range operations
Repeated range operations
Any "sum between index i and j" that runs inside a loop
Debug Challenge

> This prefix sum range query uses prefix[j] - prefix[i], but the formula is off by one. It excludes the element at index j from the sum.

Returns 5 instead of 6 for sum of indices 1 to 3 (1 + 4 + 1 = 6)

The prefix array is one index longer than the original so that prefix[0] = 0 acts as a base. This design makes the range sum formula prefix[j+1] - prefix[i] work cleanly for all subarrays, including those starting at index 0.

Prefix sums excel when many range queries run on static data. Build the prefix array once in O(n), then answer each query in O(1). For m queries that is O(n + m) total, far better than O(n × m) with brute force.

TIP
Prefix sums also extend to 2D arrays for rectangle sum queries, and to prefix products or prefix XOR for other problems. The same idea of precomputing cumulative values applies broadly.

Sort as Preprocessing

Daily Life
Interviews

Use sorting to unlock fast algorithms

Sorting is O(n log n), but it unlocks many efficient algorithms. If your problem allows reordering, sorting is often the key to an elegant solution.

Find Duplicates

In a sorted array, duplicates are always adjacent. One pass finds them all.
1def find_duplicates(nums):
2 sorted_nums = sorted(nums)
3 duplicates = []
4
5 for i in range(1, len(sorted_nums)):
6 if sorted_nums[i] == sorted_nums[i - 1]:
7 if not duplicates or duplicates[-1] != sorted_nums[i]:
8 duplicates.append(sorted_nums[i])
9
10 return duplicates
11
12nums = [4, 2, 7, 2, 1, 4, 9, 4]
13print(find_duplicates(nums))
>>>Output
[2, 4]

Merge Intervals

A classic problem: merge overlapping intervals. Sorting by start time makes overlaps easy to detect.
1def merge_intervals(intervals):
2 if not intervals:
3 return []
4
5 # Sort by start time
6 intervals.sort(key=lambda x: x[0])
7
8 merged = [intervals[0]]
9
10 for current in intervals[1:]:
11 last = merged[-1]
12
13 if current[0] <= last[1]:
14 last[1] = max(last[1], current[1])
15 else:
16 merged.append(current)
17
18 return merged
19
20intervals = [[1, 3], [8, 10], [2, 6], [15, 18]]
21print(merge_intervals(intervals))
>>>Output
[[1, 6], [8, 10], [15, 18]]

K Closest Points

Find the k smallest or largest elements by sorting, then taking a slice.

1def k_smallest(nums, k):
2 """Return k smallest elements."""
3 sorted_nums = sorted(nums)
4 return sorted_nums[:k]
5
6def k_largest(nums, k):
7 """Return k largest elements."""
8 sorted_nums = sorted(nums, reverse=True)
9 return sorted_nums[:k]
10
11nums = [7, 3, 9, 1, 5, 8, 2]
12print(f"3 smallest: {k_smallest(nums, 3)}")
13print(f"3 largest: {k_largest(nums, 3)}")
>>>Output
3 smallest: [1, 2, 3]
3 largest: [9, 8, 7]
SEARCHPOINTERSDEDUPMERGESELECT
SEARCH
Binary search
O(log n) lookup on sorted
POINTERS
Two-pointer
Converging scans in O(n)
DEDUP
Adjacent dupes
Duplicates sit side by side
MERGE
Efficient merge
Combine sorted runs in O(n)
SELECT
K-th element
Slice first or last k items
TIP
When you see a problem that seems to require O(n²) comparisons, ask: "Would sorting help?" The O(n log n) sort cost is usually worth it.
Fill in the Blank

> Given the list [10, 3, 7, 4, 20], you need to check whether any two elements are within 1 of each other. Pick a preprocessing step and a comparison operator to solve it efficiently.

def has_close_pair(nums):
    
    for i in range(len(nums) - 1):
        if abs(nums[i] - nums[i + 1])  1:
            return True
    return False

print(has_close_pair([10, 3, 7, 4, 20]))

Sorting as preprocessing is a powerful general strategy. The O(n log n) cost of sorting is often dominated by the O(n) or O(n²) work you save afterward. When an unsorted problem seems to require checking all pairs, sort first and look only at adjacent elements.

The algorithms covered in this lesson all follow the same principle: choose a data structure or preprocessing step that reduces the number of comparisons you need to make at each stage.
Algorithm Pattern Quick Reference
  • Two pointers: sorted array, use opposing or same-direction pointers for O(n)
  • Hash map: store seen values for O(1) complement or membership lookup
  • Frequency count: dict.get(k, 0) + 1 to tally any sequence of items
  • Prefix sum: precompute running totals for O(1) range sum queries
  • Sort first: bring related elements adjacent before a single linear scan
The Deduplication EngineStep 1
>

You are the lead engineer at a health-tech startup that aggregates patient records from 30 clinics. Each clinic submits CSV files nightly, but naming conventions differ: "Robert Smith" vs "Rob Smith" vs "R. Smith" at different addresses. The analytics team reports that 18% of patient IDs appear to be duplicates, inflating treatment counts and skewing billing. The CEO wants a deduplication system ready within two months.

patient_records
record_idnamedobaddressclinic_id
R001Robert Smith1985-03-15123 Oak StC01
R002Rob Smith1985-03-15123 Oak StreetC02
R003R. Smith1985-03-15123 Oak St Apt 2C03
May 2026
Detection Strategy

With 200,000 records from 30 clinics, how should the system identify which records refer to the same patient?

Advanced problem-solving strategies prepare you for the kind of complex, multi-step challenges found in real-world software development. Put your skills to the test with hands-on challenges in the Python Builder.

The deduplication scenario above illustrates a recurring theme: real systems combine multiple algorithmic techniques. Hash-based grouping (O(n)) reduces the comparison space before fuzzy matching, which is the same idea behind choosing the right data structure first.

TIP
When approaching a system design problem, identify the bottleneck operation and apply the appropriate algorithmic pattern: hash map for lookups, sorting for proximity, prefix sums for ranges, and two pointers for converging scans.
PUTTING IT ALL TOGETHER

> You are a senior data engineer at Databricks optimizing a Python pipeline that deduplicates billions of user events per day. Applying the right algorithmic patterns reduces the nightly job from 6 hours to 40 minutes without increasing cluster cost.

The two-pointer technique eliminates a nested loop in the sorted deduplication step, cutting time complexity from O(n²) to O(n) on billion-row event partitions.
Hash map O(1) lookups replace repeated linear scans when checking whether an event ID has already been seen, the core of the deduplication logic.
Frequency counting identifies which event types appear suspiciously often, flagging data quality issues before they corrupt downstream aggregation tables.
Sort as preprocessing reorders events by user_id and timestamp once at O(n log n) so all subsequent grouping passes run in linear time.
KEY TAKEAWAYS
Two pointers: opposite ends or same direction patterns
Hash maps turn O(n) lookups into O(1)
Frequency counting solves "how many" and anagram problems
Prefix sums enable O(1) range sum queries
Sorting (O(n log n)) often enables O(n) algorithms
Recognize these patterns to optimize brute force solutions

Algorithms that unlock efficiency

Category
Python
Difficulty
advanced
Duration
22 minutes
Challenges
0 hands-on challenges

Topics covered: Two-Pointer Technique, Hash Map O(1) Lookups, Frequency Counting, Prefix Sum Concept, Sort as Preprocessing

Lesson Sections

  1. Two-Pointer Technique

    The two-pointer technique uses two indices that move through an array in a coordinated way. It's powerful for problems involving pairs, palindromes, or partitioning. Opposite Ends Pattern Start one pointer at the beginning and one at the end. Move them toward each other based on some condition. Two Sum with Sorted Array Same Direction Pattern Both pointers start at the beginning. A "fast" pointer explores while a "slow" pointer marks a position. This pattern applies whenever sorted data lets you

  2. Hash Map O(1) Lookups

    Classic Two Sum Problem Value-to-Index Mapping Store values as keys and their indices as values. This pattern solves many "find where" problems. The ordering of operations inside a loop matters enormously. Checking before storing ensures you only match pairs of distinct elements, not a single element matched against itself.

  3. Frequency Counting

    Many problems become trivial once you count occurrences. Frequency maps answer questions like "how many of each?" and "what appears most/least?" Building Frequency Maps The basic pattern for counting involves iterating through items and incrementing a counter for each unique value. Find Most Common Element Once you have frequencies, finding the most common element is straightforward. Check Anagrams Two strings are anagrams if they have exactly the same character frequencies. Frequency maps unloc

  4. Prefix Sum Concept (concepts: pyPrefixSum)

    Building Prefix Sums Range Sum Queries

  5. Sort as Preprocessing

    Find Duplicates In a sorted array, duplicates are always adjacent. One pass finds them all. Merge Intervals A classic problem: merge overlapping intervals. Sorting by start time makes overlaps easy to detect. K Closest Points The algorithms covered in this lesson all follow the same principle: choose a data structure or preprocessing step that reduces the number of comparisons you need to make at each stage. Advanced problem-solving strategies prepare you for the kind of complex, multi-step chal