# The Most Specific Rule Wins

> Every packet asks the same question of the firewall. Answer it the way the table intends.

Canonical URL: <https://datadriven.io/problems/okta-firewall-rule-precedence>

Domain: Python · Difficulty: medium · Seniority: mid

## Problem

Your edge proxy screens inbound traffic against a firewall table, where each rule pairs a CIDR block with either `allow` or `block`. A source IP's verdict comes from the most specific rule whose range contains it: when two equally specific rules both match, the one later in the table wins, and an address no rule covers is allowed through. Walk the request log and return each source IP's verdict in arrival order.

## Worked solution and explanation

### Why this problem exists in real interviews

This is longest-prefix match, the same lookup every router and firewall runs on every packet, wearing an allow/block costume. The skill being probed is whether you treat a CIDR as an integer-and-mask comparison rather than a string prefix, and whether you get the precedence rules exactly right: more specific beats less specific regardless of table order, equal specificity falls to the later rule, and an unmatched address defaults open. Candidates who sort by specificity first usually fumble the tie-break; candidates who do string matching on '192.168.' break the moment a prefix is not a multiple of 8.

---

### Break down the requirements

#### Step 1: Turn each dotted quad into a 32-bit integer

Split on '.', shift each octet into place: (a << 24) | (b << 16) | (c << 8) | d. Now 'contains' is pure arithmetic, no string games, and a /17 or /23 prefix is no harder than a /8.

#### Step 2: Precompute a mask per rule

A /p prefix means the top p bits matter. Build mask = (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF, and store network & mask. An IP matches when ip & mask equals that masked network. Note /0 gives mask 0, which correctly matches everything (the default-route trick).

#### Step 3: Pick the winner with a single comparison

For each request, scan rules tracking the best prefix length seen so far. Accept a matching rule when its prefix is greater than OR EQUAL to the current best. Greater handles 'more specific wins'; equal, because you scan in table order, hands ties to the later rule. Start best at -1 and the decision at 'allow' so an unmatched IP falls through to the default.

---

### The solution

**Integer-and-mask longest-prefix match with order-aware ties**

```python
def filter_requests(rules, requests):
    def to_int(ip):
        a, b, c, d = (int(part) for part in ip.split('.'))
        return (a << 24) | (b << 16) | (c << 8) | d

    parsed = []
    for cidr, action in rules:
        network, prefix = cidr.split('/')
        prefix = int(prefix)
        mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF if prefix else 0
        parsed.append((to_int(network) & mask, mask, prefix, action))

    results = []
    for ip in requests:
        ip_int = to_int(ip)
        best_prefix = -1
        decision = 'allow'
        for network, mask, prefix, action in parsed:
            if ip_int & mask == network and prefix >= best_prefix:
                best_prefix = prefix
                decision = action
        results.append(decision)
    return results
```

> **Complexity**
>
> Parsing is O(R) over the rule table. Each request scans all rules, so the query phase is O(Q * R). Space is O(R) for the parsed table. A real firewall with thousands of rules and millions of requests would replace the linear scan with a binary trie (Patricia/radix) for O(prefix-length) lookups, but at interview scale the flat scan is the clear, correct baseline.

**String prefix matching (broken)**

Checking ip.startswith('192.168.') only works when the prefix lands on a dot boundary (/8, /16, /24). A /20 or /27 rule silently never matches, and '10.1.1.1' wrongly matches a rule meant for '10.11.0.0/16'. It looks fine on the easy tests and fails in production.

**Integer mask (correct)**

ip & mask == network compares exactly the top p bits for any p in 0..32. Boundary-independent, and the same three lines handle /0 (match all) and /32 (single host) with no special cases.

> **Common Pitfall**
>
> Using strictly greater than (prefix > best_prefix) for the tie-break. With two equally specific rules, the first one then wins and stays, so a later override never takes effect. The spec says the later rule wins on a tie, which is exactly what >= buys you because the scan is in table order. The other classic miss is initializing the decision to 'block' instead of 'allow', which inverts the default-allow contract.

> **Interviewers Watch For**
>
> Stating the precedence model out loud before coding (most specific wins, ties to later, default allow), reaching for integer masks rather than string slicing, and naming /0 as the default route and /32 as a single host. A strong candidate also asks whether rules can be IPv6 or whether the table is hot-reloaded, signaling they have seen this in a real proxy.

---

## Common follow-up questions

- The table has 50,000 rules and you serve a million requests a second. How do you make lookups sublinear in the rule count? _(Tests knowledge of Patricia/radix tries for longest-prefix match and why a flat scan stops scaling.)_
- How would you extend this to IPv6 without rewriting the matching logic? _(Tests generalizing 32-bit masks to 128-bit integers and keeping the mask comparison engine-agnostic.)_
- Rules now arrive as a live stream of additions and deletions. How do you keep the lookup structure consistent without blocking reads? _(Tests incremental trie updates, copy-on-write snapshots, or versioned tables for concurrent read/update.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/okta-firewall-rule-precedence)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.