# Feature Flag Fan vs Detractor Pairs

> Some users love the flag. Others want it gone.

Canonical URL: <https://datadriven.io/problems/feature_flag_fan_vs_detractor_pairs>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

Rank enabled users by rollout percentage descending, and disabled users by rollout ascending. Pair the #1 fan with the #1 detractor, #2 with #2, and so on. Break ties by flag ID ascending, and show both user IDs.

## Worked solution and explanation

### Why this problem exists in real interviews

This probes whether you know that `RANK` and `DENSE_RANK` will silently break a positional join. Anyone can write two ranked CTEs and join them. The interviewer wants you to reach for `ROW_NUMBER` specifically, and to add an explicit tiebreaker so the same query returns the same pairings between runs. Non-deterministic output is a real ding.

---

### Break down the requirements

#### Step 1: Split the universe in two

Two CTEs over `feat_flags`, one with `WHERE enabled = 1`, one with `WHERE enabled = 0`. Same table, two filtered slices. Don't try to do this in a single window with `PARTITION BY enabled`, you'd still need to align positions across partitions.

#### Step 2: Rank each side oppositely

Fans: `ROW_NUMBER() OVER (ORDER BY rollout DESC, flag_id ASC)`. Detractors: `ROW_NUMBER() OVER (ORDER BY rollout ASC, flag_id ASC)`. The `flag_id ASC` tiebreaker is required, not optional, since `rollout` is not unique.

#### Step 3: Zip on the rank

`INNER JOIN ... ON f.rn = o.rn`. The join key is the position, not any business column. This is the move people miss: rank becomes a synthetic primary key for alignment.

#### Step 4: Order the output

Final `ORDER BY f.rn` so pair #1 is on top. Without it the planner may surface pairs in hash-join order, which looks random to a reviewer.

---

### The solution

**POSITIONAL ZIP VIA ROW_NUMBER**

```sql
WITH fans AS (
  SELECT owner,
         ROW_NUMBER() OVER (ORDER BY rollout DESC, flag_id ASC) AS rn
  FROM feat_flags
  WHERE enabled = 1
),
opponents AS (
  SELECT owner,
         ROW_NUMBER() OVER (ORDER BY rollout ASC, flag_id ASC) AS rn
  FROM feat_flags
  WHERE enabled = 0
)
SELECT f.owner AS fan_owner,
       o.owner AS opponent_owner
FROM fans f
INNER JOIN opponents o ON f.rn = o.rn
ORDER BY f.rn;
```

> **Cost Analysis**
>
> 600 rows total, so plan cost barely matters here, but the shape is what scales. Each CTE does one filtered scan plus a sort on (rollout, flag_id). The join is a hash or merge on the `rn` integer, dense from 1..N, basically free. At million-row scale you'd want a covering index on `(enabled, rollout, flag_id)` so each side becomes an index-only scan and the sort drops out.

> **Interviewers Watch For**
>
> Before you write anything, ask: what if the two sides have different counts? `INNER JOIN` silently drops the tail of the longer side. If interviewers want all fans listed even when detractors run out, that's a `LEFT JOIN` from fans with `NULL` opponents. Naming this tradeoff out loud is the signal.

> **Common Pitfall**
>
> Reaching for `RANK()` or `DENSE_RANK()` instead of `ROW_NUMBER()`. The moment two flags tie on `rollout`, both get the same rank, the join becomes many-to-many on that rank, and your output Cartesian-multiplies. A 3-way tie on each side turns one expected pair into nine rows. `ROW_NUMBER` is the only ranking function that guarantees 1:1.

**DENSE_RANK with a 3-way tie at rollout=80**

fans:  (A,1) (B,1) (C,1) (D,2)
detractors: (W,1) (X,1) (Y,1) (Z,2)
Join on rank=1 produces 9 rows: A-W, A-X, A-Y, B-W, B-X, B-Y, C-W, C-X, C-Y. Not what you want.

**ROW_NUMBER with flag_id tiebreaker**

fans:  (A,1) (B,2) (C,3) (D,4)
detractors: (W,1) (X,2) (Y,3) (Z,4)
Join on rn produces 4 rows: A-W, B-X, C-Y, D-Z. Clean 1:1 zip, deterministic across runs.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- What changes if fans and detractors have different counts and the product wants every fan represented? _(Tests whether the candidate sees `INNER JOIN` as data-lossy and reaches for `LEFT JOIN` from the side that must survive.)_
- How would you adapt this to pair the top-N fans against the top-N detractors per `owner` team? _(Probes whether they can add `PARTITION BY owner` to both windows and still keep the positional join semantics intact.)_
- If `rollout` had no tiebreaker available at all, how would you make the result deterministic? _(Forces the conversation about whether you fabricate a tiebreaker (`flag_name`, hash of row) or accept non-determinism and document it.)_
- Why not `FULL OUTER JOIN` on `rn` and report unpaired rows separately? _(Tests engine awareness (SQLite lacks native `FULL OUTER` until 3.39) and product framing of what 'unpaired' means in this report.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/feature_flag_fan_vs_detractor_pairs)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.