# Feature Vote Winner

> Users voted with their clicks. Who won?

Canonical URL: <https://datadriven.io/problems/feature_vote_winner>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

We have a feature-voting system where each user splits a single vote equally among all the flags they voted for. Some users abstained (null entries). Calculate the effective vote tally for each flag and find the winner. Include ties. Return the flag name and effective vote total.

## Worked solution and explanation

### Why this problem exists in real interviews

There is no votes table. `feat_flags.owner` doubles as voter identity, so a user owning three flags gives each of those flags one third of a vote. Once the table is self-describing the ballot, the rest is reciprocal weighting and a tie-safe max.

---

### Break down the requirements

#### Step 1: Count each voter's ballot size

For every non-null `owner`, count flags owned. That count is the denominator. NULL owners are abstentions and never reach the per-voter aggregate.

#### Step 2: Weight each flag by 1 over ballot size

Force float arithmetic with `1.0 / CAST(num_votes AS REAL)`. Integer division yields 0 for any voter with two or more picks, collapsing the whole tally.

#### Step 3: Pick the max and keep ties

A scalar subquery `(SELECT MAX(effective_votes) FROM weighted)` lets multiple winners share the top. `ORDER BY ... LIMIT 1` would silently drop ties.

---

### The solution

**FEATURE VOTE WINNER**

```sql
WITH voter_counts AS (
  SELECT owner, COUNT(*) AS num_votes
  FROM feat_flags
  WHERE owner IS NOT NULL
  GROUP BY owner
),
weighted AS (
  SELECT f.flag_name,
         SUM(1.0 / CAST(vc.num_votes AS REAL)) AS effective_votes
  FROM feat_flags f
  JOIN voter_counts vc ON f.owner = vc.owner
  WHERE f.owner IS NOT NULL
  GROUP BY f.flag_name
)
SELECT flag_name, effective_votes
FROM weighted
WHERE effective_votes = (SELECT MAX(effective_votes) FROM weighted)
ORDER BY flag_name
```

> **Cost Analysis**
>
> At 400 rows the plan is irrelevant. At millions of flags, swap the MAX subquery for `RANK() OVER (ORDER BY effective_votes DESC)` and keep `WHERE rnk = 1`: one pass, still tie-safe.

> **Interviewers Watch For**
>
> Two things: whether you spot that `owner` is the voter, and whether you cast to REAL before dividing. Forgetting the cast on SQLite returns floored integers; the bug only shows when num_votes is 2 or more.

> **Common Pitfall**
>
> `=` on summed reciprocals is floating point equality. `1.0/3 + 1.0/3 + 1.0/3` is not exactly `3.0/3`. Low risk at 400 rows; on denser ballots prefer `ABS(effective_votes - max_votes) < 1e-9` or rational arithmetic.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- Rewrite without a scalar subquery, using window functions instead. _(Tests whether the engineer can replace `MAX()` filtering with `RANK() OVER (ORDER BY ... DESC)` and still preserve ties.)_
- What if a separate `flag_votes(user_id, flag_id)` table existed? _(Probes normalization instincts and whether the engineer can refactor the same weighting logic across a proper many-to-many schema.)_
- How would you report the runner-up, including ties for second place? _(Forces the conversation from `MAX` toward `DENSE_RANK` and clarifies the difference between `RANK` and `DENSE_RANK` when ties exist.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/feature_vote_winner)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.