# Experiment Variant Ratios

> Control versus treatment. The participation split.

Canonical URL: <https://datadriven.io/problems/experiment_variant_ratios>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

For each experiment, show the number of users in control, the number in treatment, and the ratio of treatment to control. If an experiment has no control users, the ratio should be null.

## Worked solution and explanation

### Why this problem exists in real interviews

Variant ratio analysis checks whether an experiment's groups are balanced as intended. The problem tests per-experiment aggregation with conditional counting and a division that is safe when the denominator is zero.

---

### Break down the requirements

#### Step 1: Count users per variant per experiment

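Use `GROUP BY exp_name` with one conditional count per variant: `COUNT(DISTINCT CASE WHEN variant = 'control' THEN user_id END)` for control, and the same expression with `'treatment'` for treatment. Rows from the other variant fall through the `CASE` as NULL, which `COUNT` ignores.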
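
As an alternative sketch of the same conditional count, PostgreSQL also accepts a `FILTER` clause on aggregates. The inline rows below are hypothetical sample data, not part of the problem; only the column names (`exp_name`, `variant`, `user_id`) come from the solution query. Other engines may not support `FILTER`, in which case the `CASE` form is the portable choice.

```sql
-- Hypothetical sample rows; the CTE stands in for the real experiments table.
WITH experiments (exp_name, variant, user_id) AS (
    VALUES
        ('checkout_button', 'control',   1),
        ('checkout_button', 'control',   1),  -- repeat row for user 1
        ('checkout_button', 'control',   2),
        ('checkout_button', 'treatment', 3),
        ('new_onboarding',  'treatment', 4)   -- experiment with no control users
)
SELECT exp_name,
       -- FILTER restricts which rows feed each aggregate
       COUNT(DISTINCT user_id) FILTER (WHERE variant = 'control')   AS control_users,
       COUNT(DISTINCT user_id) FILTER (WHERE variant = 'treatment') AS treatment_users
FROM experiments
GROUP BY exp_name
ORDER BY exp_name;
```

With these rows, `checkout_button` has 2 control users and 1 treatment user, and `new_onboarding` has 0 control users and 1 treatment user.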

#### Step 2: Compute the ratio

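Divide the treatment count by the control count. Wrap the denominator in `NULLIF(..., 0)` so that an experiment with no control users yields NULL instead of a division-by-zero error, and multiply the numerator by `1.0` so the division is not done on integers.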
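
The two guards can be seen in isolation with literal values. This is a minimal illustration, assuming PostgreSQL semantics, where dividing two integers truncates the result.

```sql
SELECT 3 / 2                  AS integer_division,  -- truncates to 1
       3 * 1.0 / 2            AS numeric_division,  -- 1.5 as a numeric value
       3 * 1.0 / NULLIF(0, 0) AS guarded_division;  -- NULL rather than an error
```

`NULLIF(a, b)` returns NULL when `a = b` and `a` otherwise, so a zero control count turns the whole ratio into NULL, which is what the problem asks for.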

---

### The solution

**Conditional counts with ratio computation**

```sql
SELECT exp_name,
       -- Distinct users per variant; rows from the other variant become NULL and are ignored
       COUNT(DISTINCT CASE WHEN variant = 'control' THEN user_id END) AS control_users,
       COUNT(DISTINCT CASE WHEN variant = 'treatment' THEN user_id END) AS treatment_users,
       ROUND(
           -- * 1.0 avoids integer division; NULLIF yields NULL when there are no control users
           COUNT(DISTINCT CASE WHEN variant = 'treatment' THEN user_id END) * 1.0 /
           NULLIF(COUNT(DISTINCT CASE WHEN variant = 'control' THEN user_id END), 0), 3
       ) AS treatment_control_ratio
FROM experiments
GROUP BY exp_name
ORDER BY exp_name;
```

> **Cost Analysis**
>
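> One scan of `experiments` with a single `GROUP BY`. Each `COUNT(DISTINCT ...)` must deduplicate user IDs within its group, which adds sorting or hashing work, but this is rarely a concern when the experiment table is small.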

> **Interviewers Watch For**
>
> Strong candidates flag imbalanced ratios (e.g., 70/30 when a 50/50 split was intended) as a possible assignment or data quality issue.

> **Common Pitfall**
>
> Counting rows (for example with `COUNT(*)`) instead of `COUNT(DISTINCT user_id)` counts events, not users. If the table holds more than one row per user, a single user who generates many events skews the ratio.
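
To make the pitfall concrete, the sketch below compares row counts with distinct-user counts on hypothetical sample data in which one control user has three rows. The inline rows are an assumption for illustration only.

```sql
-- Hypothetical sample rows: user 1 appears three times in control.
WITH experiments (exp_name, variant, user_id) AS (
    VALUES
        ('checkout_button', 'control',   1),
        ('checkout_button', 'control',   1),
        ('checkout_button', 'control',   1),
        ('checkout_button', 'control',   2),
        ('checkout_button', 'treatment', 3),
        ('checkout_button', 'treatment', 4)
)
SELECT exp_name,
       COUNT(CASE WHEN variant = 'control' THEN 1 END)                  AS control_rows,     -- 4
       COUNT(DISTINCT CASE WHEN variant = 'control' THEN user_id END)   AS control_users,    -- 2
       COUNT(CASE WHEN variant = 'treatment' THEN 1 END)                AS treatment_rows,   -- 2
       COUNT(DISTINCT CASE WHEN variant = 'treatment' THEN user_id END) AS treatment_users   -- 2
FROM experiments
GROUP BY exp_name;
```

Row counts suggest a treatment-to-control ratio of 2/4 = 0.5, while distinct users give 2/2 = 1.0.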

---

## Common follow-up questions

- What ratio imbalance would concern you? _(Tests sample ratio mismatch (SRM) awareness.)_
- How would you detect statistically significant imbalance? _(Tests chi-squared test knowledge.)_
- What if users could switch variants? _(Tests intent-to-treat analysis awareness.)_
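
For the significance follow-up, one possible sketch is a chi-squared goodness-of-fit statistic computed directly in SQL. It assumes the intended split is 50/50, uses hypothetical per-variant counts rather than real data, and compares against 3.841, the chi-squared critical value for one degree of freedom at the 0.05 level. A stricter threshold may be more appropriate in practice.

```sql
-- Hypothetical distinct-user counts per experiment (c = control, t = treatment).
WITH counts (exp_name, c, t) AS (
    VALUES
        ('checkout_button', 5000, 5000),
        ('new_onboarding',  5200, 4800)
)
SELECT exp_name,
       c,
       t,
       -- With expected = (c + t) / 2 per group, the statistic
       -- (c - expected)^2 / expected + (t - expected)^2 / expected
       -- simplifies to (c - t)^2 / (c + t).
       ROUND((c - t) * (c - t) * 1.0 / NULLIF(c + t, 0), 3) AS chi_squared,
       (c - t) * (c - t) * 1.0 / NULLIF(c + t, 0) > 3.841   AS exceeds_critical_005
FROM counts
ORDER BY exp_name;
```

With these counts, `checkout_button` scores 0 and `new_onboarding` scores 16, so only the second exceeds the threshold.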

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.