# The Plan That Changed Twice This Month

> Subscribers come, go, downgrade, and share. The schema has to keep up.

Canonical URL: <https://datadriven.io/problems/the_plan_that_changed_twice_this_month>

Domain: Data Modeling · Difficulty: medium · Seniority: L5

## Problem

A subscription streaming service needs a data warehouse that tracks subscriber lifecycle events, viewing activity, and plan changes. The business measures churn, downgrade rates, and content engagement. Design the schema that supports all three.

## Worked solution and explanation

### Why this problem exists in real interviews

This probes whether you can model a lifecycle that mutates more than once in a single accounting period. The signal is choosing an event-grain `fact_subscriptions` so a start, a downgrade, and a cancel inside the same month are three separate rows, each reconcilable to a dated plan snapshot.

> **Trick to Solving**
>
> Before drawing any tables, a strong candidate asks: "how many state changes can happen to one subscription in one billing period, and do we need to reconstruct the state at any past timestamp?" If the answer is "more than one," the fact grain has to be the event, not the period.
> 
> 1. Declare `fact_subscriptions` grain = one row per lifecycle event
> 2. Keep plan attributes stable on `dim_plan`; put changes on the event row
> 3. Separate viewing activity from subscription events
> 4. Reserve GDPR-safe PII to `dim_subscriber` only

---

### Break down the requirements

#### Step 1: Model lifecycle as events, not states

`fact_subscriptions` records `event_type` (start, pause, resume, plan_change, cancel) with `mrr_delta`. Two plan changes in one month become two rows with additive deltas.

#### Step 2: Keep dim_plan flat

Plan price and ad support live on `dim_plan`. A plan change references a different `plan_sk` on the next event row rather than versioning the dimension itself.

#### Step 3: Put viewing on its own fact

`fact_viewing` has one row per playback with `watch_seconds` and `ad_break_count`. This keeps content engagement decoupled from subscription lifecycle.

#### Step 4: Anchor subscriber attributes as stable

`dim_subscriber` stores country, acquisition channel, and cohort. Plan-level attributes never live here; they move via the event stream.

#### Step 5: Use dim_date as the conformed calendar

Both facts join `dim_date` so monthly churn and engagement share a single calendar.

---

### The solution

Below is one conceptually sound model. The grain on `fact_subscriptions` is the event, which lets multiple state changes coexist in one period.

> **Why this design holds up**
>
> Event-grain facts make the monthly churn and downgrade counts sum-of-rows rather than a window function over state. Net MRR for any period is just `SUM(mrr_delta)` bounded by the window. A plan change that reverses itself a week later still nets to zero.

> **What strong candidates do**
>
> They pick event grain the moment they hear "changed twice this month." They explain why SCD Type 2 on `dim_plan` is the wrong tool for a customer-side state change. They separate `fact_viewing` so engagement analysis is not blocked by lifecycle joins.

> **Red flags to avoid**
>
> Modeling plan changes as Type 2 on `dim_plan` conflates "the customer changed plans" with "the plan itself changed price." Flattening lifecycle into one snapshot row per subscription erases the second change in the month. Attaching viewing data to the same fact forces nulls on subscription events.

---

### The analysis pattern

**Monthly churn, downgrades, and net MRR by tier**

```sql
SELECT
    d.month,
    p.tier,
    COUNT(*) FILTER (WHERE e.event_type = 'cancel') AS churns,
    COUNT(*) FILTER (WHERE e.event_type = 'downgrade') AS downgrades,
    SUM(e.mrr_delta) AS net_mrr
FROM fact_subscriptions e
JOIN dim_plan p ON p.plan_sk = e.plan_sk
JOIN dim_date d ON d.date_key = CAST(TO_CHAR(e.event_ts, 'YYYYMMDD') AS INT)
GROUP BY d.month, p.tier
```

---

### Trade-offs and alternatives

**Event-grain subscription fact**

Multiple intra-period changes are first-class rows. Additive aggregation. Point-in-time state requires a window function.

**Periodic snapshot fact**

One row per subscription per day. Point-in-time queries become trivial filters. Storage grows linearly with active subs. Multi-event days compress into the last observed state.

---

## Common follow-up questions

- How would you compute the exact plan on each day of the month when a subscriber changed plans twice? _(Tests as-of reconstruction from an event stream without SCD Type 2.)_
- Where does ad revenue fit if the ad-supported tier pays differently per impression than per subscription? _(Tests whether the candidate adds a separate `fact_ad_impressions` fact.)_
- A subscriber requests GDPR deletion. Which tables are touched and which retain event history? _(Tests PII isolation on `dim_subscriber` and retention of financial events.)_
- How would you materialize a daily active-subscriber snapshot from the event fact? _(Tests derivation of periodic snapshots from transaction-grain facts.)_
- If viewing volume hits 500M rows a day, how would you partition `fact_viewing`? _(Tests partition key choice and its effect on engagement queries.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_plan_that_changed_twice_this_month)
- [Data Modeling Interview Questions](https://datadriven.io/data-modeling-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.