# Adopters Before Migration

> They used the old feature. Did they ever touch the new one?

Canonical URL: <https://datadriven.io/problems/adopters_before_migration>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

The product team is measuring organic adoption of the new editor versus the classic editor. In the page views data, visits to the new editor appear under the URL 'new_editor' and visits to the classic editor under 'classic_editor'. Identify users who visited the new editor but had never visited the classic editor before their first new editor visit. Return their user IDs.

## Worked solution and explanation

### What this is really asking

Anti-join `page_views` against itself: outer side is MIN(viewed_at) per user where page_url='new_editor', inner side is any 'classic_editor' row strictly older than that. Correlated NOT EXISTS, no window function.

---

### Break down the requirements

#### Step 1: Pin each user's first new-editor visit

GROUP BY user_id over page_url='new_editor', taking MIN(viewed_at). Users who never touched the new editor drop out here.

#### Step 2: Reject if any classic visit came earlier

Correlated NOT EXISTS on page_url='classic_editor' AND viewed_at < first_new_date. Strict `<`, not `<=`, so an exact-tie classic row does not disqualify.

---

### The solution

**ADOPTERS BEFORE MIGRATION**

```sql
WITH first_new AS (
  SELECT user_id, MIN(viewed_at) AS first_new_date
  FROM page_views
  WHERE page_url = 'new_editor'
  GROUP BY user_id
)
SELECT fn.user_id
FROM first_new fn
WHERE NOT EXISTS (
  SELECT 1
  FROM page_views ua
  WHERE ua.user_id = fn.user_id
    AND ua.page_url = 'classic_editor'
    AND ua.viewed_at < fn.first_new_date
);
```

> **Cost Analysis**
>
> 800M rows, partitioned by viewed_at. The CTE scans only 'new_editor' partitions; the NOT EXISTS probe wants an index on (user_id, page_url, viewed_at).

> **Interviewers Watch For**
>
> A one-pass MIN per (user_id, page_url) then compare breaks for users with zero classic rows: classic MIN is NULL, comparison is UNKNOWN, row silently dropped instead of kept.

> **Common Pitfall**
>
> Using `<=` disqualifies users whose classic and new visits share an exact timestamp (batch import). The brief says 'before', so strict `<` is correct.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you surface users who tried the new editor and bounced back to classic afterward? _(Flip the NOT EXISTS: classic_editor row with viewed_at > first_new_date.)_
- What changes if viewed_at is a date, not a timestamp? _(Same-day classic and new visits collapse; strict `<` now drops genuine adopters.)_
- How would you make this incremental for a daily job? _(Persist first_new_date per user once frozen, then re-evaluate only users whose first_new_date lands in today's partition.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/adopters_before_migration)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.