# Long-Running Feature Flags

> Flags that have been on for too long.

Canonical URL: <https://datadriven.io/problems/long_running_feature_flags>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

We're running a feature flag hygiene audit. Find flags created more than 730 days before May 1, 2026. For each, show the flag name, owner, whether it's still enabled ('Yes' or 'No'), and how many years it's been since creation. Flags with no updated timestamp should be treated as still active.

## Worked solution and explanation

### Why this problem exists in real interviews

This probes your fluency with window functions and partition boundaries, and your ability to decompose multi-step logic into readable CTEs. It is a common at L4+ screen because it requires composing multiple SQL features correctly in a single pass, and small mistakes in logic produce silently wrong results.

> **Trick to Solving**
>
> Read the prompt carefully for implicit constraints. The phrase structure hints at the grain of the output: what each row represents.
> 
> 1. Identify the output grain from the prompt (one row per what?)
> 2. Work backward from the desired output columns
> 3. Build the query inside-out: innermost subquery first, then layer on filters and aggregates

---

### Break down the requirements

#### Step 1: Filter to the target rows

Apply the `WHERE` filter to restrict the working set before aggregation. Filtering early reduces the number of rows that downstream operations process.

#### Step 2: Order the final output

Apply `ORDER BY` as specified to produce the expected row sequence. When tied values exist, add a secondary sort column for determinism.

---

### The solution

**Date arithmetic with conditional formatting**

```sql
SELECT flag_name, owner,
    CASE WHEN enabled = 1 THEN 'Yes' ELSE 'No' END AS still_enabled,
    ROUND((JULIANDAY('2025-05-01') - JULIANDAY(created)) / 365.0, 1) AS years_since_creation
FROM feat_flags
WHERE JULIANDAY('2025-05-01') - JULIANDAY(created) > 730
ORDER BY years_since_creation DESC
```

> **Cost Analysis**
>
> The query scans 600 rows from `feat_flags`. The window function requires a sort, which is O(n log n). Pre-aggregating reduces the sort input. CTEs in most engines are optimization fences. For production workloads, consider inlining or materializing the intermediate results.

> **Interviewers Watch For**
>
> Strong candidates explain their choice of window function (`ROW_NUMBER` vs `RANK` vs `DENSE_RANK`) and why it matches the tie semantics. Breaking complex logic into named CTEs shows the interviewer you prioritize readability and debuggability.

> **Common Pitfall**
>
> Using `ROWS` vs `RANGE` in the window frame produces different results when ties exist. Default to `ROWS` unless you specifically need tie grouping.

---

## Common follow-up questions

- Could you solve this without a window function? _(A self-join or correlated subquery is valid but typically slower.)_
- How would the query change if data volume increased 10x? _(Tests consideration of index strategy, partitioning, or materialized views.)_
- What if you needed to handle timezone differences across regions? _(Tests awareness of timestamp normalization in production.)_
- How would you validate that this query is correct on a new dataset? _(Tests approach to verification: spot checks, known aggregates, or row-count sanity.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/long_running_feature_flags)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.