# Find Deploy Authors

> Same person. Many different spellings.

Canonical URL: <https://datadriven.io/problems/find_deploy_authors>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The release manager is auditing alice's deployment history but her name appears with inconsistent casing across the logs. Surface all unique author name variants matching 'alice', regardless of capitalization.

## Worked solution and explanation

### Why this problem exists in real interviews

Free-text identity columns always fragment: the same person shows up as 'Alice', 'alice', and 'ALICE' across hundreds of thousands of rows. Audit queries over these fields test whether you reach for a portable case-insensitive match and remember to collapse the spellings before returning them. ILIKE does not exist in SQLite, so the candidate has to know the LOWER trick.

> **Trick to Solving**
>
> "Return every spelling of Alice" means three things stacked: 1) match case-insensitively by lowering the column, 2) collapse duplicate spellings with DISTINCT, 3) sort deterministically so the grader's row order is stable.

---

### Break down the requirements

#### Step 1: Match case-insensitively

SQLite has no ILIKE. The portable pattern is `WHERE LOWER(author) = 'alice'`. Lowering the literal too keeps intent obvious.

#### Step 2: Collapse duplicate spellings

`deploy_logs` has 500,000 rows and one person can ship many deploys, so `SELECT DISTINCT author` returns one row per surviving spelling ('Alice', 'alice', 'ALICE') rather than one row per deploy.

#### Step 3: Order the output

`ORDER BY author` makes the row order deterministic. SQLite's default binary collation puts 'ALICE' before 'Alice' before 'alice', which is fine because every returned value already satisfies the filter.

---

### The solution

**Case-insensitive distinct lookup**

```sql
SELECT DISTINCT author
FROM deploy_logs
WHERE LOWER(author) = 'alice'
ORDER BY author
```

> **Cost Analysis**
>
> `deploy_logs` has 500,000 rows. Wrapping `author` in `LOWER()` blocks any plain index on the column, so this is a full scan. At 500K rows that is still sub-second in SQLite. On a hot path you would add an expression index on `LOWER(author)` or a generated column.

> **Interviewers Watch For**
>
> Strong candidates produce three things without being prompted: `LOWER()` on the column (portable), `DISTINCT` (collapses spellings), and an explicit `ORDER BY` (deterministic). The best also flag that `LOWER(author)` defeats index use and propose an expression index.

> **Common Pitfall**
>
> `WHERE author ILIKE 'alice'` fails in SQLite (no such operator). `WHERE author = 'alice'` silently misses 'Alice' and 'ALICE'. Forgetting `DISTINCT` returns one row per matching deploy log instead of one row per spelling.

---

## Common follow-up questions

- How would you make this query index-friendly on a hot path? _(SQLite supports expression indexes: `CREATE INDEX ix ON deploy_logs(LOWER(author))` lets the planner do an equality seek instead of a scan.)_
- What if you needed authors whose name starts with 'al' instead of equals 'alice'? _(`LOWER(author) LIKE 'al%'` can use an expression index because the wildcard is trailing. A leading wildcard like `%al` cannot.)_
- How would you return deploy counts per matched author rather than just names? _(Swap `DISTINCT` for `GROUP BY author` and add `COUNT(*)`. The `LOWER()` filter still applies.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/find_deploy_authors)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.