# Authors Deploying to Dev and Production

> Dev, staging, production. Who has touched all three?

Canonical URL: <https://datadriven.io/problems/authors_deploying_to_dev_and_production>

Domain: SQL · Difficulty: medium · Seniority: L4

## Problem

The release engineering team wants to identify authors who ship to both the dev and production environments (case-insensitive match). Show each qualifying author and how many of those two environments they have deployed to, listed alphabetically.

## Worked solution and explanation

### What this is really asking

`LOWER(env_name)` does double duty: filters 500k rows to 'dev' and 'production' variants, and deduplicates inside COUNT DISTINCT. Skip LOWER and 'Dev' or 'PRODUCTION' rows silently drop out.

---

### Break down the requirements

#### Step 1: Case-insensitive filter

Normalize env_name with LOWER and keep only the two target values. Raw comparison misses 'Dev' or 'PRODUCTION' rows.

#### Step 2: Require BOTH envs per author

GROUP BY author, then HAVING COUNT(DISTINCT LOWER(env_name)) = 2. DISTINCT stops an author with 50 dev deploys passing on volume alone.

#### Step 3: Report and sort

Return author and env_count (always 2 after HAVING), alphabetized.

---

### The solution

**AUTHORS DEPLOYING TO DEV AND PRODUCTION**

```sql
SELECT author,
       COUNT(DISTINCT LOWER(env_name)) AS env_count
FROM deploy_logs
WHERE LOWER(env_name) IN ('dev', 'production')
GROUP BY author
HAVING COUNT(DISTINCT LOWER(env_name)) = 2
ORDER BY author;
```

> **Cost Analysis**
>
> 500k rows is small, but LOWER(env_name) in WHERE kills any plain index on env_name. For an hourly job, add a functional index on LOWER(env_name).

> **Interviewers Watch For**
>
> Whether you reach for LOWER unprompted, and whether you use COUNT DISTINCT vs COUNT(*). Bonus for noting env_count is always 2 post-HAVING.

> **Common Pitfall**
>
> HAVING COUNT(*) = 2 passes a clean fixture and breaks the first time someone deploys to dev twice. Always DISTINCT when you mean 'how many categories'.

> **The False Start**
>
> First instinct is `WHERE env_name IN ('dev','production')` with `HAVING COUNT(*) = 2`. That misses 'Dev' rows and lets two dev deploys falsely qualify. Pivot to LOWER() in the filter and COUNT DISTINCT inside HAVING.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- Extend this to dev, staging, AND production? _(Swap the IN list to three values and HAVING to = 3. Shape generalizes cleanly.)_
- What if env_name has trailing whitespace from a CSV ingest? _(Wrap in TRIM(LOWER(env_name)). Better: clean at ingest so downstream queries stop paying the cost.)_
- Find authors who deployed to dev but NOT production? _(Conditional aggregation: HAVING SUM(CASE WHEN LOWER(env_name)='dev' THEN 1 ELSE 0 END) > 0 AND SUM(CASE WHEN LOWER(env_name)='production' THEN 1 ELSE 0 END) = 0.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/authors_deploying_to_dev_and_production)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.