# Slow Production Deploys

> Production deploys that took way too long.

Canonical URL: <https://datadriven.io/problems/slow_production_deploys>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

Find production deployments that took longer than 150 seconds. Show the service name, version, duration in seconds, and deployment timestamp, listed reverse-alphabetically by service name.

## Worked solution and explanation

### Why this problem exists in real interviews

Interviewers use this deployment scenario to test filtering and projection against the `deploy_logs` table. The focus is on how you handle columns like `svc_name`, `version`, and `env_name` when building the result.

---

### Break down the requirements

#### Step 1: Apply the range filter

The WHERE clause restricts rows to the target range. Applying this filter early reduces the volume flowing into downstream operations.

#### Step 2: Sort the final output

The `ORDER BY` clause ensures the result appears in the expected sequence. Interviewers check that the sort direction matches the prompt.

---

### The solution

**Apply the range filter to find slow production deploys**

```sql
SELECT svc_name, version, dur_secs, deploy_at
FROM deploy_logs
WHERE env_name = 'production' AND dur_secs > 150
ORDER BY dur_secs DESC
```

> **Cost Analysis**
>
> With ~600K rows, the query performs a single sequential scan. An index on the filter/join columns would reduce the scan to a seek.

> **Interviewers Watch For**
>
> Interviewers watch for whether the query returns exactly the columns and ordering the prompt specifies; how quickly you identify the core operation and write clean, minimal code.

> **Common Pitfall**
>
> Returning extra columns that the prompt did not ask for, or using the wrong column alias, causes a grading mismatch even when the logic is correct.

---

## Common follow-up questions

- The `dur_secs` column in `deploy_logs` has a 1% null rate. How does your query handle rows where `dur_secs` is NULL, and could that silently change the result count? _(Tests whether the candidate accounts for NULLs in `deploy_logs.dur_secs` and understands how aggregates skip NULL values.)_
- If `deploy_logs` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `deploy_at`? _(Tests ability to identify performance hotspots related to `deploy_logs.deploy_at` at scale.)_
- `deploy_logs.status` only has 4 distinct values. If a new category were added, would your query automatically include it? _(Tests whether the query hard-codes values or dynamically adapts to `status` changes.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/slow_production_deploys)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.