# Clean Cache CDN Edges

> Cached, clean, error-free edges.

Canonical URL: <https://datadriven.io/problems/clean_cache_cdn_edges>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

Which CDN edge locations are serving cached content cleanly? Find edges that had a cache hit with a successful HTTP status (below 400). Return only unique edge locations.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests multi-condition filtering with DISTINCT. It verifies that you can combine boolean and numeric conditions and deduplicate results correctly.

---

### Break down the requirements

#### Step 1: Filter for cache hits with success status

`WHERE cache_hit = 1 AND status < 400` keeps only rows where both conditions hold for the same request: the response was served from cache, and its status code is below 400.

#### Step 2: Deduplicate edge locations

`SELECT DISTINCT edge_loc` returns only unique edge locations.

---

### The solution

**Multi-condition filter with deduplication**

```sql
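-- One row per edge location that served at least one clean cache hit
SELECT DISTINCT edge_loc
FROM cdn_logs
WHERE cache_hit = 1      -- served from cache
  AND status < 400;      -- no client (4xx) or server (5xx) error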
```
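
To try the query without the real table, here is a minimal sketch that runs the same filter against a handful of made-up rows. The `sample_logs` CTE and every value in it are illustrative assumptions; only the column names `edge_loc`, `cache_hit`, and `status` come from the solution above.

```sql
-- Hypothetical sample rows; the real cdn_logs data is not shown in the problem.
WITH sample_logs (edge_loc, cache_hit, status) AS (
    VALUES
        ('fra-1', 1, 200),  -- clean hit: kept
        ('fra-1', 1, 304),  -- clean hit on the same edge: collapsed by DISTINCT
        ('iad-2', 1, 503),  -- cache hit but server error: filtered out
        ('sin-3', 0, 200),  -- success but cache miss: filtered out
        ('nrt-4', 1, 201)   -- clean hit: kept
)
SELECT DISTINCT edge_loc
FROM sample_logs
WHERE cache_hit = 1
  AND status < 400
ORDER BY edge_loc;
```

With these rows the query returns `fra-1` and `nrt-4`. The `ORDER BY` is only there to make the output order predictable; the problem itself does not ask for one.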

> **Cost Analysis**
>
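> Without a suitable index, this is a full scan of 300M rows that evaluates both filter conditions on each row. `DISTINCT` then collapses the output to the number of unique edge locations (typically dozens), so deduplication is cheap relative to the read. An index on `(cache_hit, status)` helps only if the filter is selective; when most requests are successful cache hits, the engine still reads most of the table and the scan dominates.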
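
One way to act on the indexing idea, sketched under the assumption that the table lives in PostgreSQL and that the extra write overhead is acceptable, is a partial index keyed by `edge_loc` that stores only rows matching the filter. The index name is hypothetical, and whether the planner actually uses the index depends on statistics and table layout, so it is worth checking with `EXPLAIN`.

```sql
-- Hypothetical index name; partial indexes (CREATE INDEX ... WHERE) are PostgreSQL syntax.
CREATE INDEX idx_cdn_logs_clean_hits
    ON cdn_logs (edge_loc)
    WHERE cache_hit = 1 AND status < 400;

-- Inspect the plan the database chooses for the solution query.
EXPLAIN
SELECT DISTINCT edge_loc
FROM cdn_logs
WHERE cache_hit = 1
  AND status < 400;
```

If most rows are clean hits, this index is nearly as large as the table itself, so the gain may be small.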

> **Common Pitfall**
>
> Using `status = 200` instead of `status < 400` would miss other qualifying status codes such as 201 or 304. The prompt defines success as "below 400", which covers the entire 2xx and 3xx ranges (and any 1xx codes that appear in the logs).
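
The difference is easy to see on a few made-up rows. This sketch counts how many distinct edges each version of the filter would return; the sample values are assumptions for illustration, and the `FILTER` clause is standard SQL that PostgreSQL supports (other engines may need `CASE` expressions instead).

```sql
-- Hypothetical sample rows, one request per edge.
WITH sample_logs (edge_loc, cache_hit, status) AS (
    VALUES
        ('fra-1', 1, 200),  -- matched by both filters
        ('nrt-4', 1, 201),  -- missed by status = 200
        ('lhr-5', 1, 304),  -- missed by status = 200
        ('iad-2', 1, 503)   -- excluded by both filters
)
SELECT
    COUNT(DISTINCT edge_loc) FILTER (WHERE cache_hit = 1 AND status = 200) AS edges_equals_200,
    COUNT(DISTINCT edge_loc) FILTER (WHERE cache_hit = 1 AND status < 400) AS edges_below_400
FROM sample_logs;
```

On this data, `edges_equals_200` is 1 and `edges_below_400` is 3: the stricter filter silently drops two edges that the prompt counts as clean.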

---

## Common follow-up questions

- What if `cache_hit` is a boolean instead of 0/1? _(Tests type-aware filtering: `WHERE cache_hit = true` vs `cache_hit = 1`.)_
- How would you also count the requests per edge location? _(Replace `DISTINCT` with `GROUP BY` and add `COUNT(*)`.)_
- What status codes fall in the 3xx range and should they count as successful? _(3xx are redirects; whether they count as 'successful' depends on business context.)_
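
The first two follow-ups can be sketched together in one query. It assumes a variation of the schema in which `cache_hit` is a `BOOLEAN` column, which is not the original 0/1 setup, and the alias `clean_hit_requests` is made up for illustration.

```sql
-- Assumes cache_hit is BOOLEAN (a variation on the original 0/1 column).
SELECT edge_loc,
       COUNT(*) AS clean_hit_requests   -- requests per edge that pass the filter
FROM cdn_logs
WHERE cache_hit                         -- equivalent to cache_hit = TRUE
  AND status < 400
GROUP BY edge_loc                       -- replaces DISTINCT: one row per edge
ORDER BY clean_hit_requests DESC;
```

In PostgreSQL, comparing a boolean column to an integer (`cache_hit = 1`) raises a type error rather than being coerced, which is why the filter has to change with the column type. Other engines may be more lenient.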

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/clean_cache_cdn_edges)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.