# Single Service Owners

> One owner, one service. Nobody else.

Canonical URL: <https://datadriven.io/problems/single_service_owners>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

An infrastructure audit needs to identify owners responsible for exactly one service. Show the service name and how many services that owner manages (which will be 1), listed alphabetically by service name. Limit to the top 10.

## Worked solution and explanation

### Why this problem exists in real interviews

This challenge asks you to apply HAVING for post-aggregation filtering to the `svc_health` table, simulating a real analytics workflow. Pay attention to the `svc_name` column as they drive the aggregation and output.

---

### Break down the requirements

#### Step 1: Aggregate by `svc_name`

`GROUP BY svc_name` collapses rows to one per group. The aggregate functions (`SUM`, `COUNT`, `AVG`, etc.) compute the metric for each group.

#### Step 2: Filter groups with HAVING

HAVING applies after GROUP BY, filtering out groups that do not meet the threshold. This cannot be done in WHERE because the aggregate has not been computed yet.

#### Step 3: Deduplicate the result with DISTINCT

`SELECT DISTINCT` removes duplicate rows from the output. This is necessary when joins or subqueries can produce repeated combinations.

#### Step 4: Order and limit the output

`ORDER BY` with `LIMIT` returns only the top result. The sort must be deterministic; add a tiebreaker column if needed.

---

### The solution

**Having filter for single service owners**

```sql
SELECT svc_name, CAST(COUNT(DISTINCT svc_name) AS DOUBLE) AS service_count
FROM svc_health
GROUP BY svc_name
HAVING COUNT(DISTINCT svc_name) = 1
ORDER BY svc_name
LIMIT 10
```

> **Cost Analysis**
>
> With ~25M rows, the GROUP BY reduces the working set before any downstream operations. An index on the filter/join columns would reduce the scan to a seek.

> **Interviewers Watch For**
>
> Interviewers watch for whether you use HAVING (not WHERE) to filter after aggregation; whether you know when DISTINCT is needed and when it masks a logic error.

> **Common Pitfall**
>
> Putting the aggregate condition in WHERE instead of HAVING causes a syntax error. WHERE runs before GROUP BY; HAVING runs after.

---

## Common follow-up questions

- If `latency` in `svc_health` is NULL for some rows, how would your aggregation or join logic be affected? _(Probes understanding of NULL propagation through joins and aggregate functions on `svc_health.latency`.)_
- With 350,000 distinct values in `svc_health.latency`, how would a composite index on the GROUP BY columns change the execution plan? _(Probes understanding of how cardinality in `latency` affects grouping and sort operations.)_
- If the HAVING threshold in your query changed from a fixed number to a percentile, how would you restructure the query? _(Tests ability to replace static HAVING filters with dynamic subquery-based thresholds.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/single_service_owners)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.