# The Full Picture

> Two tables know different things about the same people. Combine them.

Canonical URL: <https://datadriven.io/problems/joined_employee_details>

Domain: SQL · Difficulty: easy · Seniority: L5

## Problem

The people analytics team needs every employee's name alongside their total metric value from employee_metrics, matched on department. Employees whose department has no metrics recorded yet should still appear with a zero, not be dropped from the results.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests whether you can aggregate to the correct grain before filtering or ordering, and join semantics and their effect on result cardinality. Though the query is straightforward, interviewers use it at L5+ as a quick filter: candidates who stumble on the basics rarely recover in harder rounds.

> **Trick to Solving**
>
> NULL propagation silently corrupts aggregates and joins. The trick is explicit handling.
> 
> 1. Identify which columns can be NULL (check the schema)
> 2. Use `COALESCE(col, 0)` for numeric defaults or `LEFT JOIN` to preserve unmatched rows
> 3. Verify that your WHERE clause does not accidentally filter out NULLs

---

### Break down the requirements

#### Step 1: Join tables with LEFT JOIN

Connect `employees` and `employee_metrics` on `department` to preserve all rows from the primary table.

#### Step 2: Aggregate with SUM

Group by the output grain and apply `SUM()` to compute the metric. The `GROUP BY` must match exactly what the output needs: one row per group key.

#### Step 3: Handle NULLs with COALESCE

`COALESCE(expr, 0)` replaces NULL with a default value. This prevents NULL propagation from silently corrupting sums, counts, or display values.

---

### The solution

**LEFT JOIN with COALESCE zero-fill**

```sql
SELECT e.emp_name, COALESCE(SUM(m.metric_value), 0) AS total_metric_value
FROM employees e
LEFT JOIN employee_metrics m ON e.department = m.department
GROUP BY e.employee_id, e.emp_name
```

> **Cost Analysis**
>
> The join touches `employees` (80K rows) and `employee_metrics` (10K rows). CTEs in most engines are optimization fences. For production workloads, consider inlining or materializing the intermediate results.

> **Interviewers Watch For**
>
> Interviewers expect you to articulate why you chose a specific join type and what happens to unmatched rows. Explicitly mentioning NULL handling before being asked signals production awareness.

> **Common Pitfall**
>
> Forgetting that a JOIN can multiply rows when the relationship is one-to-many. Always check whether the join key is unique on at least one side.

---

## Common follow-up questions

- What happens if the input table is empty? _(Tests whether the candidate considers the zero-row edge case.)_
- What if a row in the left table has no match in the right table? _(Tests understanding of LEFT vs INNER join behavior.)_
- What does your query return if every row has the same group key? _(Tests understanding of aggregation with a single group.)_
- How would you add a secondary sort to make the output deterministic? _(Tests awareness of non-deterministic ordering when values are tied.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/joined_employee_details)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.