# Average Session Duration by Device

> Session length, device by device.

Canonical URL: <https://datadriven.io/problems/average_session_duration_by_device>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

Mobile users report shorter, more frustrating sessions than desktop users, and the product team wants data to back that up. Show the average session duration broken down by device type.

## Worked solution and explanation

### Why this problem exists in real interviews

This appears on interview slates because grouped aggregation on `user_sessions`, `devices` exposes whether a candidate thinks about edge cases in `session_start`, `session_duration_sec`, `pages_viewed` or just writes mechanical queries.

---

### Break down the requirements

#### Step 1: Join `user_sessions` with `devices`

Use `JOIN` on `device_id` to link fact rows to the dimension table. INNER JOIN keeps only matched rows.

#### Step 2: Group by `os_name`

`GROUP BY b.os_name` produces one row per distinct value of the grouping column.

#### Step 3: Aggregate with AVG

`AVG(session_duration_sec)` computes the metric at the grouped grain.

#### Step 4: Order by the metric

Sort descending to surface the top values first.

---

### The solution

**Group-aggregate for average session duration device**

```sql
SELECT
    b.os_name,
    AVG(session_duration_sec) AS avg_session_duration_sec
FROM user_sessions a
JOIN devices b ON a.device_id = b.device_id
GROUP BY b.os_name
ORDER BY avg_session_duration_sec DESC
```

> **Cost Analysis**
>
> The main table has 50M rows (19 GB). Partitioned on `session_start`, so queries filtering on that column skip most partitions. The GROUP BY reduces the row count early, keeping downstream operations cheap.

> **Interviewers Watch For**
>
> Strong candidates state the correct `GROUP BY` grain before writing any SQL, showing they think about the output shape first. Join type selection (INNER vs LEFT) reveals whether the candidate read the requirements carefully.

> **Common Pitfall**
>
> Selecting a non-aggregated column without including it in `GROUP BY` is the most common error. Some engines reject it; others silently return arbitrary values.

---

## Common follow-up questions

- The `browser` column in `devices` has roughly 35% NULLs. How does your query handle those rows, and would the result change if NULLs were replaced with zeros? _(Tests whether the candidate understands how NULLs propagate through aggregation functions and whether their WHERE/JOIN conditions implicitly filter them out.)_
- If `user_sessions` and `devices` have a one-to-many relationship, how does that affect the COUNT in your GROUP BY? _(Tests understanding of fan-out: joining before grouping can inflate counts.)_
- `session_id` in `user_sessions` has ~50M distinct values. What index strategy keeps your query from doing a full table scan? _(Tests whether the candidate can design indexes for high-cardinality columns and understands selectivity.)_
- If the business definition of `device_id` changed mid-quarter (e.g., a status value was renamed), how would you handle historical consistency? _(Tests awareness of slowly changing dimensions and backward-compatible query design.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/average_session_duration_by_device)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.