# Platform Check

> OS and device combos. Which sessions last longest?

Canonical URL: <https://datadriven.io/problems/platform_check>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The QA team is profiling performance regressions and wants to see which device and operating system combinations produce the longest user sessions. For each session, show the user ID, device type, operating system, and session duration. Only include sessions that lasted longer than one thousand seconds.

## Worked solution and explanation

### Why this problem exists in real interviews

This session analysis problem uses the `user_sessions` and `devices` tables to evaluate self-join. Watch how columns like `device_id`, `device_type`, and `os_name` interact in the grouping and filtering logic.

---

### Break down the requirements

#### Step 1: Join sessions to devices

`JOIN devices ON user_sessions.device_id = devices.device_id` brings in `device_type` and `os_name`.

#### Step 2: Filter to long sessions

`WHERE session_duration_sec > 1000` keeps only sessions exceeding 1,000 seconds.

#### Step 3: Return the requested columns

Select `user_id`, `device_type`, `os_name`, and `session_duration_sec`.

---

### The solution

**Join sessions to devices to find platform check**

```sql
SELECT s.user_id, d.device_type, d.os_name, s.session_duration_sec
FROM user_sessions s
JOIN devices d ON s.device_id = d.device_id
WHERE s.session_duration_sec > 1000
```

> **Cost Analysis**
>
> With `user_sessions` (20,000,000 rows), `devices` (2,000,000 rows), the full scan reads significant data. A composite index on the filter columns pushes the predicate into the index layer. Pre-aggregation in a materialized view is worth considering at this scale.

> **Interviewers Watch For**
>
> Interviewers evaluate whether you translate the English requirements into the correct SQL clauses on the first attempt. They watch for clean syntax, correct column references, and whether you verify edge cases before declaring the query complete.

> **Common Pitfall**
>
> The most common mistake is misreading the prompt's filtering or grouping requirements. Double-check which columns to group by, which to aggregate, and whether the output should be filtered with `WHERE` (before grouping) or `HAVING` (after grouping).

---

## Common follow-up questions

- The `os_version` column in `devices` has a 10% null rate. How does your query handle rows where `os_version` is NULL, and could that silently change the result count? _(Tests whether the candidate accounts for NULLs in `devices.os_version` and understands how aggregates skip NULL values.)_
- If `user_sessions` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `session_start`? _(Tests ability to identify performance hotspots related to `user_sessions.session_start` at scale.)_
- If this query ran as a scheduled job, how would you add monitoring to detect when the result set is suspiciously empty? _(Tests operational awareness around scheduled query jobs.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/platform_check)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.