# Mobile Event Counts

> Mobile engagement, device by device.

Canonical URL: <https://datadriven.io/problems/mobile_event_counts>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

For events tagged as 'mobile', count the number of events per event type, from most frequent to least.

## Worked solution and explanation

### Why this problem exists in real interviews

The central task is to filter `event_data` to mobile-tagged rows with a pattern match, then count events per `event_type`. Interviewers use it as a fundamentals check: do you pick the right aggregate function and grouping key on the first attempt?

---

### Break down the requirements

#### Step 1: Filter to the target rows

Apply the `LIKE` pattern match on `tags` in the `WHERE` clause (`tags LIKE '%mobile%'`). `WHERE` is evaluated before `GROUP BY`, so only mobile-tagged rows reach the aggregation.
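
To see what the filter keeps and drops, here is a minimal sketch that runs on its own in Postgres. The sample rows, the `event_id` column, and the comma-separated `tags` format are assumptions made for illustration; the real `event_data` schema may differ.

```sql
-- Hypothetical sample rows standing in for event_data.
WITH event_data (event_id, event_type, tags) AS (
    VALUES
        (1, 'click',  'mobile,ios'),
        (2, 'click',  'desktop'),
        (3, 'scroll', 'android,mobile'),
        (4, 'view',   'Mobile'),    -- dropped: LIKE is case-sensitive in Postgres
        (5, 'view',   'immobile'),  -- kept: '%mobile%' is a plain substring test
        (6, 'view',   NULL)         -- dropped: NULL LIKE '...' evaluates to NULL
)
SELECT event_id, event_type, tags
FROM event_data
WHERE tags LIKE '%mobile%'
ORDER BY event_id;
-- Returns event_id 1, 3, and 5.
```

Rows 4 and 5 show the two usual surprises with a substring pattern: it misses differently cased tags and it matches longer words that merely contain `mobile`. Whether either matters depends on how the tags are actually stored.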

#### Step 2: Aggregate with COUNT

Group by `event_type`, the output grain, and apply `COUNT(*)` to count the rows in each group. The `GROUP BY` must match the grain the output needs: one row per event type.
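
The choice of `COUNT(*)` over `COUNT(column)` matters when the column can be NULL. The sketch below uses a hypothetical, already-filtered `mobile_events` row set with an assumed nullable `payload` column to show the difference.

```sql
-- Hypothetical rows that have already passed the mobile filter.
WITH mobile_events (event_id, event_type, payload) AS (
    VALUES
        (1, 'click',  '{"x": 1}'),
        (2, 'click',  NULL),
        (3, 'scroll', '{"y": 2}')
)
SELECT
    event_type,
    COUNT(*)       AS event_count,   -- counts every row in the group
    COUNT(payload) AS payload_count  -- skips rows where payload IS NULL
FROM mobile_events
GROUP BY event_type
ORDER BY event_type;
-- click:  event_count = 2, payload_count = 1
-- scroll: event_count = 1, payload_count = 1
```

Because the prompt asks for the number of events, `COUNT(*)` is the safer choice here: it does not depend on any column being populated.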

#### Step 3: Order the final output

Apply `ORDER BY event_count DESC` so the most frequent event type comes first. The prompt does not say how to order ties; if you need a deterministic result, add a secondary sort column such as `event_type`.
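
As an illustration of the tie-breaking point, the following sketch runs the full filter, group, and sort on made-up rows where two event types share the same count. The secondary sort on `event_type` is an optional addition, not something the prompt requires.

```sql
-- Hypothetical sample rows: 'view' and 'click' tie at two events each.
WITH event_data (event_type, tags) AS (
    VALUES
        ('view',   'mobile'),
        ('view',   'mobile'),
        ('click',  'mobile'),
        ('click',  'mobile'),
        ('scroll', 'mobile')
)
SELECT event_type, COUNT(*) AS event_count
FROM event_data
WHERE tags LIKE '%mobile%'
GROUP BY event_type
ORDER BY event_count DESC,
         event_type ASC;  -- tie-breaker: alphabetical within equal counts
-- Returns: click 2, view 2, scroll 1
```

Without the second sort key, the database may return `click` and `view` in either order, and that order can change between runs.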

---

### The solution

**Tag-based filter with group count**

```sql
SELECT event_type, COUNT(*) AS event_count
FROM event_data
WHERE tags LIKE '%mobile%'
GROUP BY event_type
ORDER BY event_count DESC
```

> **Cost Analysis**
>
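> The query scans 150M rows from `event_data`. Because the pattern `'%mobile%'` starts with a wildcard, a plain B-tree index on `tags` cannot narrow that scan, so without a specialized index the filter is evaluated against every row. The aggregation then collapses the matches to one row per event type, so very little data flows to any downstream step.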

> **Interviewers Watch For**
>
> Naming the output grain ("one row per X") before writing the GROUP BY shows you think about data shape, not just syntax.

> **Common Pitfall**
>
> Returning more columns than the prompt asks for can trigger a "wrong schema" failure in automated grading. Match the output specification exactly.

---

## Common follow-up questions

- What happens to your result if event_data.payload contains NULLs for some rows? _(Tests whether the candidate accounts for NULL behavior in aggregates and comparisons on payload.)_
- How would you verify that your aggregation on event_data.event_id is not double-counting due to duplicate rows? _(Tests data quality awareness and deduplication strategies.)_
- With millions of distinct values in event_data.event_id, what index strategy would you use to keep this query performant? _(Tests indexing knowledge specific to high-cardinality columns like event_id.)_
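
For the double-counting question, one possible check is to look for `event_id` values that appear more than once. The sketch below assumes `event_id` is meant to identify an event uniquely, which the problem statement does not confirm, and uses made-up rows.

```sql
-- Hypothetical sample rows; event_id 2 has been loaded twice.
WITH event_data (event_id, event_type, tags) AS (
    VALUES
        (1, 'click',  'mobile'),
        (2, 'click',  'mobile'),
        (2, 'click',  'mobile'),
        (3, 'scroll', 'mobile')
)
SELECT event_id, COUNT(*) AS copies
FROM event_data
GROUP BY event_id
HAVING COUNT(*) > 1   -- keep only ids that occur more than once
ORDER BY copies DESC, event_id;
-- Returns: event_id = 2, copies = 2
```

If this check returns rows on the real table, counting with `COUNT(DISTINCT event_id)` instead of `COUNT(*)` is one way to avoid inflating the per-type totals, at the cost of extra work per group.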

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/mobile_event_counts)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.