# Total Hours Between Consecutive Events

> Hours between state changes.

Canonical URL: <https://datadriven.io/problems/total_hours_between_consecutive_events>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

Our pipeline tracks user events with timestamps. For each event type, calculate the total hours elapsed between consecutive events of the same type.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests `LAG` window functions combined with timestamp arithmetic. Interviewers probe whether you can compute inter-event intervals and aggregate them per group, which is a core pattern in event stream analysis.

> **Trick to Solving**
>
> "Consecutive events of the same type" signals partitioned `LAG`. The trick is to compute pairwise differences using `LAG`, then sum those differences per event type.
> 
> 1. Use `LAG(event_timestamp) OVER (PARTITION BY event_type ORDER BY event_timestamp)` to get the previous timestamp
> 2. Compute the difference in hours for each row
> 3. Sum the differences per event type

---

### Break down the requirements

#### Step 1: Get previous event timestamp per type

`LAG(event_timestamp) OVER (PARTITION BY event_type ORDER BY event_timestamp)` places the prior event's timestamp on each row.

#### Step 2: Compute hours between consecutive events

Use `(JULIANDAY(event_timestamp) - JULIANDAY(prev_ts)) * 24` to convert the day-based difference to hours.

#### Step 3: Aggregate total hours per event type

`GROUP BY event_type` with `SUM(hours_diff)` produces the total elapsed hours between all consecutive events of each type.

---

### The solution

**LAG for inter-event gaps then aggregate**

```sql
SELECT event_type, SUM(hours_diff) AS total_hours
FROM (
    SELECT
        event_type,
        (JULIANDAY(event_timestamp) - JULIANDAY(
            LAG(event_timestamp) OVER (
                PARTITION BY event_type
                ORDER BY event_timestamp
            )
        )) * 24 AS hours_diff
    FROM event_data
) gaps
WHERE hours_diff IS NOT NULL
GROUP BY event_type
```

> **Cost Analysis**
>
> The window function sorts 250M rows within 25 partitions (10M rows per partition on average). This is a heavy operation. Partitioning the table by `event_type` would parallelize the sort. The outer aggregation is cheap since it operates on the same row count.

> **Interviewers Watch For**
>
> Correctly excluding the first event per type (where LAG returns NULL). The `WHERE hours_diff IS NOT NULL` filter handles this. Candidates who forget this will get inflated or erroneous sums.

> **Common Pitfall**
>
> Using `LEAD` instead of `LAG` reverses the direction: you would compute time to the next event instead of time since the previous one. Both are valid depending on the definition, but the prompt says "between consecutive events," which is symmetric.

---

## Common follow-up questions

- How would you compute the average gap instead of the total? _(Replace SUM with AVG, or divide SUM by COUNT of non-null gaps.)_
- What if you needed gaps between consecutive events per user, not per type? _(Change PARTITION BY to user_id instead of event_type.)_
- How would you find the event type with the longest single gap? _(Tests MAX of the gap column with a GROUP BY, or a window-based approach.)_
- What if timestamps were not unique within an event type? _(Tie-breaking with event_id in the ORDER BY prevents non-deterministic LAG results.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/total_hours_between_consecutive_events)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.