# Transaction Timeline

> First purchase to last. The full spending arc.

Canonical URL: <https://datadriven.io/problems/transaction_timeline>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

Finance wants a one-row-per-customer purchase snapshot for the lifetime-value model. For each user with at least one transaction, find their earliest and latest transaction_date and their total spending. Return the username and those three aggregates.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests joining two tables with aggregation to produce a per-entity summary. Interviewers check whether you correctly use `MIN`, `MAX`, and `SUM` together and handle the join with the right cardinality.

---

### Break down the requirements

#### Step 1: Join users to transactions

`JOIN transactions ON users.user_id = transactions.user_id` matches users with their purchase history. The INNER JOIN inherently filters to users with at least one transaction.

#### Step 2: Aggregate per user

`GROUP BY u.user_id, u.username` with `MIN(transaction_date)`, `MAX(transaction_date)`, and `SUM(CAST(total_amount AS DOUBLE))` produces the earliest date, latest date, and total spend.

---

### The solution

**Join and aggregate for lifetime value summary**

```sql
SELECT
    u.username,
    MIN(t.transaction_date) AS first_transaction,
    MAX(t.transaction_date) AS last_transaction,
    SUM(CAST(t.total_amount AS DOUBLE)) AS total_spend
FROM users u
JOIN transactions t ON u.user_id = t.user_id
GROUP BY u.user_id, u.username
```

> **Cost Analysis**
>
> The join produces 80M rows (all transactions matched to users). The GROUP BY reduces to ~4M users. Casting `total_amount` from TEXT to DOUBLE on 80M rows adds overhead; a numeric column type would be more efficient.

> **Interviewers Watch For**
>
> Whether you cast TEXT columns to numeric types for the SUM. With TEXT-typed `total_amount`, string concatenation or errors could occur without explicit casting.

> **Common Pitfall**
>
> Using LEFT JOIN when the prompt says "for each user with at least one transaction." LEFT JOIN would include users with no transactions, producing NULLs for the aggregates.

---

## Common follow-up questions

- How would you compute the number of days between first and last transaction? _(Tests date arithmetic: JULIANDAY(MAX(...)) - JULIANDAY(MIN(...)).)_
- What if you needed to include users with zero transactions? _(Switch to LEFT JOIN and use COALESCE for the aggregates.)_
- How would you add average spend per transaction? _(Add AVG(CAST(total_amount AS DOUBLE)) or divide SUM by COUNT.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/transaction_timeline)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.