# Transaction Revenue by Customer

> One month, every customer, every dollar accounted for.

Canonical URL: <https://datadriven.io/problems/transaction_revenue_by_customer>

Domain: SQL · Difficulty: medium · Seniority: L3

## Problem

Compute March 2026 transaction totals per customer. Include only users who made at least one transaction that month, and sort the results from highest spender to lowest.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests date-range filtering combined with aggregation and ordering. Interviewers check whether you can place the month boundaries correctly and recognize that the implicit "at least one transaction" requirement is already satisfied by grouping the filtered rows, with no join or `HAVING` clause needed.

---

### Break down the requirements

#### Step 1: Filter to March of the target year

Use `WHERE transaction_date >= '2026-03-01' AND transaction_date < '2026-04-01'` for an efficient range scan that captures all March dates.
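
As a minimal sketch, both boundaries can be derived from a single month-start value so the April boundary is never typed by hand. This assumes Postgres syntax; `params` and `month_start` are illustrative names, not part of the problem's schema.

```sql
-- Sketch: derive the half-open March window from one month-start value.
-- params and month_start are hypothetical names used only for illustration.
WITH params AS (
    SELECT DATE '2026-03-01' AS month_start
)
SELECT
    month_start                              AS range_start,  -- inclusive lower bound
    (month_start + INTERVAL '1 month')::date AS range_end     -- exclusive upper bound
FROM params;
-- range_start = 2026-03-01, range_end = 2026-04-01
```

The same expression handles months of any length, including February in leap years, because the upper bound is always the first day of the following month.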

#### Step 2: Aggregate per customer

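`GROUP BY user_id` with `SUM(total_amount)` computes the March total per user. Because the `WHERE` clause runs before grouping, a group exists only for users with at least one March row, so the "at least one transaction" requirement needs no extra condition.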
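
The effect is easy to see on a few made-up rows. The sketch below uses an inline `VALUES` list in place of the real table; the sample data is invented for illustration.

```sql
-- Sketch with invented sample rows standing in for the transactions table.
WITH transactions (user_id, transaction_date, total_amount) AS (
    VALUES
        (1, DATE '2026-03-05', 20.00),
        (1, DATE '2026-03-20', 30.00),
        (2, DATE '2026-02-14', 99.00),  -- February only: removed by WHERE before grouping
        (3, DATE '2026-03-31', 10.00)
)
SELECT user_id, SUM(total_amount) AS total_revenue
FROM transactions
WHERE transaction_date >= '2026-03-01'
  AND transaction_date < '2026-04-01'
GROUP BY user_id;
-- Produces rows for user 1 (50.00) and user 3 (10.00); user 2 has no group.
```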

#### Step 3: Sort highest to lowest

`ORDER BY total_revenue DESC` surfaces the biggest spenders first.

---

### The solution

**Date-range filter with per-customer aggregation**

```sql
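-- March 2026 revenue per customer, highest spender first.
SELECT user_id, SUM(total_amount) AS total_revenue
FROM transactions
WHERE transaction_date >= '2026-03-01'  -- inclusive start of March
  AND transaction_date < '2026-04-01'   -- exclusive start of April
GROUP BY user_id
ORDER BY total_revenue DESC;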
```

> **Cost Analysis**
>
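> The date filter narrows 100M rows to roughly 8.3M (about 1/12 of the table, assuming transactions are spread evenly across the year). An index on `(transaction_date, user_id, total_amount)` covers the query, so the range scan can be answered from the index alone.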
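
A sketch of that index and a way to check the plan follows. The table definition and the index name are assumptions made so the snippet runs on its own in a Postgres session; the real schema may differ.

```sql
-- Assumed schema, created as a temp table so the sketch is self-contained.
CREATE TEMP TABLE transactions (
    user_id          integer,
    transaction_date date,
    total_amount     numeric(12, 2)
);

-- Hypothetical index name. Column order puts the range-filtered column first.
CREATE INDEX idx_transactions_date_user_amount
    ON transactions (transaction_date, user_id, total_amount);

-- Inspect the plan. Whether the planner chooses an index-only scan depends on
-- table size, statistics, and the visibility map, so this is a check, not a guarantee.
EXPLAIN
SELECT user_id, SUM(total_amount) AS total_revenue
FROM transactions
WHERE transaction_date >= '2026-03-01'
  AND transaction_date < '2026-04-01'
GROUP BY user_id
ORDER BY total_revenue DESC;
```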

> **Interviewers Watch For**
>
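> Using half-open intervals (`>=` and `<`) for date ranges instead of `BETWEEN`. Because `BETWEEN` includes its end boundary, `BETWEEN '2026-03-01' AND '2026-03-31'` on a timestamp column drops everything after midnight on March 31, while extending the end to `'2026-04-01'` pulls in rows stamped exactly at midnight on April 1.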
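
The difference shows up on three boundary rows. In this sketch the sample data is invented, and `transaction_date` is assumed to hold timestamps rather than plain dates.

```sql
-- Sketch with invented rows; transaction_date is a TIMESTAMP here.
WITH transactions (user_id, transaction_date, total_amount) AS (
    VALUES
        (1, TIMESTAMP '2026-03-31 00:00:00', 10.00),
        (1, TIMESTAMP '2026-03-31 18:30:00', 20.00),  -- late on the last day of March
        (2, TIMESTAMP '2026-04-01 00:00:00', 40.00)   -- first instant of April
)
SELECT
    SUM(total_amount) FILTER (
        WHERE transaction_date >= '2026-03-01' AND transaction_date < '2026-04-01'
    ) AS half_open_total,       -- 30.00: both March 31 rows, no April row
    SUM(total_amount) FILTER (
        WHERE transaction_date BETWEEN '2026-03-01' AND '2026-03-31'
    ) AS between_march_31,      -- 10.00: misses the 18:30 row
    SUM(total_amount) FILTER (
        WHERE transaction_date BETWEEN '2026-03-01' AND '2026-04-01'
    ) AS between_april_1        -- 70.00: includes the April 1 midnight row
FROM transactions;
```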

> **Common Pitfall**
>
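> Using `LIKE '2026-03%'` for date filtering. It works only when dates are stored as TEXT in `YYYY-MM-DD` form; on a `DATE` or `TIMESTAMP` column it needs a cast to text, which keeps a plain index on the column from serving the predicate as a range scan. Either way, it breaks if the date format changes.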
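
For illustration, here is roughly what the pattern-matching version looks like against a `DATE` column in Postgres, with invented sample rows. The match relies on the session's default ISO `DateStyle`; under a different setting the text form of the date changes and the pattern stops matching.

```sql
-- Sketch with invented rows; shown to illustrate the pitfall, not as a recommendation.
WITH transactions (user_id, transaction_date, total_amount) AS (
    VALUES
        (1, DATE '2026-03-05', 20.00),
        (2, DATE '2026-04-02', 15.00)
)
SELECT user_id, total_amount
FROM transactions
-- The cast wraps the column in an expression, so a plain index on
-- transaction_date cannot serve this predicate as a range scan.
WHERE transaction_date::text LIKE '2026-03%';
-- Index-friendly equivalent:
--   WHERE transaction_date >= '2026-03-01' AND transaction_date < '2026-04-01'
```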

---

## Common follow-up questions

- How would you modify this to show revenue by month for the entire year? _(Tests adding date truncation to the GROUP BY for a time-series rollup.)_
- What if you needed to include customers with zero March transactions? _(Requires a LEFT JOIN from a users table.)_
- What if transaction_date included timestamps with time components? _(The half-open interval approach still works correctly for timestamps.)_
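
The first two follow-ups might be approached along these lines. This is a sketch under assumptions: the `users` table, its single `user_id` column, and the sample rows are invented for illustration, and the syntax is Postgres.

```sql
-- Assumed schema and invented sample rows, created as temp tables.
CREATE TEMP TABLE users (user_id integer PRIMARY KEY);
CREATE TEMP TABLE transactions (
    user_id          integer,
    transaction_date date,
    total_amount     numeric(12, 2)
);
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO transactions VALUES
    (1, '2026-02-10', 40.00),
    (1, '2026-03-05', 20.00),
    (3, '2026-03-31', 10.00);

-- Follow-up 1: revenue per customer per month across 2026.
SELECT user_id,
       date_trunc('month', transaction_date)::date AS month_start,
       SUM(total_amount) AS total_revenue
FROM transactions
WHERE transaction_date >= '2026-01-01'
  AND transaction_date < '2027-01-01'
GROUP BY user_id, month_start
ORDER BY month_start, total_revenue DESC;

-- Follow-up 2: keep users with no March transactions, reporting 0 for them.
SELECT u.user_id,
       COALESCE(SUM(t.total_amount), 0) AS total_revenue
FROM users AS u
LEFT JOIN transactions AS t
       ON t.user_id = u.user_id
      AND t.transaction_date >= '2026-03-01'  -- date filter lives in ON, not WHERE,
      AND t.transaction_date < '2026-04-01'   -- so users without a match survive
GROUP BY u.user_id
ORDER BY total_revenue DESC, u.user_id;
-- user 1: 20.00, user 3: 10.00, user 2: 0
```

Putting the date conditions in the `ON` clause matters: moving them to `WHERE` would discard the NULL-extended rows and turn the query back into the inner-join behavior of the original solution.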

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.