# Rolling Weekly Total

> Seven days at a time, the totals keep rolling forward.

Canonical URL: <https://datadriven.io/problems/rolling_weekly_total>

Domain: SQL · Difficulty: medium · Seniority: L5

## Problem

A fraud detection model monitors spending velocity. Return every transaction along with a rolling 7-day spending total for the same user, ordered by transaction date. Include user_id, transaction_date, total_amount, and the rolling sum.

## Worked solution and explanation

### Why this problem exists in real interviews

This challenge asks you to write a custom window frame over the `transactions` table, the kind of spending-velocity check a fraud model relies on. The columns `user_id`, `transaction_date`, and `total_amount` drive the partition, the ordering, and the sum, respectively.

> **Trick to Solving**
>
> Rolling or sliding window problems require an explicit frame clause. The default frame is rarely what you want.
> 
> 1. Identify the window size from the prompt (e.g., '7-day rolling')
> 2. Use `ROWS BETWEEN N-1 PRECEDING AND CURRENT ROW` for a window of N rows, since the current row counts as one
> 3. Partition by the grouping key, order by the time column

---

### Break down the requirements

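#### Step 1: Compute the rolling aggregate

The window function computes an aggregate across an ordered set of rows without collapsing them. `PARTITION BY user_id` restarts the calculation for each user, `ORDER BY transaction_date` fixes the row order, and the frame limits the sum to the current row plus the six before it. Each row keeps its detail and gains the rolling total.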
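
To make "without collapsing" concrete, here is a minimal sketch that contrasts `GROUP BY` with a window aggregate. The `demo_transactions` table and its three rows are hypothetical sample data, not the problem's actual dataset.

```sql
-- Hypothetical sample data; the real transactions table may have more columns.
CREATE TEMP TABLE demo_transactions (
    user_id          integer,
    transaction_date date,
    total_amount     numeric(10, 2)
);

INSERT INTO demo_transactions VALUES
    (1, '2024-01-01', 10.00),
    (1, '2024-01-02', 20.00),
    (2, '2024-01-01',  5.00);

-- GROUP BY collapses each user to a single row (two rows here).
SELECT user_id, SUM(total_amount) AS user_total
FROM demo_transactions
GROUP BY user_id
ORDER BY user_id;

-- The window version keeps all three rows and puts the per-user sum beside each.
SELECT user_id, transaction_date, total_amount,
       SUM(total_amount) OVER (PARTITION BY user_id) AS user_total
FROM demo_transactions
ORDER BY user_id, transaction_date;
```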

#### Step 2: Sort the final output

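The outer `ORDER BY user_id, transaction_date` fixes the order of the result set. It is separate from the `ORDER BY` inside `OVER (...)`, which only controls how the window is evaluated; without the outer clause the row order of the output is not guaranteed. Interviewers check that the sort direction matches the prompt.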

---

### The solution

**Sliding window for the rolling weekly total**

```sql
-- Per user, sum the current row and the six rows before it.
SELECT user_id, transaction_date, total_amount,
       SUM(total_amount) OVER (PARTITION BY user_id ORDER BY transaction_date
                               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_sum
FROM transactions
ORDER BY user_id, transaction_date;
```
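
Note that a `ROWS` frame counts rows, not calendar days, so it matches a 7-day window only when each user has exactly one transaction per day with no gaps. The sketch below compares it with a calendar-based `RANGE` frame. The `demo_gaps` table and its rows are hypothetical, and the sketch assumes `transaction_date` is of type `date`; which variant the grader expects is not stated in the problem.

```sql
-- Hypothetical sample data with gaps between transaction dates.
CREATE TEMP TABLE demo_gaps (
    user_id          integer,
    transaction_date date,
    total_amount     numeric(10, 2)
);

INSERT INTO demo_gaps VALUES
    (1, '2024-01-01', 100.00),
    (1, '2024-01-05',  50.00),
    (1, '2024-01-20',  25.00);

SELECT user_id, transaction_date, total_amount,
       -- Row-based frame: the last seven rows, however far apart they are.
       SUM(total_amount) OVER (PARTITION BY user_id ORDER BY transaction_date
                               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS last_7_rows,
       -- Calendar-based frame: rows dated within six days before the current row's date.
       SUM(total_amount) OVER (PARTITION BY user_id ORDER BY transaction_date
                               RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW) AS last_7_days
FROM demo_gaps
ORDER BY user_id, transaction_date;
```

For the 2024-01-20 row, `last_7_rows` is 175.00 because all three rows fit in the frame, while `last_7_days` is 25.00 because the earlier transactions are more than six days old. A `RANGE` frame also includes every row that shares the current row's date, so same-day transactions all receive the same total.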

> **Cost Analysis**
>
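> With ~120M rows and no filter or join in the query, every row is read. The main cost is sorting by `(user_id, transaction_date)` for the window function. A composite index on `(user_id, transaction_date)` may let the planner read rows already in order and skip that sort.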

> **Interviewers Watch For**
>
> Whether you define the window frame explicitly or rely on defaults that may not match the requirement.

> **Common Pitfall**
>
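> Omitting the explicit frame clause (`ROWS BETWEEN ...`) falls back to the default. With an `ORDER BY` in the window, that default is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, which produces a running total from the user's first transaction instead of a 7-row window.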
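
The effect of the default frame is easy to see on a small example. This sketch uses a hypothetical `demo_default_frame` table holding one user with eight consecutive daily transactions of 1.00 each.

```sql
-- Hypothetical sample data: one user, eight consecutive days, 1.00 per day.
CREATE TEMP TABLE demo_default_frame AS
SELECT 1 AS user_id,
       DATE '2024-01-01' + g.n AS transaction_date,
       1.00::numeric(10, 2) AS total_amount
FROM generate_series(0, 7) AS g(n);

SELECT transaction_date,
       -- No frame clause: running total from the first row of the partition.
       SUM(total_amount) OVER (PARTITION BY user_id
                               ORDER BY transaction_date) AS default_frame,
       -- Explicit frame: the current row plus the six before it.
       SUM(total_amount) OVER (PARTITION BY user_id ORDER BY transaction_date
                               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS explicit_frame
FROM demo_default_frame
ORDER BY transaction_date;
```

On the eighth day, `default_frame` keeps growing to 8.00 while `explicit_frame` stays at 7.00.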

---

## Common follow-up questions

- What would happen to your result if `transactions.transaction_id` contained duplicate values that you did not expect? _(Tests whether the candidate considers data quality issues in `transaction_id` and uses DISTINCT or deduplication where needed.)_
- If `transactions` grew to contain billions of rows, which part of your query would become the bottleneck given the cardinality of `transaction_id`? _(Tests ability to identify performance hotspots related to `transactions.transaction_id` at scale.)_
- How would you modify this query if the business logic required grouping by both `transaction_id` and `user_id` instead of just one? _(Tests ability to adapt the query structure to changing requirements.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/rolling_weekly_total)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.