# The Month-by-Month Snapshot

> Every salesperson has a story. The months just tell it sideways.

Canonical URL: <https://datadriven.io/problems/the_month_by_month_snapshot>

Domain: Python · Difficulty: medium · Seniority: L4

## Problem

Given a list of sales records (each a dict with 'employee_id', 'month', 'sales_amount'), return a list of per-employee dicts. Each dict has 'employee_id' plus keys named after each month (as seen in the input) holding the total sales for that month for that employee. Employees are sorted alphabetically by employee_id.

## Worked solution and explanation

### Why this problem exists in real interviews

This tests **data pivoting**, a core skill in analytics and reporting. Interviewers check whether candidates can reshape row-oriented data into a column-oriented structure using dictionary accumulation.

---

### Break down the requirements

#### Step 1: Group records by employee

Use a dictionary keyed by employee identifier to accumulate their data.

#### Step 2: Create month-based keys for each employee

For each record, add the sales amount to a key named after the month within the employee's entry.

#### Step 3: Handle multiple records for the same employee and month

Sum the amounts when an employee has multiple sales in the same month.

---

### The solution

**Dict-of-dicts pivot with month keys**

```python
def pivot_sales(records):
    employees = {}
    for record in records:
        emp = record['employee']
        month = record['month']
        amount = record['amount']
        if emp not in employees:
            employees[emp] = {'employee': emp}
        if month in employees[emp]:
            employees[emp][month] += amount
        else:
            employees[emp][month] = amount
    result = []
    for emp in employees:
        result.append(employees[emp])
    return result
```

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the number of records. Each record is processed once.
> 
> **Space:** O(e * m) where e is the number of employees and m is the number of distinct months.

> **Interviewers Watch For**
>
> Do you handle the accumulation correctly when the same employee-month pair appears multiple times? Using `+=` instead of `=` is the critical detail.

> **Common Pitfall**
>
> Overwriting instead of summing when a month key already exists. This silently drops earlier records for the same employee and month.

---

## Common follow-up questions

- What if months should appear in calendar order? _(Tests ordering dict keys by a custom sort based on month names.)_
- How would you add a total column summing all months per employee? _(Tests a second pass to aggregate across the pivoted keys.)_
- What if there were thousands of employees and you needed this to be memory-efficient? _(Tests streaming or chunked processing strategies.)_
- How would you handle this in SQL instead? _(Tests PIVOT or conditional aggregation with CASE WHEN.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_month_by_month_snapshot)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.