# Every Device, Every Impression

> Every ad seen. Every second watched. Real-time.

Canonical URL: <https://datadriven.io/problems/every_device_every_impression>

Domain: Pipeline Design · Difficulty: hard · Seniority: L6

## Problem

Our platform streams content to millions of devices, and our business depends on accurate ad impression data for billing and reliable viewing event data for content performance analytics. These two consumers have different latency requirements and different tolerances for approximation. Design the end-to-end pipeline for device telemetry ingestion and the data model that serves both.

## Worked solution and explanation

### Why this problem exists in real interviews

Two consumers (billing and analytics) read the same device telemetry with opposite correctness budgets. On top of that sit a partial-credit rule that needs paired start and complete events, and a tolerance for late arrivals. The trap is a single streaming aggregator that approximates for analytics and is also used for billing: billing needs exact start/complete pairs and reconciliation against the ad server, not running aggregates.

The default reach is one stream that counts events as they arrive and serves both billing and analytics. The first daily reconciliation against the ad server's record disagrees by enough to block payment. The partial-credit rule applies inconsistently because raw event counts don't pair starts and completes. Late events from offline devices hit the wrong day in both views; advertisers and content teams both end up disagreeing with what the team ships.

> **Trick to Solving**
>
> Run a billing path with paired-event state and exact dedup, run analytics on a slower batch path, credit late events to their event-time periods, and gate the billing publish on reconciliation against the ad server.
> 
> 1. A stateful streaming consumer pairs start and complete events per impression id; a complete within the threshold counts as full credit, an unmatched start past the window counts as partial.
> 2. Dedup on impression id at ingest before counting; retries collapse to one billable impression.
> 3. Reconciliation against the ad server's record runs daily; the billing publish gates on the diff being inside contractual tolerance.
> 4. Both billing and analytics use event-time partitioning; late events land in the period they belong to.

---

### Walk the requirements

#### Step 1: Billing reconciles within contractual tolerance against the ad server

Billing-grade impression counts come from a streaming consumer that pairs events and dedups by impression id, with the daily total reconciled against the ad server's record before the publish. If the diff is inside contractual tolerance, the publish proceeds; if not, the publish halts and the team investigates. Without a streaming path the billing-grade counts can't be ready by the daily window; without a durable archive the reconciliation has nothing to compare against.
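The gate described above can be sketched as a single comparison. This is a minimal illustration, not a definitive implementation: the function name, the result type, and the choice of a relative tolerance (for example 0.1%) are assumptions, since the contract terms are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconciliationResult:
    pipeline_count: int
    ad_server_count: int
    relative_diff: float
    publish_allowed: bool


def reconcile_daily_total(
    pipeline_count: int, ad_server_count: int, tolerance: float
) -> ReconciliationResult:
    """Compare the pipeline's daily billable total with the ad server's record.

    `tolerance` is a hypothetical relative bound (0.001 means 0.1%).
    """
    if ad_server_count == 0:
        # No ad-server impressions: only an empty pipeline total matches.
        relative_diff = 0.0 if pipeline_count == 0 else float("inf")
    else:
        relative_diff = abs(pipeline_count - ad_server_count) / ad_server_count
    # The publish proceeds only when the diff is inside tolerance.
    return ReconciliationResult(
        pipeline_count, ad_server_count, relative_diff, relative_diff <= tolerance
    )
```

A caller would halt the billing publish whenever `publish_allowed` is false and hand the result to whoever investigates.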

#### Step 2: Each impression billed once, on a stable id

Every impression carries a stable id from the ad server. The streaming consumer dedups on impression id; idempotent writes downstream mean retries from the same id collapse to one billable count. A retried event from a device that lost connectivity briefly doesn't double-bill the advertiser. Counting raw events is the version that inflates billing on every retry; dedup at the boundary is the contract.
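A minimal sketch of dedup at the boundary follows. It assumes each event is an `(impression_id, event_type)` pair, so that a start and a complete for the same impression both survive while retries of either collapse; the in-memory `seen` set stands in for whatever keyed state store a real streaming engine would provide. All names are hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

# Assumed event shape: (impression_id, event_type), e.g. ("imp-1", "start").
Event = Tuple[str, str]


def dedup_events(events: Iterable[Event], seen: Set[Event]) -> Iterator[Event]:
    """Yield each (impression_id, event_type) pair the first time it appears."""
    for event in events:
        if event in seen:
            continue  # a retry of an event that was already accepted
        seen.add(event)
        yield event


def billable_impressions(events: Iterable[Event]) -> int:
    """Count distinct impression ids, regardless of how many events each has."""
    return len({impression_id for impression_id, _ in events})
```

Because `seen` is passed in, the same state can be reused across batches, so a retry that arrives minutes later is still dropped.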

#### Step 3: Pair start and complete events; apply partial credit consistently

An impression starts and completes; a stateful streaming aggregator keyed on impression id holds the start and waits for the complete. A complete inside the qualifying window emits full credit; an unmatched start past the window emits partial credit. The rule applies in code, not in spreadsheets. Counting raw event totals is the version that bills full credit for impressions that didn't actually qualify; the paired state is what makes the partial-credit rule honest.
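The paired state can be sketched as a small keyed store. This is an illustration under stated assumptions: times are event-time seconds, the qualifying window is measured from the start event, input is already deduplicated, and `expire` is driven by an event-time watermark. The class and method names are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class Credit(Enum):
    FULL = "full"
    PARTIAL = "partial"


class ImpressionPairer:
    """Holds open starts keyed by impression id and emits one credit each."""

    def __init__(self, qualifying_window_s: float) -> None:
        self.window = qualifying_window_s
        self._open_starts: Dict[str, float] = {}  # impression_id -> start time

    def on_start(self, impression_id: str, event_time: float) -> None:
        # Keep the first start seen for this impression.
        self._open_starts.setdefault(impression_id, event_time)

    def on_complete(self, impression_id: str, event_time: float) -> Optional[Credit]:
        start = self._open_starts.pop(impression_id, None)
        if start is None:
            return None  # no open start: already credited or never started
        # A complete past the window does not promote the impression.
        return Credit.FULL if event_time - start <= self.window else Credit.PARTIAL

    def expire(self, watermark: float) -> Dict[str, Credit]:
        """Emit partial credit for starts still unmatched past the window."""
        expired = {
            impression_id: Credit.PARTIAL
            for impression_id, start in self._open_starts.items()
            if watermark - start > self.window
        }
        for impression_id in expired:
            del self._open_starts[impression_id]
        return expired
```

Removing the state on every emission is what keeps each impression to exactly one credit; the memory held is proportional to the number of starts still in flight.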

#### Step 4: Late offline events credit the period they happened in

Devices in low-connectivity locations buffer events. Both billing and analytics use event-time partitioning; late events landing inside the agreed lateness window update the period they belong to. Billing's prior day reopens for late events that arrive within the window, and the next reconciliation accounts for them; analytics' rollup for the prior day updates the same way. An arrival-time bucketing design is the version where late offline events inflate today and miss the day they actually happened.
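Event-time bucketing with a lateness window can be sketched as follows. The assumptions are labeled: periods are UTC calendar days, the lateness window is measured from the end of the event's day, and events arriving after that return `None` for some separate correction process that is not specified here. The function name is hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Optional


def assign_period(
    event_time: datetime, arrival_time: datetime, allowed_lateness: timedelta
) -> Optional[date]:
    """Return the UTC day an event is credited to, or None if it is too late.

    Both datetimes must be timezone-aware.
    """
    event_day = event_time.astimezone(timezone.utc).date()
    # The period ends at midnight UTC following the event's day.
    period_end = datetime.combine(
        event_day + timedelta(days=1), time.min, tzinfo=timezone.utc
    )
    if arrival_time > period_end + allowed_lateness:
        return None  # outside the agreed lateness window
    # Credit the day the event happened, never the day it arrived.
    return event_day
```

For example, an event stamped 23:50 on one day that arrives at 06:00 the next morning is still credited to the day it happened, as long as the lateness window covers the delay.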

---

### The shape that fits

> **What this design gives up**
>
> Stateful pairing per impression costs memory proportional to in-flight impressions; daily reconciliation halts the billing publish on partner mismatches; late-event windows hold open older periods. Implementation cost is the price; the win is billing that reconciles with partners, partial credit applied consistently, and analytics that credits the right period.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
> - A streaming path produces billing-grade impression counts; a batch path produces content analytics on a slower cadence.
> - A stateful pairing of start and complete events per impression applies the partial-credit rule consistently.
> - Dedup on impression id collapses device retries.
> - A daily reconciliation against the ad server gates the billing publish; both views credit late events to their event-time period.

> **The mistake that ships**
>
> What gets shipped runs one streaming aggregator counting raw events and serves both billing and analytics from it. The daily reconciliation against the ad server fails on the first day because retries inflated counts and the partial-credit rule was never applied. Late events from offline devices end up in the wrong day in both views. Billing partners block payment; content analytics shows numbers that don't match what users actually watched. The rebuild adds the paired-event state, the impression-id dedup, the reconciliation gate, and event-time partitioning.

---

## Common follow-up questions

- An impression's complete event arrives just outside the qualifying window. What does this design do, and what does billing show? _(Tests whether the candidate sees the qualifying-window threshold as the partial-credit boundary: the complete arriving past the window doesn't promote the impression to full credit; billing shows partial credit consistently. The threshold is in shared code so the same rule applies every time.)_
- Reconciliation flags a divergence between billing and the ad server. What does the design surface, and how does the team triage? _(Tests whether the candidate sees the gate publishing the per-period diff with detail (counts, missing impressions, extra impressions); the team triages whether the gap is a missing event, a paired-event timing issue, or an upstream ad-server bug. The publish stays held until the diff is inside tolerance or explained.)_
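The per-period diff that the gate surfaces for triage could look like the sketch below. It assumes the ad server's record is available at impression-id grain, which the problem statement does not confirm; the function and key names are hypothetical.

```python
from typing import Dict, Iterable, List, Union


def reconciliation_diff(
    pipeline_ids: Iterable[str], ad_server_ids: Iterable[str]
) -> Dict[str, Union[int, List[str]]]:
    """Summarize one period's divergence between the pipeline and the ad server."""
    pipeline = set(pipeline_ids)
    ad_server = set(ad_server_ids)
    return {
        "pipeline_count": len(pipeline),
        "ad_server_count": len(ad_server),
        # Known to the ad server but absent from billing: likely lost or late events.
        "missing": sorted(ad_server - pipeline),
        # Billed but unknown to the ad server: likely an upstream or id issue.
        "extra": sorted(pipeline - ad_server),
    }
```

The `missing` and `extra` lists give the team concrete impression ids to trace instead of a single mismatched total.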

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/every_device_every_impression)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io).