# The API Drip Feed

> The API gives you 100 records at a time. You need millions.

Canonical URL: <https://datadriven.io/problems/the_api_drip_feed>

Domain: Pipeline Design · Difficulty: medium · Seniority: L5

## Problem

We need to pull data from a third-party project management SaaS API - tasks, users, and projects - and land it in our data warehouse for analytics. The API doesn't push events; we have to poll it. Design a connector that keeps our warehouse in sync incrementally, handles schema changes as the vendor evolves their API, and stays within rate limits.

## Worked solution and explanation

### Why this problem exists in real interviews

The problem is a poll-based connector for a SaaS API across many tenants, with three requirements that conflict if you treat them in isolation: per-tenant rate limits, mixed freshness needs, and per-tenant isolation. The trap is one shared puller that hits the cap on the noisiest tenant and blocks everyone.

The default instinct is a single connector that paginates through every tenant's data on a shared schedule. When a noisy tenant hits its cap, the connector throttles globally, so quiet tenants are blocked too. The dashboard that needs hours-fresh data waits for the daily run because nothing is tiered. An auth issue on one tenant fails the whole run.

> **Trick to Solving**
>
> Run each tenant separately, paced under its own cap; use two cadences (daily for most, hourly for the tracked dashboard); keep one tenant's failure local.
>
> 1. Per-tenant runs in the orchestrator with each tenant's own cursor and rate-limit budget. One tenant's cap doesn't propagate.
> 2. Two cadences off the same connector: daily incremental for most analytics, hourly run for the dashboard's tier.
> 3. Auth or rate-limit failures alert per tenant; healthy tenants keep flowing.

---

### Walk the requirements

#### Step 1: Two cadences match each consumer's freshness need

Most analytics tolerate T+1 (next-day) freshness and run on a daily incremental schedule; the tracked dashboard's tier runs hourly off the same connector. The per-tenant connector logic is identical in both tiers; the orchestrator schedules the dashboard tier separately so the daily tier doesn't pay for the extra runs. Forcing every tenant onto hourly wastes API budget; forcing the dashboard tier onto daily misses its freshness window.
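One way to picture the two tiers is a small cadence table that the orchestrator expands into one schedule per tenant. This is a minimal sketch under stated assumptions: the tenant names, the `tier` field, the cron strings, and the 02:00 daily run time are all hypothetical and not tied to any particular orchestrator.

```python
from dataclasses import dataclass

# Hypothetical cadence table: tier name -> cron expression.
CADENCES = {
    "daily": "0 2 * * *",   # once a day at 02:00 (assumed time)
    "hourly": "0 * * * *",  # top of every hour
}


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    tier: str = "daily"  # default: most tenants tolerate T+1


def build_schedules(tenants):
    """Return one (tenant_id, cron) pair per tenant, using its tier's cadence."""
    return [(t.tenant_id, CADENCES[t.tier]) for t in tenants]


# Hypothetical tenants; only the dashboard tier opts into hourly runs.
tenants = [Tenant("acme"), Tenant("globex"), Tenant("initech", tier="hourly")]
schedules = build_schedules(tenants)
for tenant_id, cron in schedules:
    print(tenant_id, cron)
```

Keeping the tier on the tenant record means the connector code stays the same for both cadences; only the schedule differs.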

#### Step 2: Per-tenant rate-limit pacing keeps the connector under the cap

The vendor caps the API per tenant. Each tenant's run paces its requests under that cap and backs off on 429 responses. The orchestrator's state captures partial progress so a backoff doesn't restart the whole run. A "page as fast as possible" approach trips the cap, and the connector ends up throttling globally; per-tenant pacing keeps each tenant's run inside its budget.
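The pacing, backoff, and checkpointing described above could look roughly like the following. It is a sketch, not the vendor's API: `fetch_page`, `RateLimited`, the `page_token` state key, and the fixed `min_interval` pacing rule are assumptions made for illustration, and the fake client at the bottom replaces real HTTP calls so the example runs offline.

```python
import time


class RateLimited(Exception):
    """Raised by the (hypothetical) client when the vendor returns HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after


def pull_tenant(fetch_page, state, min_interval, sleep=time.sleep, max_retries=5):
    """Page through one tenant's records, pacing requests and resuming from state.

    fetch_page(token) -> (records, next_token); next_token is None on the last page.
    state holds the next page token, so a retried run resumes where it stopped.
    """
    records = []
    retries = 0
    while True:
        try:
            page, next_token = fetch_page(state.get("page_token"))
        except RateLimited as exc:
            retries += 1
            if retries > max_retries:
                raise
            # Use the server's hint if present, else exponential backoff.
            sleep(exc.retry_after or min_interval * 2 ** retries)
            continue
        retries = 0
        records.extend(page)  # a real run would write the page before checkpointing
        state["page_token"] = next_token  # checkpoint after each successful page
        if next_token is None:
            return records
        sleep(min_interval)  # fixed gap between requests keeps us under the cap


# Fake client: three pages, with one simulated 429 on the second request.
PAGES = {None: ([1, 2], "p2"), "p2": ([3, 4], "p3"), "p3": ([5], None)}
calls = {"n": 0}


def fake_fetch(token):
    calls["n"] += 1
    if calls["n"] == 2:
        raise RateLimited(retry_after=1.0)
    return PAGES[token]


slept = []  # record sleeps instead of actually waiting
state = {}
rows = pull_tenant(fake_fetch, state, min_interval=0.5, sleep=slept.append)
print(rows, slept)
```

Because the token is stored after every successful page, the 429 only repeats the request that failed rather than restarting the tenant's run.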

#### Step 3: Per-tenant isolation so one failure doesn't block the others

Each tenant runs as its own task graph in the orchestrator with its own cursor, its own credentials, and its own alerting. An auth failure on one tenant fires an alert for that tenant; healthy tenants continue. A "one big run for all tenants" approach lets one tenant's auth issue take down everyone's morning sync; per-tenant tasks isolate the failure.
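The isolation property can be sketched in a few lines. In a real orchestrator each tenant would be its own task; the plain loop below only stands in for that, and `run_all`, `fake_sync`, `AuthError`, and the tenant names are hypothetical.

```python
class AuthError(Exception):
    """Stand-in for an expired or revoked tenant credential."""


def run_all(tenant_ids, sync_tenant, alert):
    """Run each tenant independently; a failure alerts for that tenant only."""
    results = {}
    for tenant_id in tenant_ids:
        try:
            results[tenant_id] = sync_tenant(tenant_id)
        except Exception as exc:  # broad on purpose: isolate any per-tenant failure
            alert(tenant_id, exc)
            results[tenant_id] = None  # mark as failed, keep going
    return results


def fake_sync(tenant_id):
    """Pretend sync: one tenant has a bad token, the others land 100 records."""
    if tenant_id == "globex":
        raise AuthError("token expired")
    return 100


alerts = []
results = run_all(
    ["acme", "globex", "initech"],
    fake_sync,
    alert=lambda tenant_id, exc: alerts.append((tenant_id, str(exc))),
)
print(results)
print(alerts)
```

The alert carries the tenant ID, so the on-call engineer sees which credential to rotate while the other tenants' data keeps landing.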

---

### The shape that fits

> **What this design gives up**
>
> Per-tenant tasks mean wider DAGs and more orchestration config than one shared run; two cadences mean two schedules to operate; per-tenant pacing requires the connector to know each tenant's cap. The price is implementation cost. The win is that the dashboard tier runs hourly, the daily tier runs without burning API budget on hourly pulls, and one tenant's failure stays its own.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
>
> - An orchestration layer schedules per-tenant runs with per-tenant cursors and rate-limit pacing.
> - Two cadences off the connector serve T+1 analytics and the hourly dashboard tier.
> - Per-tenant failure isolation: one tenant's rate-limit or auth issue doesn't block the others.
> - The synced data lands in the warehouse for analytics.

> **The mistake that ships**
>
> The version that ships is one connector that paginates through every tenant on a shared schedule. The cap kicks in on a noisy tenant and the connector throttles globally, so quiet tenants don't sync. The dashboard tier waits for the daily run. An auth issue on one tenant takes down the morning sync. The eventual rebuild adds per-tenant tasks, per-tenant pacing, and tiered cadences.

---

## Common follow-up questions

- A new tenant onboards with three years of historical data to backfill. What in this design lets the backfill run without delaying the live tenants? _(Tests whether the candidate sees the backfill as a separate task on its own pacing, on a separate worker pool if needed, so the live per-tenant runs aren't affected. The orchestrator routes the backfill out of the main schedule's path.)_
- A vendor schema change breaks one tenant's pull but not the others. What does the design surface, and where? _(Tests whether the candidate sees per-tenant alerts as the surface: the affected tenant's run alerts on the schema mismatch with the diff visible; other tenants continue if they're unaffected, and the team updates the mapping for the affected tenant once.)_
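For the schema-change follow-up, the per-tenant alert needs a diff to show. A minimal sketch of such a check, assuming a hypothetical expected-field mapping (`EXPECTED_TASK`) and a made-up task record; a real connector would likely compare against the warehouse table's schema instead.

```python
def schema_diff(expected, record):
    """Compare one record's fields and value types with the expected mapping."""
    seen = {key: type(value).__name__ for key, value in record.items()}
    shared = set(seen) & set(expected)
    return {
        "added": sorted(set(seen) - set(expected)),
        "removed": sorted(set(expected) - set(seen)),
        "retyped": sorted(k for k in shared if seen[k] != expected[k]),
    }


# Hypothetical expected shape of a task record: field name -> Python type name.
EXPECTED_TASK = {"id": "int", "title": "str", "due": "str"}

# Made-up record after a vendor change: id became a string, due was dropped,
# and a new priority field appeared.
record = {"id": "T-17", "title": "Ship it", "priority": 2}

diff = schema_diff(EXPECTED_TASK, record)
print(diff)
```

A non-empty diff would be attached to the affected tenant's alert, while tenants whose records still match the mapping continue unaffected.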

---

Source: DataDriven (https://datadriven.io).