# The Meal Kit That Knows You

> What they ordered says a lot about what they want next.

Canonical URL: <https://datadriven.io/problems/the_meal_kit_that_knows_you>

Domain: Pipeline Design · Difficulty: medium · Seniority: L6

## Problem

We're a meal-kit delivery company. We want to personalize which recipes we show each customer when they open the app. Design a pipeline that ingests data about orders and matches them to menus for recommendations.

## Worked solution and explanation

### Why this problem exists in real interviews

The problem asks for a recommendations pipeline that has to feel instant on app open, handle the substantial cold-start cohort, respect business rules that apply after the model scores, and learn from the click-through it produces. The trap is a recommendations service that scores in real time without pre-computed features and without a post-scoring rule layer; the page hangs at peak, and sold-out items get recommended.

The default reach is a request-time service that joins user history to the current menu and scores on the fly. At the first peak hour the page hangs because the join is doing real work per request; first-time users see an empty list because there's no history to join; sold-out items appear in the list because the inventory check happens after the user notices. With no impression-and-click log, the next training run has no feedback.

> **Trick to Solving**
>
> Pre-computed features for serving, a cold-start fallback, business rules after scoring, a logged impression-and-click feedback loop.
> 
> 1. Features are pre-computed in an online store sized for low-latency reads; the request-time service is a lookup plus a model call, not a feature computation.
> 2. Cold-start uses a fallback set that the service returns when the user-feature lookup misses (popular items, category-based, or a curated launch list).
> 3. Business rules (sold-out, dietary, launch boost) apply after the model scores so the model focuses on relevance and the rules guarantee deliverability.
> 4. Every impression and the resulting click or order writes to a feedback log; the next training run reads from it.

---

### Walk the requirements

#### Step 1: Serve recommendations from pre-computed features, not request-time computation

When a customer opens the app, the recommendation service reads the user's features and the candidate items' features from the online store, the model scores the candidates, the rule layer filters and boosts, and the list returns. The feature computation already happened on the streaming and batch paths; the request-time work is a lookup plus inference. A 'compute features at request time' design is the version where the page hangs at peak because the join is doing real work per request.
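As a rough illustration of "lookup plus inference", the sketch below uses plain dictionaries as a stand-in for the online store and a dot product as a stand-in for the model. The names `USER_FEATURES`, `RECIPE_FEATURES`, `score`, and `recommend`, along with the feature values, are all hypothetical; nothing here is a specific feature-store API.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the online store. In a real system this would be
# a low-latency key-value store kept fresh by the streaming and batch paths.
USER_FEATURES: Dict[str, List[float]] = {
    "user_1": [0.9, 0.1],  # assumed meaning: affinity for [vegetarian, spicy]
}
RECIPE_FEATURES: Dict[str, List[float]] = {
    "veggie_curry": [1.0, 0.6],
    "beef_tacos": [0.0, 0.8],
    "caprese_salad": [1.0, 0.0],
}


def score(user_vec: List[float], item_vec: List[float]) -> float:
    """Placeholder model: a dot product stands in for real inference."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def recommend(user_id: str, candidate_ids: List[str], k: int = 2) -> List[Tuple[str, float]]:
    """Request-time path: two kinds of lookup and one scoring pass, no joins."""
    user_vec = USER_FEATURES[user_id]  # a miss here is the cold-start case
    scored = [(rid, score(user_vec, RECIPE_FEATURES[rid])) for rid in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With these made-up vectors, `recommend("user_1", ["veggie_curry", "beef_tacos", "caprese_salad"])` ranks `veggie_curry` first and `caprese_salad` second. The point of the shape is that nothing on the request path touches order history; that work is assumed to have happened upstream.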

#### Step 2: Cold-start fallback for first-time customers

About 15% of users each day are first-time customers with no order history. When the user-feature lookup misses, the recommendation service falls back to a precomputed cold-start list (popular this week, dietary-respecting, with the launch boost applied). The fallback list itself is a feature row keyed on cohort or segment, refreshed by a slower batch job. A 'no user, no list' design is the version where 15% of daily opens see nothing; the fallback is the contract that keeps every open useful.
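One way the fallback contract could look in code is sketched below. The segment keys, the `COLD_START_LISTS` table, and the `personalized_list` stub are assumptions made for illustration; the only idea taken from the text is that a missed user-feature lookup returns a precomputed list rather than an empty one.

```python
from typing import Dict, List, Optional

# Hypothetical online-store contents: only returning users have a feature row.
USER_FEATURES: Dict[str, List[float]] = {"returning_user": [0.9, 0.1]}

# Hypothetical fallback rows, keyed on segment and refreshed by a slower batch.
COLD_START_LISTS: Dict[str, List[str]] = {
    "vegetarian": ["veggie_curry", "caprese_salad"],
    "default": ["beef_tacos", "veggie_curry"],
}


def personalized_list(user_vec: List[float]) -> List[str]:
    """Stub for the normal lookup-plus-model path."""
    return ["caprese_salad", "veggie_curry"]


def recommend(user_id: str, segment: Optional[str] = None) -> List[str]:
    user_vec = USER_FEATURES.get(user_id)
    if user_vec is None:
        # Lookup miss: serve the segment's list, or the default list if the
        # segment is unknown. The function never returns an empty list.
        key = segment if segment in COLD_START_LISTS else "default"
        return list(COLD_START_LISTS[key])
    return personalized_list(user_vec)
```

A first-time user with a known segment gets that segment's list, and one with no segment at all still gets the default list, so every open returns something.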

#### Step 3: Business rules apply after the model scores

The model scores candidates for relevance to this user. Rules apply after: sold-out items drop out, dietary restrictions filter the list, new recipes get a launch boost. Putting rules before scoring forces the model to know about inventory and launches; putting them after keeps the model focused on relevance and the rules deterministic. The user never sees an unavailable item because the rule layer guards the boundary.
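A minimal sketch of a post-scoring rule layer follows. The `Recipe` type, the `apply_rules` function, and the additive `launch_boost` are assumptions for illustration; the text only says that new recipes get a boost, not how it is computed.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Recipe:
    recipe_id: str
    tags: FrozenSet[str]  # dietary tags the recipe satisfies
    is_new: bool = False


def apply_rules(
    scored: List[Tuple[Recipe, float]],
    sold_out: Set[str],
    required_tags: Set[str],
    launch_boost: float = 0.1,  # assumed additive boost; magnitude is a tunable
) -> List[str]:
    """Filter and boost model output deterministically, then rank."""
    kept: List[Tuple[str, float]] = []
    for recipe, model_score in scored:
        if recipe.recipe_id in sold_out:
            continue  # never show an unavailable item
        if not required_tags <= recipe.tags:
            continue  # recipe does not satisfy every dietary requirement
        final_score = model_score + (launch_boost if recipe.is_new else 0.0)
        kept.append((recipe.recipe_id, final_score))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [recipe_id for recipe_id, _ in kept]
```

The model's scores go in untouched and the rules act on them afterwards, so an inventory or launch change only requires new inputs to `apply_rules`, not a retrained model.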

#### Step 4: Log every impression and click for the next training run

Recommendations served and customer responses (click, no-click, order) write to a feedback log. The next training run reads from the log to learn which recommendations actually drove behavior; without it, the model never learns from production. A 'we'll add logging later' design is the version where 'why isn't the model improving' becomes a quarterly question with no answer; logging from day one is what closes the loop.
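The feedback loop might be sketched as an append-only log of JSON lines, with impressions and responses tied together by an impression id. The event schema, the function names, and the choice to treat a shown recipe with no recorded response as `no_click` are all assumptions made for this example.

```python
import json
import uuid
from typing import Dict, Iterable, List, Tuple

# Assumed ordering of response strength, used to keep the strongest signal.
ACTION_RANK = {"no_click": 0, "click": 1, "order": 2}


def log_impression(log: List[str], user_id: str, recipe_ids: List[str]) -> str:
    """Record the list that was served; returns the id responses refer to."""
    impression_id = uuid.uuid4().hex
    log.append(json.dumps({
        "type": "impression",
        "impression_id": impression_id,
        "user_id": user_id,
        "recipe_ids": recipe_ids,
    }))
    return impression_id


def log_response(log: List[str], impression_id: str, recipe_id: str, action: str) -> None:
    log.append(json.dumps({
        "type": "response",
        "impression_id": impression_id,
        "recipe_id": recipe_id,
        "action": action,
    }))


def training_examples(log: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Join impressions to responses: one (user, recipe, label) per shown item."""
    events = [json.loads(line) for line in log]
    best: Dict[Tuple[str, str], str] = {}
    for event in events:
        if event["type"] == "response":
            key = (event["impression_id"], event["recipe_id"])
            if key not in best or ACTION_RANK[event["action"]] > ACTION_RANK[best[key]]:
                best[key] = event["action"]
    examples: List[Tuple[str, str, str]] = []
    for event in events:
        if event["type"] == "impression":
            for recipe_id in event["recipe_ids"]:
                label = best.get((event["impression_id"], recipe_id), "no_click")
                examples.append((event["user_id"], recipe_id, label))
    return examples
```

Logging the impression, and not only the click, is what lets the training run see negatives: a recipe that was shown and ignored becomes a `no_click` example instead of disappearing from the data.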

---

### The shape that fits

> **What this design gives up**
>
> Pre-computed features mean the streaming and batch paths have to keep the online store fresh; the cold-start fallback is a separate computation with its own refresh cadence; the rule layer adds request-time logic the model doesn't see; the feedback log grows linearly with impressions. Implementation cost is the price; the win is recommendations that load on app open, first-time customers who see something useful, sold-out items that never appear, and a model that gets better.

> **What reviewers check**
>
> A reviewer looks at the canvas for these properties:
> - Recommendations serve from pre-computed features in an online store, read within the request's latency budget, not from a request-time feature computation.
> - Cold-start customers get a fallback recommendation set when their feature lookup misses.
> - Business rules apply after model scoring (filter sold-out, respect dietary, boost new recipes) before the list returns.
> - An impression-and-click feedback log persists for the next training run.

> **The mistake that ships**
>
> What gets shipped runs request-time scoring against the warehouse; the page hangs on the first peak open. First-time users see an empty list; the team adds a hardcoded 'top 10' that becomes the same list every day. Sold-out items appear because the inventory check happens after the user notices; the customer adds an unavailable item to their cart. No impression log means the next training run is on the same data as the last and the model doesn't improve. The eventual rebuild adds the online store, the cold-start fallback, the post-scoring rule layer, and the feedback log.

---

## Common follow-up questions

- A new recipe launches with no historical engagement. How does this design make sure it gets shown enough to learn from, without flooding lists with low-quality recommendations? _(Tests whether the candidate sees the launch boost as a rule-layer parameter that elevates new items in the post-scoring stage with a controlled magnitude; the feedback log captures the resulting clicks and orders so the model picks up the new recipe's actual performance on the next training cycle.)_
- An item goes sold-out mid-session. What in this design keeps it out of the user's next refresh, and where would a delay show up? _(Tests whether the candidate sees the rule layer as the request-time guard: an inventory feed updates the rule layer's view of available items, and the next refresh filters the sold-out item out. A delay shows up if the inventory feed is slow; the rule layer is the boundary, not the model.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/the_meal_kit_that_knows_you)
- [System Design Interview Questions](https://datadriven.io/data-engineering-system-design)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.