# First Interaction Credit

> Attribute transactions to earliest touchpoint

Canonical URL: <https://datadriven.io/problems/first_interaction_credit>

Domain: SQL · Difficulty: hard · Seniority: L5

## Problem

The attribution team uses a first-touch model: every conversion is credited to the very first ad impression the user ever received, regardless of whether they clicked. A user counts as converted if they appear in the transactions table. For each converted user, return their user_id, the ad_campaign from their earliest impression, and that impression_time.

## Worked solution and explanation

### What this is really asking

Every row in ad_impressions is a candidate, but only users who appear in transactions count. Earliest impression per converted user, by impression_time. The clicked flag is a decoy; first-touch ignores it.

---

### Break down the requirements

#### Step 1: Restrict to converters

Semi-join ad_impressions to transactions on user_id via EXISTS. INNER JOIN works but fans out impressions by purchase count.

#### Step 2: Pick the earliest impression

ROW_NUMBER PARTITION BY user_id ORDER BY impression_time. Filter rn = 1 outside the CTE since windows cannot live in WHERE.

#### Step 3: Return the trio

user_id, ad_campaign, impression_time from the surviving row. No aggregation needed once dedup is done.

---

### The solution

**FIRST IMPRESSION PER CONVERTED USER**

```sql
WITH ranked AS (
  SELECT
    i.user_id,
    i.ad_campaign,
    i.impression_time,
    ROW_NUMBER() OVER (
      PARTITION BY i.user_id
      ORDER BY i.impression_time
    ) AS rn
  FROM ad_impressions i
  WHERE EXISTS (
    SELECT 1 FROM transactions t WHERE t.user_id = i.user_id
  )
)
SELECT user_id, ad_campaign, impression_time
FROM ranked
WHERE rn = 1;
```

> **Cost Analysis**
>
> EXISTS builds a hash semi-join on distinct converter user_ids, much smaller than 200M transactions. Pushing that filter before the window shrinks the sort from 500M rows to whatever survives.

> **Interviewers Watch For**
>
> INNER JOIN to transactions before the window multiplies impressions by purchases per user. The answer survives the dedup, but the work is wasted. EXISTS makes the intent explicit.

> **Common Pitfall**
>
> Filtering WHERE clicked = TRUE. The prompt credits the first impression regardless of click; that predicate silently drops any user whose first impression was a view.

> **The False Start**
>
> First instinct is GROUP BY user_id with MIN(impression_time), then self-join to grab ad_campaign. The join reintroduces ties when two impressions share a timestamp. Pivot to ROW_NUMBER, which picks exactly one.

---

### COMMON FOLLOW-UP QUESTIONS

## Common follow-up questions

- How would you switch this to a last-touch model? _(Flip the ORDER BY to DESC; the rest is identical.)_
- What if two impressions share an impression_time for the same user? _(ROW_NUMBER picks one nondeterministically. Add impression_id as a tiebreaker, or switch to RANK.)_
- How do you handle users who converted before their first impression? _(Add transaction_date >= impression_time inside the EXISTS, else you credit campaigns for purchases they could not have caused.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/first_interaction_credit)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.