# Precision and Recall

> Precision and recall. Both matter.

Canonical URL: <https://datadriven.io/problems/precision_and_recall>

Domain: Python · Difficulty: medium · Seniority: L5

## Problem

Given two equal-length lists of binary labels, `actual` and `predicted`, return a dict with the keys `'precision'` (TP / (TP + FP)) and `'recall'` (TP / (TP + FN)). Return 0.0 for any metric whose denominator is 0.

## Worked solution and explanation

### Why this problem exists in real interviews

Computing precision and recall from binary labels tests whether you understand **confusion matrix components** (true positives, false positives, false negatives) and can implement them from scratch without sklearn.

---

### Break down the requirements

#### Step 1: Count true positives, false positives, and false negatives

TP: both actual and predicted are 1. FP: predicted is 1 but actual is 0. FN: actual is 1 but predicted is 0. True negatives (both 0) appear in neither formula, so they do not need to be counted.
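
To make the cases concrete, here is a minimal sketch that names the confusion-matrix cell for each (actual, predicted) pair. The helper name `classify_pair` and the sample lists are illustrative assumptions, not part of the problem statement.

```python
from collections import Counter

def classify_pair(actual_label, predicted_label):
    """Hypothetical helper: name the confusion-matrix cell for one sample."""
    if predicted_label == 1:
        return 'TP' if actual_label == 1 else 'FP'
    return 'FN' if actual_label == 1 else 'TN'

# Illustrative data, chosen so that every cell appears at least once.
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 0, 1, 0]

# Tally how many samples fall into each cell.
counts = Counter(classify_pair(a, p) for a, p in zip(actual, predicted))
print(dict(counts))
```

Only the `TP`, `FP`, and `FN` tallies feed into precision and recall; the `TN` tally is shown here purely for completeness.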

#### Step 2: Compute precision and recall

Precision = TP / (TP + FP). Recall = TP / (TP + FN). Return 0.0 when the denominator is zero.
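
As a quick worked example of the two formulas, the sketch below plugs in made-up counts. The helper name `safe_ratio` and the counts themselves are assumptions chosen for illustration.

```python
def safe_ratio(numerator, denominator):
    """Hypothetical helper: divide, falling back to 0.0 on a zero denominator."""
    return numerator / denominator if denominator > 0 else 0.0

# Illustrative counts, not taken from any real model.
tp, fp, fn = 3, 1, 2

precision = safe_ratio(tp, tp + fp)  # 3 / (3 + 1)
recall = safe_ratio(tp, tp + fn)     # 3 / (3 + 2)
print(precision, recall)

# With no positives at all, both denominators are zero and both metrics are 0.0.
print(safe_ratio(0, 0))
```

Factoring the guard into one helper keeps the zero-denominator rule in a single place, though inlining the conditional works equally well.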

---

### The solution

**Confusion matrix counting with zero-division guard**

```python
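def precision_recall(actual, predicted):
    # Tally the three confusion-matrix cells the metrics need.
    # True negatives appear in neither formula, so they are skipped.
    tp = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # predicted positive, actually positive
        elif a == 0 and p == 1:
            fp += 1  # predicted positive, actually negative
        elif a == 1 and p == 0:
            fn += 1  # predicted negative, actually positive
    # Guard each division: return 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return {'precision': precision, 'recall': recall}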
```

> **Time and Space Complexity**
>
> **Time:** O(n) where n is the number of samples. Single pass through both lists.
> 
> **Space:** O(1). Three counters and the result dict.

> **Interviewers Watch For**
>
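> Handling the zero-denominator case explicitly. A model with no positive predictions has TP + FP = 0, so its precision is mathematically undefined; returning 0.0 is a common convention and the one this problem specifies. The same guard applies to recall when there are no actual positives.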
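
The zero-denominator cases can be exercised directly. The sketch below uses a condensed restatement of the solution, under the assumed name `precision_recall_compact`, so that the snippet runs on its own; the inputs are illustrative.

```python
def precision_recall_compact(actual, predicted):
    """Condensed restatement of the solution so this snippet is self-contained."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return {'precision': precision, 'recall': recall}

# No positive predictions: TP + FP == 0, so precision falls back to 0.0.
no_positive_predictions = precision_recall_compact([1, 0, 1], [0, 0, 0])

# No actual positives: TP + FN == 0, so recall falls back to 0.0.
no_actual_positives = precision_recall_compact([0, 0, 0], [0, 1, 0])

# Empty input: both denominators are zero.
empty_input = precision_recall_compact([], [])

print(no_positive_predictions, no_actual_positives, empty_input)
```

None of these calls raises `ZeroDivisionError`, which is the behavior the guard is there to ensure.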

> **Common Pitfall**
>
> Confusing precision and recall. Precision measures accuracy of positive predictions (TP / all predicted positives). Recall measures completeness of positive detection (TP / all actual positives).
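
One way to keep the two metrics apart is to look at a case where they diverge sharply. The sketch below assumes a degenerate "model" that flags every sample as positive; the data is made up for illustration.

```python
# Illustrative data: 2 actual positives among 10 samples.
actual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A degenerate "model" that predicts positive for every sample.
predicted = [1] * len(actual)

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
predicted_positives = sum(predicted)  # TP + FP: the precision denominator
actual_positives = sum(actual)        # TP + FN: the recall denominator

precision = tp / predicted_positives if predicted_positives > 0 else 0.0
recall = tp / actual_positives if actual_positives > 0 else 0.0
print(precision, recall)
```

Every actual positive is found, so recall is perfect, while most of the positive predictions are wrong, so precision is low. The only difference between the two formulas is the denominator.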

---

## Common follow-up questions

- How would you compute F1 score from precision and recall? _(Tests the harmonic mean formula: `2 * P * R / (P + R)` with zero-division handling.)_
- What if the labels are multi-class instead of binary? _(Tests macro vs. micro averaging across classes.)_
- What is the tradeoff between precision and recall? _(Tests understanding of the precision-recall curve and threshold tuning.)_
- How would you compute these metrics in a streaming fashion? _(Tests maintaining running TP/FP/FN counters updated incrementally.)_
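
The F1 and streaming follow-ups can be sketched together: keep running TP/FP/FN counters, update them one sample at a time, and derive the metrics on demand. The class name `StreamingPrecisionRecall` and its method names are hypothetical, and this is one possible design rather than a reference implementation.

```python
class StreamingPrecisionRecall:
    """Hypothetical sketch: running TP/FP/FN counters updated per sample."""

    def __init__(self):
        self.tp = 0
        self.fp = 0
        self.fn = 0

    def update(self, actual_label, predicted_label):
        # Same three-way classification as the batch solution.
        if actual_label == 1 and predicted_label == 1:
            self.tp += 1
        elif actual_label == 0 and predicted_label == 1:
            self.fp += 1
        elif actual_label == 1 and predicted_label == 0:
            self.fn += 1

    @staticmethod
    def _ratio(numerator, denominator):
        # Shared zero-division guard for all three metrics.
        return numerator / denominator if denominator > 0 else 0.0

    def precision(self):
        return self._ratio(self.tp, self.tp + self.fp)

    def recall(self):
        return self._ratio(self.tp, self.tp + self.fn)

    def f1(self):
        # Harmonic mean of precision and recall: 2 * P * R / (P + R).
        p, r = self.precision(), self.recall()
        return self._ratio(2 * p * r, p + r)


# Illustrative stream of (actual, predicted) pairs.
metrics = StreamingPrecisionRecall()
for actual_label, predicted_label in [(1, 1), (0, 1), (1, 0), (1, 1)]:
    metrics.update(actual_label, predicted_label)

print(metrics.precision(), metrics.recall(), metrics.f1())
```

Because only three integers are stored, memory stays constant no matter how many samples arrive, and the metrics can be read at any point in the stream.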

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/precision_and_recall)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io).