# Quantile Calculator

> Mark the boundary value at a given point.

Canonical URL: <https://datadriven.io/problems/quantile_calculator>

Domain: Python · Difficulty: easy · Seniority: L4

## Problem

Given a list of numbers and a percentile (0-100), return the value at that percentile using linear interpolation. The index is `percentile / 100 * (n - 1)`, where `n` is the length of the list; if the index is fractional, linearly interpolate between the values at the floor and ceiling indices of the sorted list.
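
To make the arithmetic concrete, here is a small worked example. The four input values are made up for illustration; the index formula is the one from the problem statement.

```python
import math

# Worked example: the 50th percentile of four illustrative values.
values = [40, 10, 30, 20]            # unsorted on purpose
ordered = sorted(values)             # [10, 20, 30, 40]
percentile = 50

# Fractional index into the sorted list: 0.5 * 3 = 1.5
index = percentile / 100 * (len(ordered) - 1)

# Neighbors on either side of the fractional index: 20 and 30
low = ordered[math.floor(index)]
high = ordered[math.ceil(index)]

# Move half of the way from 20 to 30
answer = low + (index - math.floor(index)) * (high - low)
print(answer)  # 25.0
```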

## Worked solution and explanation

### Why this problem exists in real interviews

Computing percentiles with linear interpolation tests **sorting**, **index arithmetic**, and **fractional position handling**. It is a practical SLA monitoring skill that reveals whether you understand how interpolation bridges discrete data points.

> **Trick to Solving**
>
> Sort the data, compute the fractional index as `percentile / 100 * (n - 1)`, then interpolate between the floor and ceiling positions. When the index is a whole number, no interpolation is needed.

---

### Break down the requirements

#### Step 1: Sort the data

Percentile computation requires ordered values.

#### Step 2: Compute the fractional index

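Use `percentile / 100 * (n - 1)` to find the position in the sorted array corresponding to the desired percentile. Dividing by 100 converts the 0-100 percentile to a 0-1 fraction; without it, the index lands far past the end of the array.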
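
A minimal sketch of this step in isolation. The helper name `fractional_index` and the range check are assumptions made for illustration; the problem statement does not say how out-of-range percentiles should be handled.

```python
def fractional_index(percentile, n):
    """Map a 0-100 percentile to a (possibly fractional) index in 0..n-1.

    Hypothetical helper: rejecting out-of-range percentiles is an assumption.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    return percentile / 100 * (n - 1)


# For a list of 5 values, valid indices run from 0 to 4.
for p in (0, 25, 50, 90, 100):
    print(p, fractional_index(p, 5))
```

Percentile 0 maps to index 0 (the minimum) and percentile 100 maps to index `n - 1` (the maximum), which is why the formula uses `n - 1` rather than `n`.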

#### Step 3: Interpolate between adjacent values

If the index is not an integer, blend the two surrounding values proportionally.
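
The blending step can be sketched on its own. The helper name `lerp` is an assumption for illustration; `fraction` is the part of the index after the decimal point.

```python
def lerp(lower_val, upper_val, fraction):
    """Linear interpolation: move `fraction` of the way from lower_val to upper_val."""
    return lower_val + fraction * (upper_val - lower_val)


# A fractional index of 1.25 sits a quarter of the way from index 1 to index 2.
print(lerp(20, 30, 0.25))  # 22.5

# The endpoints fall out naturally: fraction 0 gives the lower value,
# fraction 1 gives the upper value.
print(lerp(20, 30, 0))  # 20
print(lerp(20, 30, 1))  # 30
```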

---

### The solution

**Sort, index, and linearly interpolate**

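```python
def quantile(data, percentile):
    # Sort a copy so the caller's list is left untouched.
    sorted_data = sorted(data)
    n = len(sorted_data)
    # Assumption: empty input is an error; the problem statement does not define it.
    if n == 0:
        raise ValueError("data must not be empty")
    # Convert the 0-100 percentile to a fractional index into the sorted list.
    pos = percentile / 100 * (n - 1)
    lower = int(pos)
    upper = lower + 1
    # At the last element (percentile == 100) there is no upper neighbor.
    if upper >= n:
        return sorted_data[lower]
    fraction = pos - lower
    # Standard linear interpolation between the two neighbors.
    return sorted_data[lower] + fraction * (sorted_data[upper] - sorted_data[lower])
```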
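
One way to sanity-check the formula is to compare it against the standard library. The sketch below restates the problem statement's formula in a standalone function (the name `percentile_linear` and the sample data are illustrative) and checks it against `statistics.quantiles` with `method="inclusive"`, which places its cut points at the same `p * (n - 1)` positions.

```python
import math
import statistics


def percentile_linear(data, percentile):
    """Percentile (0-100) using index = percentile / 100 * (n - 1)."""
    ordered = sorted(data)
    pos = percentile / 100 * (len(ordered) - 1)
    lower = math.floor(pos)
    # Clamp so that percentile == 100 does not read past the end.
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


data = [15, 20, 35, 40, 50]  # illustrative values

# With n=100, quantiles() returns 99 cut points; cut point i - 1 is the
# i-th percentile. It needs at least two data points and does not cover
# percentiles 0 and 100.
cuts = statistics.quantiles(data, n=100, method="inclusive")

for p in (10, 25, 50, 75, 90):
    assert math.isclose(percentile_linear(data, p), cuts[p - 1])
print("all percentiles match")
```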

> **Time and Space Complexity**
>
> **Time:** O(n log n) dominated by sorting.
> 
> **Space:** O(n) for the sorted copy.

> **Interviewers Watch For**
>
> Correct interpolation formula. The key is `lower_val + fraction * (upper_val - lower_val)`, which is standard linear interpolation.

> **Common Pitfall**
>
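> Off-by-one on the position formula. Different percentile methods (e.g., NumPy's 'linear', 'lower', 'higher') use slightly different formulas. The `p * (n - 1)` formula, with `p = percentile / 100`, matches NumPy's default 'linear' method. Forgetting to divide a 0-100 percentile by 100 is an equally common slip.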
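
To see how the choice of method changes the answer, here is a pure-Python sketch of three variants. It mimics the behavior the method names describe and is not NumPy's implementation; the function name `percentile_by_method` and the sample data are assumptions for illustration.

```python
import math


def percentile_by_method(data, percentile, method="linear"):
    """Illustrative only: three ways to resolve a fractional index."""
    ordered = sorted(data)
    pos = percentile / 100 * (len(ordered) - 1)
    lower, upper = math.floor(pos), math.ceil(pos)
    if method == "lower":
        return ordered[lower]            # snap down to the nearest data point
    if method == "higher":
        return ordered[upper]            # snap up to the nearest data point
    if method == "linear":
        return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])
    raise ValueError(f"unknown method: {method}")


data = [10, 20, 30, 40]
for name in ("lower", "linear", "higher"):
    print(name, percentile_by_method(data, 50, name))
# lower 20
# linear 25.0
# higher 30
```

The three results differ only when the index is fractional; at a whole-number index all three return the same data point.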

---

## Common follow-up questions

- How would you compute the median without sorting? _(Tests quickselect algorithm for O(n) average-case median finding.)_
- What if the data is streaming and you need approximate percentiles? _(Tests knowledge of t-digest or quantile sketches for streaming estimation.)_
- What is the difference between percentile and quantile? _(Tests vocabulary: quantile is the general term; percentile uses a 0-100 scale while quantile uses 0-1.)_
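
For the first follow-up, a minimal quickselect sketch is shown below. It is one possible formulation (list partitioning with a random pivot, chosen for readability over in-place efficiency), and the function names are illustrative. Treating empty input as an error is an assumption.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based), average O(n) time."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        smaller = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        larger = [x for x in items if x > pivot]
        if k < len(smaller):
            items = smaller                      # answer is among the smaller items
        elif k < len(smaller) + len(equal):
            return pivot                         # answer is the pivot itself
        else:
            k -= len(smaller) + len(equal)       # skip past smaller and equal items
            items = larger


def median(values):
    """Median via quickselect, without fully sorting the input."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")  # assumption: empty is an error
    mid = n // 2
    if n % 2 == 1:
        return quickselect(values, mid)
    # Even length: average the two middle elements.
    return (quickselect(values, mid - 1) + quickselect(values, mid)) / 2


print(median([5, 1, 4, 2, 3]))  # 3
print(median([4, 1, 3, 2]))     # 2.5
```

The worst case is still O(n^2) if the pivots are consistently bad, which is worth mentioning if an interviewer asks about guarantees.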

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/quantile_calculator)
- [Python Interview Questions](https://datadriven.io/python-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.