# Verbose by Design

> Audit endpoint paths. Length without the outer slashes, and how many segments.

Canonical URL: <https://datadriven.io/problems/endpoint_name_word_count>

Domain: SQL · Difficulty: hard · Seniority: L4

## Problem

The API governance team is flagging overly complex endpoint paths for cleanup. For each endpoint, count the number of meaningful segments (ignoring path separators). Show each endpoint, the trimmed length, and its segment count.

## Worked solution and explanation

### Why this problem exists in real interviews

Paths in API logs are inconsistent: '/api/v1/users', 'api/v1/users', and '/api/v1/users/' all describe the same route. Platform teams audit route naming by measuring length and segment count after normalization. The question tests two classic string tricks: two-argument `TRIM` for character stripping, and `LENGTH - LENGTH(REPLACE)` for delimiter counting without splitting the string.

> **Trick to Solving**
>
> Two primitives do all the work. `TRIM(endpoint, '/')` strips leading and trailing slashes. `LENGTH(s) - LENGTH(REPLACE(s, '/', "))` counts how many slashes are left inside, and segment count is that number plus one.

---

### Break down the requirements

#### Step 1: Strip leading and trailing slashes

`TRIM(endpoint, '/')` removes outer slashes. `trimmed_len = LENGTH(TRIM(endpoint, '/'))` is the length after trimming. Two distinct raw endpoints ('/a/b' and '/a/b/') stay distinct in the output even if their trimmed forms are identical.

#### Step 2: Count interior slashes to get segment count

The number of `/` characters in the trimmed string is `LENGTH(trimmed) - LENGTH(REPLACE(trimmed, '/', "))`. The segment count is that value plus one.

#### Step 3: Collapse to distinct endpoints

One endpoint can appear in millions of `api_calls` rows. `SELECT DISTINCT` returns one row per raw endpoint string.

#### Step 4: Return three columns in order

Return `endpoint`, `trimmed_len`, `word_count`, in that order. Sort alphabetically by `endpoint`.

---

### The solution

**Trim, count, deduplicate**

```sql
SELECT DISTINCT endpoint,
       LENGTH(TRIM(endpoint, '/')) AS trimmed_len,
       LENGTH(TRIM(endpoint, '/'))
         - LENGTH(REPLACE(TRIM(endpoint, '/'), '/', ''))
         + 1 AS word_count
FROM api_calls
ORDER BY endpoint
```

> **Cost Analysis**
>
> `api_calls` models 400M rows. The string operations run per row before `DISTINCT` collapses the result, so cost is dominated by the full scan. No index helps because every row contributes. In production this would run against a daily aggregate table keyed by `endpoint`, not the raw log.

> **Interviewers Watch For**
>
> Two signals. First, the candidate knows `LENGTH - LENGTH(REPLACE)` for counting occurrences. Second, the candidate normalizes (TRIM) before counting instead of special-casing leading/trailing slashes. A strong candidate also asks whether empty segments ('/a//b') should count.

> **Common Pitfall**
>
> Forgetting to `TRIM` inflates `word_count` by the number of outer slashes. `/a/b/c/` reports `word_count = 5` instead of `3`. The other trap: using `SUBSTR`/`INSTR` loops to split the string, which works but is an order of magnitude slower than the length-difference trick.

---

## Common follow-up questions

- How would you extract just the second path segment after trimming? _(In SQLite: `SUBSTR` combined with `INSTR` locates the first and second slash. In Postgres: `SPLIT_PART(TRIM(endpoint, '/'), '/', 2)` does it directly.)_
- What should `word_count` return for an endpoint like `/a//b` with an empty segment? _(`LENGTH - LENGTH(REPLACE)` counts delimiter characters, not non-empty segments, so `/a//b` reports 3. Deciding whether that is correct is a product question.)_
- How would you return only endpoints with more than five segments? _(Wrap the query as a subquery and filter on `word_count > 5` in the outer `WHERE`. You cannot reference the alias directly in the same-level `WHERE`.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/endpoint_name_word_count)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.