# Auth Endpoints

> Not all endpoints are visible to everyone.

Canonical URL: <https://datadriven.io/problems/auth_endpoints>

Domain: SQL · Difficulty: easy · Seniority: L3

## Problem

The security team is auditing authentication traffic across the API layer. Pull every call record whose endpoint contains 'auth' regardless of casing, and show the endpoint alongside its latency.

## Worked solution and explanation

### Why this problem exists in real interviews

The `api_calls` schema makes this a clean test of pattern matching on text columns. Columns like `endpoint`, `method`, `status` introduce enough ambiguity that only candidates who clarify assumptions produce correct results.

---

### Break down the requirements

#### Step 1: Identify the pattern column

The prompt targets `endpoint` for the substring search. Use `LIKE` with wildcards on both sides.

#### Step 2: Select the output columns

Return the columns specified in the prompt: endpoint, latency.

---

### The solution

**Pattern-match for auth endpoints**

```sql
SELECT endpoint, latency
FROM api_calls
WHERE LOWER(endpoint) LIKE '%auth%'
ORDER BY call_id
```

> **Cost Analysis**
>
> The main table has 50M rows (13 GB). Partitioned on `call_time`, so queries filtering on that column skip most partitions.

> **Interviewers Watch For**
>
> Clean, readable SQL with correct column references signals production readiness. Candidates who verbalize their approach before coding score higher on communication.

> **Common Pitfall**
>
> Forgetting case sensitivity in `LIKE` comparisons. Use `LOWER()` on both sides or `ILIKE` where supported.

---

## Common follow-up questions

- The `err_msg` column in `api_calls` has roughly 95% NULLs. How does your query handle those rows, and would the result change if NULLs were replaced with zeros? _(Tests whether the candidate understands how NULLs propagate through aggregation functions and whether their WHERE/JOIN conditions implicitly filter them out.)_
- Your LIKE pattern on `endpoint` is case-sensitive by default. Would switching to ILIKE change your result set, and what is the index cost? _(Tests awareness of case sensitivity in pattern matching and its impact on index usage.)_
- `call_id` in `api_calls` has ~50M distinct values. What index strategy keeps your query from doing a full table scan? _(Tests whether the candidate can design indexes for high-cardinality columns and understands selectivity.)_
- Could you express this same logic as a single query without CTEs or subqueries? What readability trade-off does that introduce? _(Tests whether the candidate can flatten nested logic and understands when decomposition aids maintainability.)_

## Related

- [All practice problems](https://datadriven.io/problems)
- [Mock interview mode](https://datadriven.io/interview/auth_endpoints)
- [SQL Interview Questions](https://datadriven.io/sql-interview-questions)
- [Data Engineering Interview Prep Guide](https://datadriven.io/data-engineer-interview-prep)
- [Daily Challenge](https://datadriven.io/daily)

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.