# SQL Practice Problems

> SQL practice problems with multi-seed grading and a Postgres editor for data engineer interview prep.

Canonical URL: <https://datadriven.io/sql-practice-problems>

Breadcrumb: [Home](https://datadriven.io/) > [SQL Practice Problems](https://datadriven.io/sql-practice-problems)

## Summary

905 SQL practice problems graded against a real Postgres 16 warehouse, replayed across 10 randomized seeds per submission. Schema panel pinned to the editor. EXPLAIN button on every problem. Per-problem scratchpad. Autocomplete from the live schema. No signup to start.

## What this page covers

The SQL practice surface here is one browser tab, one editor, one Postgres 16 instance behind a submit button. The catalog is 905 problems pulled from data engineer interview write-ups. The grader replays a submitted query against 10 different seeded versions of the schema. A query that hard-codes IDs, relies on insert order, or omits an ORDER BY tiebreaker fails the same way a senior data engineer would catch it in code review. Typical failing query passes 6 or 7 of 10 seeds with specific diffs naming the bug.

Four editor interactions dominate submission analytics. The schema panel pins next to the editor with table names, column types, primary keys, and three sample rows per table. Autocomplete pulls from the same source so orders.cust_id completes before the data engineer has scrolled the schema. The EXPLAIN button on every problem shows the Postgres plan with row counts, cost estimates, and the chosen join algorithm after a passing submission. Plans for your query and the reference solution sit side by side for the optimization-round comparison. Two output modes toggle with Command-D: row-by-row diff against expected, or full sortable table for inspecting intermediate state. Per-problem scratchpads stay separate from the submission so a data engineer can stage exploratory SELECTs against the seed data before writing the final answer.

The 10-seed engine engineers three failure modes per seed: ties at the top of a ranking, NULL distributions across nullable columns, and cardinality skew on join keys. A common wrong answer to "return the customer with the highest lifetime order total" uses ORDER BY SUM(amount) DESC LIMIT 1. The query passes one fixture and fails every seed where two customers tie at the top, returning "Alice" on seed_3 and "Aliyah" on seed_7. The submission log returns "expected [Alice, Aliyah], got [Aliyah]" with the seed number. The replay-safe version uses a top_total CTE plus an INNER JOIN on ties, takes three more lines, and survives every seed.

Catalog topic distribution. JOINs are 168 problems (20 percent). Aggregation 142 (17 percent). Window functions 124 (15 percent). CTEs 96 (11 percent, with overlap into other topics). Subqueries 88 (10 percent). String and NULL handling 94 (11 percent). Date and time 78 (9 percent). Gap-and-island 64 (8 percent). Each topic has Easy, Medium, and Hard tiers with explicit difficulty calibration. The data engineer who finishes the Easy and Medium tiers in each topic typically reaches phone-screen fluency at around 60 to 80 total problems solved.

Five bugs the 10-seed grader catches that single-fixture SQL trainers let through. A hard-coded value: the query references a specific customer_id or order_id observed in the sample data, passes one seed, and fails the other nine. LIMIT without ORDER BY: Postgres does not guarantee row order without an ORDER BY, so LIMIT 1 picks a different row on each seed. An INNER JOIN that should be LEFT: it drops rows where the right side has no match, works on the seed where every customer has an order, and fails the seed where some do not. COUNT(col) versus COUNT(*): different answers when col is nullable, easy to write, hard to spot, and an interviewer's favorite trap. A window function without a tiebreaker: ROW_NUMBER OVER (ORDER BY ts DESC) is non-deterministic when two rows share the same ts, fixed by adding an event_id DESC tiebreaker. Grader characteristics for the SQL practice surface: 10 randomized seeds per submit that a query must pass in full to be accepted, sub-800ms cold-start latency with hot reruns under 200ms, and roughly 85 percent dialect portability so the same query runs in Snowflake, BigQuery, Redshift, and MySQL.

## Frequently asked questions

### Do I need to install Postgres to use these SQL practice problems?

No. The editor runs in the browser. Submissions go to a real Postgres 16 process server-side. Nothing is installed locally. Any recent browser works.

### Why does the SQL grader use 10 seeds instead of 1?

A single-fixture grader rewards a query that produces the expected rows by coincidence. Hard-coded identifiers, LIMIT 1 without an ORDER BY tiebreaker, LEFT JOIN that should have been INNER, accidental Cartesian explosion from a many-to-many bridge. Replaying against 10 seeds with engineered ties, NULL distributions, and cardinality skew makes those coincidences vanish. The way a senior data engineer asks 'what about NULL' or 'what about ties' as a matter of habit.

### Can I run ad-hoc SELECTs against the seed data before submitting?

Yes. Every SQL practice problem has a scratchpad mode that runs against the same seed Postgres instance the grader uses. SELECT against the tables, eyeball the rows, find the NULL distribution before writing the final answer. The scratchpad does not count against any submission limit. There is no submission limit.

### How do I see the query plan after a SQL submission passes?

The EXPLAIN button on every problem shows the Postgres plan with row counts, cost estimates, and the chosen join algorithm. The reference solution's plan is viewable side by side so a data engineer can compare strategies. Useful for the 'why is this slow' question in optimization rounds at L5 and above.

### Are these SQL practice problems free?

Yes. 905 problems. No daily cap. No signup gate to read or submit. Account is optional and only saves progress across devices.

### What is the topic distribution of the SQL practice catalog?

JOINs are 20 percent (168 problems), aggregation 17 percent (142), window functions 15 percent (124), CTEs 11 percent (96), string and NULL 11 percent (94), subqueries 10 percent (88), date and time 9 percent (78), gap-and-island 8 percent (64). Each topic has Easy, Medium, and Hard tiers.

### How is this different from sql-practice.com or SQLZoo?

sql-practice.com and SQLZoo use SQLite or MySQL 5.7-mode with single-fixture grading. This catalog uses Postgres 16 with 10-seed grading. sql-practice.com has no EXPLAIN view. SQLZoo has no pinned schema panel. Hard-tier problems and most company tags here are free; LeetCode and HackerRank gate them behind Premium.

### Can I practice SQL online without installing anything?

Yes. No local setup, no IDE, no Docker, no database to seed. The editor runs in the browser tab against a managed Postgres 16 instance and returns results in under a second. Practice SQL online from any recent browser; an account is optional and only saves progress across devices.

## How a data engineer uses the SQL practice editor

Four editor interactions that drive the highest first-submission success rate in submission analytics.

### Step 1: Read the schema panel first

Table names, column types, primary keys, three sample rows per table stay pinned next to the editor. Identify the grain (one row per X) and the natural key before typing SELECT.

### Step 2: Stage exploratory SELECTs in the scratchpad

Per-problem scratchpad runs against the same seed Postgres as the grader. Find the NULL distribution, the cardinality of each join key, the date range. There is no submission limit, so explore freely.

### Step 3: Submit and read the per-seed log

Submission returns pass-or-fail per seed with row counts and the specific diff for failing seeds. A 7-of-10 pass with 'row order diverged' on one seed means a missing ORDER BY tiebreaker. A 6-of-10 with 'expected 12 got 11' on a NULL-heavy seed means a JOIN dropped a NULL row.

### Step 4: Open EXPLAIN after passing

Postgres plan with row estimates, costs, and join algorithm. Compare side by side with the reference solution's plan to see how the optimizer routed the query versus the canonical one. Identify sequential scans where index seeks were expected.

## Related practice catalogs

- [SQL interview questions catalog](https://datadriven.io/sql-interview-questions): Same 905 problems organized by interview-frequency tag for data engineer prep.
- [SQL practice online with browser editor](https://datadriven.io/sql-practice-online): The editor surface: schema panel, EXPLAIN button, scratch queries, autocomplete.
- [Window function practice problems](https://datadriven.io/sql-window-functions-interview-questions): 124 window-function problems with frame-clause traps engineered into the seeds.
- [SQL JOIN practice problems](https://datadriven.io/sql-joins-interview-questions): 168 JOIN problems including the many-to-many duplication trap that inflates SUM and COUNT.
- [SQL aggregation practice problems](https://datadriven.io/sql-aggregation-interview-questions): 142 GROUP BY problems with NULL-engineered seeds that catch the AVG-of-CASE bug.
- [Advanced SQL practice for senior data engineer roles](https://datadriven.io/advanced-sql-interview-questions): Recursive CTEs, gap-and-island, sessionization, EXPLAIN reading, skew handling.
- [SQL practice problems with worked solutions](https://datadriven.io/sql-practice-problems-with-solutions): Twenty worked solutions with the why behind each query and the failure modes.
- [Python practice problems for data engineer rounds](https://datadriven.io/python-practice-problems): Pipeline-shaped Python pairs with SQL practice for a full data engineer prep cycle.

---

Source: DataDriven (https://datadriven.io). 100% free data engineering interview prep. Live code execution against Postgres 16, Python 3.11, and Spark sandboxes. No paywall, no premium tier, no signup gate.