SQL Practice Online
A browser tab, an editor, a schema panel, and a Postgres 16 instance behind a submit button. That's the whole product. The catalog is 854 SQL problems pulled from data engineering interview write-ups, and the grader replays your query against 10 different seeded versions of the schema, so a query that hard-codes IDs or relies on insert order fails the same way it would in a code review.
A browser tab, an editor, a schema panel, and a Postgres 16 instance behind a submit button. That's the whole product. The catalog is 854 SQL problems pulled from data engineering interview write-ups, and the grader replays your query against 10 different seeded versions of the schema, so a query that hard-codes IDs or relies on insert order fails the same way it would in a code review.
Know the patterns before the interviewer asks them.
What 10-seed grading actually catches
The replay log below is from a real submission. 2 seeds failed for different reasons, both of which would pass a single-fixture grader.
A practice grader that checks output against a single fixed dataset rewards a query that produces the expected rows by coincidence. The most common form is hard-coding an identifier the candidate saw in the schema panel. The second-most-common form is LIMIT 1 without an ORDER BY tiebreaker, which works on 1 fixture and fails the next.
Replaying the same query against 10 randomized seeds reproduces the conditions a reviewer applies in their head: does this work on data you haven't seen? The submission log below is what a typical replay looks like on a buggy query.
submit @ 2026-05-26T14:09:21Z seed_0 PASS 18 rows 85 ms seed_1 PASS 16 rows 87 ms seed_2 FAIL expected 12, got 11 NULL row missed at line 7 seed_3 PASS 20 rows 91 ms seed_4 FAIL expected 9, got 9 row order diverged (no ORDER BY tiebreaker) seed_5 PASS 17 rows 84 ms seed_6 PASS 15 rows 86 ms seed_7 FAIL expected 14, got 13 LEFT JOIN should have been INNER seed_8 PASS 21 rows 98 ms seed_9 PASS 16 rows 90 ms verdict: 7/10 pass. fix the LEFT/INNER and the missing tiebreaker, resubmit.
Side by side: a lucky-fixture answer and a replay-safe answer
-- Prompt: return the customer with the highest lifetime order total.
-- Schema: customers(customer_id, name), orders(order_id, customer_id, amount)
-- A common wrong answer that passes a single-fixture grader.
-- It happens to return 1 row on the visible test data, but the moment
-- 2 customers tie at the top, it returns nondeterministic results.
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY SUM(o.amount) DESC
LIMIT 1;
-- The replayed seeds engineer a tie. The query above returns "Alice"
-- on 1 seed and "Aliyah" on another. The error log:
-- seed_3: expected ["Alice","Aliyah"], got ["Aliyah"]
-- seed_7: expected ["Alice","Aliyah"], got ["Alice"]
-- The version that survives every seed handles the tie explicitly.
WITH totals AS (
SELECT c.customer_id, c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
),
top_total AS (SELECT MAX(total) AS top FROM totals)
SELECT t.name
FROM totals t
JOIN top_total ON t.total = top_total.top
ORDER BY t.name;The bug is silent on 1 fixture and obvious on 10. The replay-safe version takes 3 more lines and survives every seed.
Topic distribution in the catalog
8 topics, 854 problems, with Easy/Medium/Hard splits. The percent column is share of total catalog; the count column is total problems in that topic.
INNER, LEFT, anti, self, inequality. The duplication trap in many-to-many is the highest single source of wrong answers in submission logs.
GROUP BY, HAVING, conditional aggregation, COUNT(*) vs COUNT(col). NULL semantics show up in ~40% of these.
ROW_NUMBER for dedup, LAG/LEAD for deltas, SUM OVER for running totals. Frame clause confusion is the top failure mode.
Multi-step transformations and recursive CTEs for hierarchies. The 'just use a subquery' antipattern is graded down.
Correlated, semi-joins via EXISTS, the NOT IN with NULLs trap. Performance bands matter on a few of these.
DATE_TRUNC, intervals, fiscal-period bucketing, sessionization. Timezone correctness checked on 3 of the harder ones.
Streaks, runs, interval merges. The pattern that separates candidates who know window functions from candidates who only know GROUP BY.
3-valued logic, COALESCE chains, parsing log lines. Easy in isolation, brutal when combined with the topics above.
5 patterns that cover most interview SQL
Recognizing one of these in the first 30 seconds of reading a prompt is what separates candidates who finish on time from candidates who don't.
Rewriting ORDER BY ... LIMIT 3 as ROW_NUMBER() OVER (PARTITION BY ...) <= 3. The minute you grasp this rewrite, half the Hard tier becomes Medium.
ROW_NUMBER() OVER (PARTITION BY natural_key ORDER BY updated_at DESC, id) and filter rn = 1. The tiebreaker after updated_at is the part that survives the replay.
Generate a group key with date - ROW_NUMBER() * INTERVAL '1 day', then GROUP BY that key to find streaks. Sounds esoteric until you see it once; afterwards it's the obvious answer to a dozen questions.
LAG() the previous event timestamp, compare against a threshold, accumulate a session ID with SUM() OVER, GROUP BY. 3 CTEs, 1 query, no joins.
Joining a fact at event_time to a Type 2 dimension on effective_from <= event_time AND (effective_to IS NULL OR event_time < effective_to). The half-open interval is the part that interviewers test.
Postgres syntax vs. the rest
What ports cleanly and what doesn't between Postgres, Snowflake, BigQuery, Redshift, and MySQL. Tagged on each problem so you can opt in or out.
| Construct | Postgres | Snowflake | BigQuery | Redshift | MySQL 8 |
|---|---|---|---|---|---|
| Standard SELECT, JOIN, GROUP BY | Portable | Portable | Portable | Portable | Portable |
| Window functions (PARTITION BY, ORDER BY) | Portable | Portable | Portable | Portable | Portable |
| CTEs (WITH clause) | Portable | Portable | Portable | Portable | Portable since 8.0 |
| Recursive CTEs | Portable | Portable | Portable | Portable | Portable since 8.0 |
| FILTER (WHERE ...) on aggregates | Native | Polyfill via CASE | Polyfill via CASE | Native | Polyfill via CASE |
| DISTINCT ON (...) | Native | Not supported | Not supported | Not supported | Not supported |
| RANGE BETWEEN INTERVAL | Native | Native | Native | Limited | Limited |
| LATERAL joins | Native | Equivalent: TABLE(...) | UNNEST | CROSS APPLY analog | Limited |
| NULLS FIRST / NULLS LAST | Native | Native | Native | Workaround | Workaround |
| QUALIFY clause | Not supported | Native | Native | Native | Not supported |
4 editor features you'll actually use
Not a marketing list. These are the 4 interactions that show up most in submission analytics.
Schema panel pinned to the editor
Table names, column types, primary keys, and 3 sample rows per table stay visible while you type. The editor and the schema share the viewport; you never tab away. The autocomplete pulls from the same source, so orders.cust_id completes even before you've scrolled the schema.
EXPLAIN button on every problem
After a passing submission, the EXPLAIN button shows the Postgres query plan with rows, cost estimates, and join algorithm. Useful for the "why is this slow" question that comes up in onsite rounds. You can compare your plan to the reference solution's plan side by side.
2 output modes
Default is row-by-row diff against the expected output. The alternate is full table view with sortable columns, useful when you're working through a problem and want to inspect intermediate state. Toggle with Cmd-D.
Save scratch queries per problem
The editor keeps a per-problem scratchpad separate from the submission. Stage exploratory SELECTs against the seed data before you write the final answer, the way you would against a real warehouse in production.
How this editor compares to the alternatives
Specific capabilities, not slogans. Pricing reflects May 2026 public tiers.
| Capability | DataDriven | sql-practice.com | LeetCode SQL | HackerRank SQL | SQLZoo |
|---|---|---|---|---|---|
| Dialect | Postgres 16 | SQLite | MySQL 5.7 mode | MySQL | MySQL / Oracle |
| Multiple-seed grading | Yes (10) | Single fixed | Single fixed | Single fixed | Single fixed |
| EXPLAIN / query plan view | Yes | No | No | Limited | No |
| Schema panel pinned to editor | Yes | Sidebar tab | Below editor | Below problem | Hidden by default |
| NULL-engineered fixtures | On every relevant problem | Sometimes | Rare | Sometimes | Rare |
| Company-tagged questions | On 312 problems | None | Some (paid) | On most (paid) | None |
| Account required to submit | No | No | Yes | Yes | No |
| Hard tier behind paywall | No | No | Yes | Yes | No |
Questions people actually ask
Do I need to install Postgres to practice SQL here?+
Why 10 seeds per submission instead of 1?+
What dialect should I learn if my target company uses Snowflake or BigQuery?+
Can I run my own ad-hoc SELECTs against the seed data?+
How is this different from LeetCode SQL or HackerRank SQL?+
Are the company tags accurate?+
How long does it take to get fluent enough for a phone screen?+
Open the editor and write your first query
- 01
Active recall beats re-reading by 50%
Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom
- 02
76% of hiring managers reject on the coding task, not the resume
From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice
- 03
Five problem shapes cover 80% of data engineer loops
Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition