SQL Practice Online

A browser tab, an editor, a schema panel, and a Postgres 16 instance behind a submit button. That's the whole product. The catalog is 854 SQL problems pulled from data engineering interview write-ups, and the grader replays your query against 10 different seeded versions of the schema, so a query that hard-codes IDs or relies on insert order fails the same way it would in a code review.

A browser tab, an editor, a schema panel, and a Postgres 16 instance behind a submit button. That's the whole product. The catalog is 854 SQL problems pulled from data engineering interview write-ups, and the grader replays your query against 10 different seeded versions of the schema, so a query that hard-codes IDs or relies on insert order fails the same way it would in a code review.

Prepare for the interview
01 / Open invite
02min.

Know the patterns before the interviewer asks them.

a SQL query, the same shape a screen would give you.
The diff against expected. Where ties broke. What you missed.
sandbox
1SELECT user_id,
2 COUNT(*) AS sessions
3FROM events
4WHERE ts >= NOW() - INTERVAL '7 day'
5
Execute your solution0.4s avg.
MicrosoftInterview question
Solve a problem
854
Graded problems
10
Seeds per submission
Postgres 16
Live engine
~1.4s
Median submit-to-result

What 10-seed grading actually catches

The replay log below is from a real submission. 2 seeds failed for different reasons, both of which would pass a single-fixture grader.

A practice grader that checks output against a single fixed dataset rewards a query that produces the expected rows by coincidence. The most common form is hard-coding an identifier the candidate saw in the schema panel. The second-most-common form is LIMIT 1 without an ORDER BY tiebreaker, which works on 1 fixture and fails the next.

Replaying the same query against 10 randomized seeds reproduces the conditions a reviewer applies in their head: does this work on data you haven't seen? The submission log below is what a typical replay looks like on a buggy query.

submit @ 2026-05-26T14:09:21Z
  seed_0  PASS  18 rows         85 ms
  seed_1  PASS  16 rows         87 ms
  seed_2  FAIL  expected 12, got 11    NULL row missed at line 7
  seed_3  PASS  20 rows         91 ms
  seed_4  FAIL  expected 9,  got 9     row order diverged (no ORDER BY tiebreaker)
  seed_5  PASS  17 rows         84 ms
  seed_6  PASS  15 rows         86 ms
  seed_7  FAIL  expected 14, got 13    LEFT JOIN should have been INNER
  seed_8  PASS  21 rows         98 ms
  seed_9  PASS  16 rows         90 ms

verdict: 7/10 pass. fix the LEFT/INNER and the missing tiebreaker, resubmit.

Side by side: a lucky-fixture answer and a replay-safe answer

-- Prompt: return the customer with the highest lifetime order total.
-- Schema: customers(customer_id, name), orders(order_id, customer_id, amount)

-- A common wrong answer that passes a single-fixture grader.
-- It happens to return 1 row on the visible test data, but the moment
-- 2 customers tie at the top, it returns nondeterministic results.
SELECT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY SUM(o.amount) DESC
LIMIT 1;

-- The replayed seeds engineer a tie. The query above returns "Alice"
-- on 1 seed and "Aliyah" on another. The error log:
--   seed_3: expected ["Alice","Aliyah"], got ["Aliyah"]
--   seed_7: expected ["Alice","Aliyah"], got ["Alice"]

-- The version that survives every seed handles the tie explicitly.
WITH totals AS (
  SELECT c.customer_id, c.name, SUM(o.amount) AS total
  FROM customers c
  JOIN orders o ON o.customer_id = c.customer_id
  GROUP BY c.customer_id, c.name
),
top_total AS (SELECT MAX(total) AS top FROM totals)
SELECT t.name
FROM totals t
JOIN top_total ON t.total = top_total.top
ORDER BY t.name;

The bug is silent on 1 fixture and obvious on 10. The replay-safe version takes 3 more lines and survives every seed.

Topic distribution in the catalog

8 topics, 854 problems, with Easy/Medium/Hard splits. The percent column is share of total catalog; the count column is total problems in that topic.

JOINs168 problems · 20%
12E·28M·13H

INNER, LEFT, anti, self, inequality. The duplication trap in many-to-many is the highest single source of wrong answers in submission logs.

Aggregation142 problems · 17%
15E·22M·9H

GROUP BY, HAVING, conditional aggregation, COUNT(*) vs COUNT(col). NULL semantics show up in ~40% of these.

Window functions124 problems · 15%
6E·24M·18H

ROW_NUMBER for dedup, LAG/LEAD for deltas, SUM OVER for running totals. Frame clause confusion is the top failure mode.

CTEs96 problems · 11%
4E·14M·10H

Multi-step transformations and recursive CTEs for hierarchies. The 'just use a subquery' antipattern is graded down.

Subqueries88 problems · 10%
5E·12M·6H

Correlated, semi-joins via EXISTS, the NOT IN with NULLs trap. Performance bands matter on a few of these.

Date and time78 problems · 9%
6E·11M·4H

DATE_TRUNC, intervals, fiscal-period bucketing, sessionization. Timezone correctness checked on 3 of the harder ones.

Gaps and islands64 problems · 8%
1E·8M·11H

Streaks, runs, interval merges. The pattern that separates candidates who know window functions from candidates who only know GROUP BY.

String and NULL handling94 problems · 11%
14E·11M·2H

3-valued logic, COALESCE chains, parsing log lines. Easy in isolation, brutal when combined with the topics above.

5 patterns that cover most interview SQL

Recognizing one of these in the first 30 seconds of reading a prompt is what separates candidates who finish on time from candidates who don't.

Pattern: top-N per group

Rewriting ORDER BY ... LIMIT 3 as ROW_NUMBER() OVER (PARTITION BY ...) <= 3. The minute you grasp this rewrite, half the Hard tier becomes Medium.

Pattern: deduplicate by natural key

ROW_NUMBER() OVER (PARTITION BY natural_key ORDER BY updated_at DESC, id) and filter rn = 1. The tiebreaker after updated_at is the part that survives the replay.

Pattern: gap and island

Generate a group key with date - ROW_NUMBER() * INTERVAL '1 day', then GROUP BY that key to find streaks. Sounds esoteric until you see it once; afterwards it's the obvious answer to a dozen questions.

Pattern: sessionization

LAG() the previous event timestamp, compare against a threshold, accumulate a session ID with SUM() OVER, GROUP BY. 3 CTEs, 1 query, no joins.

Pattern: slowly changing dimension lookups

Joining a fact at event_time to a Type 2 dimension on effective_from <= event_time AND (effective_to IS NULL OR event_time < effective_to). The half-open interval is the part that interviewers test.

Postgres syntax vs. the rest

What ports cleanly and what doesn't between Postgres, Snowflake, BigQuery, Redshift, and MySQL. Tagged on each problem so you can opt in or out.

ConstructPostgresSnowflakeBigQueryRedshiftMySQL 8
Standard SELECT, JOIN, GROUP BYPortablePortablePortablePortablePortable
Window functions (PARTITION BY, ORDER BY)PortablePortablePortablePortablePortable
CTEs (WITH clause)PortablePortablePortablePortablePortable since 8.0
Recursive CTEsPortablePortablePortablePortablePortable since 8.0
FILTER (WHERE ...) on aggregatesNativePolyfill via CASEPolyfill via CASENativePolyfill via CASE
DISTINCT ON (...)NativeNot supportedNot supportedNot supportedNot supported
RANGE BETWEEN INTERVALNativeNativeNativeLimitedLimited
LATERAL joinsNativeEquivalent: TABLE(...)UNNESTCROSS APPLY analogLimited
NULLS FIRST / NULLS LASTNativeNativeNativeWorkaroundWorkaround
QUALIFY clauseNot supportedNativeNativeNativeNot supported

4 editor features you'll actually use

Not a marketing list. These are the 4 interactions that show up most in submission analytics.

Schema panel pinned to the editor

Table names, column types, primary keys, and 3 sample rows per table stay visible while you type. The editor and the schema share the viewport; you never tab away. The autocomplete pulls from the same source, so orders.cust_id completes even before you've scrolled the schema.

EXPLAIN button on every problem

After a passing submission, the EXPLAIN button shows the Postgres query plan with rows, cost estimates, and join algorithm. Useful for the "why is this slow" question that comes up in onsite rounds. You can compare your plan to the reference solution's plan side by side.

2 output modes

Default is row-by-row diff against the expected output. The alternate is full table view with sortable columns, useful when you're working through a problem and want to inspect intermediate state. Toggle with Cmd-D.

Save scratch queries per problem

The editor keeps a per-problem scratchpad separate from the submission. Stage exploratory SELECTs against the seed data before you write the final answer, the way you would against a real warehouse in production.

How this editor compares to the alternatives

Specific capabilities, not slogans. Pricing reflects May 2026 public tiers.

CapabilityDataDrivensql-practice.comLeetCode SQLHackerRank SQLSQLZoo
DialectPostgres 16SQLiteMySQL 5.7 modeMySQLMySQL / Oracle
Multiple-seed gradingYes (10)Single fixedSingle fixedSingle fixedSingle fixed
EXPLAIN / query plan viewYesNoNoLimitedNo
Schema panel pinned to editorYesSidebar tabBelow editorBelow problemHidden by default
NULL-engineered fixturesOn every relevant problemSometimesRareSometimesRare
Company-tagged questionsOn 312 problemsNoneSome (paid)On most (paid)None
Account required to submitNoNoYesYesNo
Hard tier behind paywallNoNoYesYesNo

Questions people actually ask

Do I need to install Postgres to practice SQL here?+
No. The editor and the database both live in the browser. Submitting a query sends it to a real Postgres 16 process server-side and returns the result rows. Nothing is installed locally and nothing about your environment matters beyond a recent browser.
Why 10 seeds per submission instead of 1?+
Single-fixture grading rewards lucky answers. If the expected output has 8 rows and your buggy query happens to return those same 8 rows on the only fixture the grader checks, you pass without learning anything. 10 seeds make that coincidence vanishingly unlikely, the same way a code reviewer asks 'what about NULL?' or 'what about ties?' as a matter of habit.
What dialect should I learn if my target company uses Snowflake or BigQuery?+
Practice in Postgres 16 here. About 85 percent of problems are dialect-portable; the Postgres-specific syntax (FILTER, DISTINCT ON, RANGE BETWEEN INTERVAL, lateral joins) is tagged on each problem so you can skip it or treat it as new material. The window functions, CTEs, and aggregation patterns that dominate interview SQL behave identically across Snowflake, BigQuery, Redshift, and Postgres.
Can I run my own ad-hoc SELECTs against the seed data?+
Yes. Every problem has a scratchpad mode that runs against the same seed Postgres instance the grader uses. SELECT against the tables, eyeball the rows, find the NULL distribution before you write the answer. The scratchpad is the closest equivalent to having a warehouse to poke at while you build a real query.
How is this different from LeetCode SQL or HackerRank SQL?+
Different in 3 measurable ways. LeetCode SQL grades against 1 fixed dataset, here it is 10. LeetCode SQL leans on puzzle problems, here the source is pipeline patterns from interview write-ups. LeetCode's Medium and Hard tiers and most company-tagged questions are behind Premium; here the entire 854-problem catalog is open.
Are the company tags accurate?+
Tags appear on a problem only when at least 1 interview write-up explicitly cited the company in the same question shape. The catalog covers 312 company-tagged problems across 76 named employers. If a problem doesn't have a tag, it means the shape is common but no specific report attached it to a particular company.
How long does it take to get fluent enough for a phone screen?+
Most candidates reach phone-screen fluency at 50 to 80 problems solved across the Easy and Medium tiers, distributed across joins, aggregation, and window functions. The bottleneck is rarely volume; it's pattern recognition. The 5 patterns in the section above account for the majority of interview SQL.
02 / Why practice

Open the editor and write your first query

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Practice paths from here