SQL Practice Problems

1,500 SQL practice problems with worked solutions, graded against a real Postgres warehouse. Sourced from verified data engineering interview reports. 16 topic catalogs, 4 difficulty bands.

1,500 SQL practice problems scored against a real Postgres 16 instance with randomized seed data on every submission. Every problem traces back to a verified data engineering interview report. Pick a topic below or open a random problem to start.

Prepare for the interview
01 / Open invite
02min.

Know the patterns before the interviewer asks them.

a SQL query, the same shape a screen would give you.
The diff against expected. Where ties broke. What you missed.
sandbox
1SELECT user_id,
2 COUNT(*) AS sessions
3FROM events
4WHERE ts >= NOW() - INTERVAL '7 day'
5
Execute your solution0.4s avg.
MicrosoftInterview question
Solve a problem

How often each topic shows up in a real SQL round

Bars show how often each topic appears across verified interview reports. The first four cover most of every round. The bottom topics are senior-signal territory: rare enough that volunteering them lands well, common enough at L5 and above that you can't skip them.

  1. 01GROUP BY and aggregation
    142 problemsEasy → Medium
  2. 02Joins (INNER, LEFT, anti, self)
    168 problemsEasy → Hard
  3. 03Window functions
    124 problemsMedium → Hard
  4. 04CTEs and recursive queries
    96 problemsMedium → Hard
  5. 05Date and time
    58 problemsMedium
  6. 06NULL handling (COALESCE, NULLIF)
    52 problemsEasy → Medium
  7. 07Subqueries and EXISTS
    88 problemsMedium → Hard
  8. 08CASE WHEN and conditional logic
    64 problemsEasy → Medium
  9. 09Deduplication patterns
    32 problemsMedium
  10. 10RANK, DENSE_RANK, ROW_NUMBER
    48 problemsMedium
  11. 11Self joins and inequality joins
    42 problemsMedium → Hard
  12. 12PARTITION BY patterns
    56 problemsMedium → Hard
  13. 13Gaps and islands
    38 problemsHard
  14. 14Pivot and unpivot
    36 problemsMedium → Hard
  15. 15String and regex
    28 problemsEasy → Medium
  16. 16Recursive CTEs
    22 problemsHard

Why a real database changes how you practice

LeetCode and most online SQL trainers check your output by string-matching against a single reference dataset. You can pass by reading the expected output and writing SQL that happens to produce those rows. That's not how a real interview assesses you, and it's not how this grader works.

Every submission spins up a fresh Postgres database with randomized seed data inside the problem's schema. A query that hard-codes a customer_id, depends on insertion order, or uses LIMIT without an ORDER BY tiebreaker passes one or two seeds and fails the rest. The feedback names which seed broke and what changed between it and the seeds that passed.

NULL handling is the other thing this catches. A naive COUNT(col) versus COUNT(*), an INNER JOIN that should have been LEFT, a comparison against NULL that silently returns UNKNOWN. The seeds carry NULLs in the columns where interviewers know to test.

What a scored problem looks like

Medium difficulty, window function with dedup. Roughly one in three real SQL rounds opens with a problem of this shape.

Prompt: Find users whose third login fell within 30 days of signup. The grader runs your query against ten random seed datasets; rows must match for every one. ORDER BY is checked. NULLs in login_at trip the input validator before scoring begins.

-- Find users whose third login was within 30 days of signup.
-- Schema: users(user_id, signup_at), logins(user_id, login_at)

WITH ranked AS (
  SELECT
    user_id,
    login_at,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_at) AS n
  FROM logins
)
SELECT u.user_id
FROM users u
JOIN ranked r ON r.user_id = u.user_id AND r.n = 3
WHERE r.login_at <= u.signup_at + INTERVAL '30 days';

The common wrong answer uses LIMIT 1 OFFSET 2 inside a per-user subquery. It passes one fixture and fails the other nine because LIMIT without ORDER BY is non-deterministic in Postgres.

How the adaptive engine picks the next problem

After five or six problems, the engine knows which topic-difficulty cells you've cleared. The next problem is sampled from the edge: the topic you most recently got wrong, at the difficulty band one rung below where you stalled.

There's no streak counter, no XP, no daily quest. The mechanic borrows from the same spaced-repetition research that runs Anki: a problem you missed three weeks ago surfaces again before you've forgotten the fix, and a topic you've nailed three times in a row drops out of rotation for a while.

If you want to override, set the topic and difficulty filters explicitly. The engine logs that choice and treats it as signal too.

Prepare for the interview
03 / From the bank03 of many
03hand-picked.

Top Active Senders per Channel

Medium26 min

Top three messages per channel by replies.

Pulled from debriefs where SQL was the gate.

How the grader actually runs your query

3 numbers worth knowing before you submit anything.

10
Random seeds per submit
Your query must produce the right rows for every one.
<800ms
Cold-start latency
Hot reruns are closer to 200ms.
85%
Dialect-portable
Runs the same in Snowflake, BigQuery, Redshift, and MySQL.

Dialect notes

The grader runs Postgres 16. About 85 percent of the bank uses ANSI-portable syntax that runs the same in Snowflake, BigQuery, Redshift, and MySQL. Postgres-specific features (FILTER on aggregates, RANGE BETWEEN window frames, lateral joins, DISTINCT ON) are tagged on the problem so you can skip them or use them as a chance to learn.

If you're targeting Spark SQL or BigQuery and want to stay dialect-pure, the 'dialect-clean' toggle at the top of /problems narrows the bank to portable syntax only.

How to mix your practice time

What works for candidates who pass: 80 to 150 problems total, distributed by difficulty rather than topic. The bars show the recommended share of your time, not the share of problems.

  1. Easy
    30 problems

    Build fluency with the grader and the schema panel. Get GROUP BY and basic JOIN to reflex.

  2. Medium
    60 problems

    Distributed across joins, window functions, GROUP BY with HAVING, CTEs, conditional aggregation. Most of your time goes here.

  3. Hard
    20 problems

    Weighted toward whichever topic the company you're targeting tends to ask. Saved for the final week.

  4. Mock interviews
    10 sessions

    Timed, vague prompts, AI follow-ups. The practice that closes the gap between knowing the answer and saying it out loud.

Common questions

Are these SQL practice problems really free?+
Yes. Every problem, the grader, the worked solutions. The catalog isn't paywalled and never has been. Sign-in is optional and only saves your progress across devices.
How is this different from LeetCode SQL?+
LeetCode SQL leans on puzzle-style problems with one fixture and clever-trick solutions. The bank here is built from data engineering interview debriefs: deduping a Type 2 SCD, sessionizing events, late-arriving data, the kinds of queries that show up on the job. The grader runs against ten randomized fixtures, which catches the lucky-fixture solutions LeetCode lets through.
Can I practice SQL online for free without signup?+
Yes. The whole catalog is open without an account. Submit queries, read the grader output, read the worked solution. Sign-in adds saved progress and unlocks the adaptive engine that picks your next problem.
What SQL dialect should I expect?+
Postgres 16. About 85 percent of the bank is portable to Snowflake, BigQuery, Redshift, and MySQL. Postgres-specific syntax (FILTER, RANGE BETWEEN, lateral joins, DISTINCT ON) is tagged. The dialect filter narrows to portable SQL only.
What are the best SQL practice problems for interview prep?+
The ones that match the shape of the questions you'll actually be asked. Based on real interview reports, the four topics worth spending most of your time on are window functions, joins (especially the duplication-from-many-to-many trap), GROUP BY with HAVING, and CTEs. Those four make up the bulk of the verified-report distribution.
How many SQL problems should I solve before a loop?+
80 to 150 if you're solving them properly. A problem you wrote, ran, debugged, and could re-derive a week later is worth 10 you skimmed. The signal you're ready isn't a count. It's writing a Medium window-function problem in under 12 minutes without looking up syntax.
Are problems sourced from real interviews?+
Every problem traces back to at least one verified interview report. The catalog is built from 2,817 reports across 920 companies. Specific company names are tagged on the problem only when the report explicitly cited the company.
02 / Why practice

Pick the topic you've been avoiding

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition