SQL Practice
CASE expressions appear in only 1.8% of SQL interview questions as a standalone topic, but they are embedded in 40%+ of real-world aggregation and window function queries. If you can write GROUP BY but not conditional aggregation with CASE WHEN, you are missing the pattern that shows up most often. Below are five problems covering every variant interviewers test.
Each pattern solves a different class of problems. Aggregation is the single most-tested SQL category in data engineering interviews, and most aggregation questions require CASE WHEN for conditional counting or pivoting. The third and fourth patterns below are the ones that separate strong candidates from the rest.
CASE status WHEN 'A' THEN 'Active' WHEN 'I' THEN 'Inactive' ELSE 'Unknown' END

Compares a single expression against a list of values. Clean and readable when you are matching one column against known constants. The ELSE clause is optional, but always include it: without ELSE, unmatched rows return NULL, which can break downstream logic silently.
Tip: Use simple CASE when every branch compares the same column to a literal value. If you need different columns or operators, switch to searched CASE.
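A minimal runnable sketch of simple CASE using Python's sqlite3 stdlib module; the accounts table and its rows are made up for illustration. It shows both the labeled output and the missing-ELSE pitfall, where an unmatched status silently becomes NULL.

```python
import sqlite3

# Hypothetical accounts table with one status code that no WHEN branch matches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [(1, 'A'), (2, 'I'), (3, 'P')])

# With ELSE: every row gets a label.
with_else = conn.execute("""
    SELECT id, CASE status WHEN 'A' THEN 'Active'
                           WHEN 'I' THEN 'Inactive'
                           ELSE 'Unknown' END
    FROM accounts ORDER BY id
""").fetchall()

# Without ELSE: the unmatched 'P' row silently becomes NULL (None in Python).
no_else = conn.execute("""
    SELECT id, CASE status WHEN 'A' THEN 'Active'
                           WHEN 'I' THEN 'Inactive' END
    FROM accounts ORDER BY id
""").fetchall()

print(with_else)  # [(1, 'Active'), (2, 'Inactive'), (3, 'Unknown')]
print(no_else)    # [(1, 'Active'), (2, 'Inactive'), (3, None)]
```

The NULL from the missing ELSE is exactly the kind of value that later breaks a join condition or a NOT-IN filter without raising any error.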
CASE WHEN salary > 100000 THEN 'High' WHEN salary > 60000 THEN 'Mid' ELSE 'Entry' END

Each WHEN clause has its own boolean expression. More flexible than simple CASE because each condition can reference different columns, use different operators, or combine multiple conditions with AND/OR. Conditions are evaluated in order. The first TRUE branch wins.
Tip: Order matters. Put the most specific condition first: writing WHEN salary > 60000 before WHEN salary > 100000 means the 100K+ bucket never fires, because the 60K+ condition matches first.
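The ordering pitfall from the tip above can be demonstrated directly; this sketch uses sqlite3 with a made-up employees table and runs the same searched CASE with the thresholds in both orders.

```python
import sqlite3

# Hypothetical employees table; names and salaries are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [('ana', 130000), ('bo', 80000), ('cy', 50000)])

# Correct: most specific (highest) threshold first.
good = conn.execute("""
    SELECT name, CASE WHEN salary > 100000 THEN 'High'
                      WHEN salary > 60000 THEN 'Mid'
                      ELSE 'Entry' END
    FROM employees ORDER BY name
""").fetchall()

# Wrong order: salary > 60000 fires first, so the 'High' branch is unreachable.
bad = conn.execute("""
    SELECT name, CASE WHEN salary > 60000 THEN 'Mid'
                      WHEN salary > 100000 THEN 'High'
                      ELSE 'Entry' END
    FROM employees ORDER BY name
""").fetchall()

print(good)  # [('ana', 'High'), ('bo', 'Mid'), ('cy', 'Entry')]
print(bad)   # [('ana', 'Mid'), ('bo', 'Mid'), ('cy', 'Entry')]
```

The 130K employee gets labeled 'Mid' in the second query, and no engine warns about the dead branch.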
SUM(CASE WHEN region = 'US' THEN revenue ELSE 0 END) AS us_revenue

The most powerful CASE WHEN pattern. Wrap a CASE expression inside SUM, COUNT, or AVG to create conditional aggregates. This replaces multiple subqueries, self-joins, or FILTER clauses (which not all engines support). A single pass over the data produces multiple segmented metrics.
Tip: For counting, use SUM(CASE WHEN ... THEN 1 ELSE 0 END) or COUNT(CASE WHEN ... THEN 1 END). Both work. The SUM version is more explicit about what you are counting.
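A small sketch, assuming a made-up orders table, showing that both counting forms from the tip produce the same result alongside a conditional SUM:

```python
import sqlite3

# Hypothetical orders table; rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [('US', 100.0), ('EU', 50.0), ('US', 25.0)])

# One pass produces a conditional sum and two equivalent conditional counts.
row = conn.execute("""
    SELECT SUM(CASE WHEN region = 'US' THEN revenue ELSE 0 END) AS us_revenue,
           SUM(CASE WHEN region = 'US' THEN 1 ELSE 0 END)       AS us_orders_sum,
           COUNT(CASE WHEN region = 'US' THEN 1 END)            AS us_orders_count
    FROM orders
""").fetchone()

print(row)  # (125.0, 2, 2)
```

The COUNT variant works because COUNT ignores NULLs, and a CASE with no ELSE returns NULL for the non-matching rows.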
SELECT user_id, MAX(CASE WHEN key = 'email' THEN value END) AS email, MAX(CASE WHEN key = 'phone' THEN value END) AS phone FROM user_attributes GROUP BY user_id

Turns rows into columns. Given a key-value table, use CASE WHEN inside MAX (or MIN) to extract each key into its own column. This is the manual pivot pattern. It works in every SQL engine, unlike PIVOT syntax which is engine-specific.
Tip: Use MAX or MIN as the outer aggregate. They work because each group has at most one matching value per key. If there could be multiple values, you need to decide which one to keep.
Progressive difficulty. The first problem tests basic syntax. The last combines CASE WHEN with percentile calculations and CTEs.
Given an employees table, add a column that labels each employee as 'Senior' (salary >= 120000), 'Mid' (salary >= 70000), or 'Junior' (below 70000). Use a searched CASE expression. Order the thresholds correctly so every row gets the right label.
Skills tested: Searched CASE, threshold ordering, column aliasing
Given an orders table with region and amount, produce a single row showing total_revenue, us_revenue, eu_revenue, and apac_revenue. Use CASE WHEN inside SUM for each region. No subqueries, no self-joins, one pass over the data.
Skills tested: CASE inside SUM, conditional aggregation, single-pass computation
Given a user_attributes table with columns (user_id, attribute_key, attribute_value), pivot it so each row has user_id, email, phone, and city as separate columns. Handle users who are missing some attributes (they should show NULL, not be excluded).
Skills tested: CASE for pivoting, MAX aggregation, GROUP BY, NULL handling
Given a user_events table with user_id and event_type (values: page_view, signup, checkout, purchase), calculate: total users at each stage, and the drop-off percentage between consecutive stages. Use CASE WHEN inside COUNT(DISTINCT) for each stage. Compute percentages as calculated columns.
Skills tested: CASE inside COUNT(DISTINCT), calculated columns, funnel analysis
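A sketch of the stage-counting half of this funnel, assuming a made-up user_events table: CASE WHEN inside COUNT(DISTINCT) counts each user once per stage even when events repeat.

```python
import sqlite3

# Hypothetical event log; user 1 has a duplicate page_view to show
# that DISTINCT still counts them once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_events (user_id INTEGER, event_type TEXT)")
conn.executemany("INSERT INTO user_events VALUES (?, ?)", [
    (1, 'page_view'), (1, 'signup'), (1, 'checkout'),
    (2, 'page_view'), (2, 'signup'),
    (3, 'page_view'),
    (1, 'page_view'),
])

viewers, signups, checkouts = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN user_id END),
           COUNT(DISTINCT CASE WHEN event_type = 'signup'    THEN user_id END),
           COUNT(DISTINCT CASE WHEN event_type = 'checkout'  THEN user_id END)
    FROM user_events
""").fetchone()

# Drop-off between the first two stages, computed as a percentage.
signup_dropoff = round(100 * (viewers - signups) / viewers, 1)

print(viewers, signups, checkouts, signup_dropoff)  # 3 2 1 33.3
```

The CASE returns the user_id for matching events and NULL otherwise, and COUNT(DISTINCT ...) ignores the NULLs, so each expression counts distinct users at one stage.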
Given a transactions table, create spend buckets based on the overall distribution: bottom 25% of spenders as 'Low', middle 50% as 'Medium', top 25% as 'High'. First compute the 25th and 75th percentile values, then use CASE WHEN to assign each user to a bucket. This requires a CTE or subquery for the percentile thresholds, then CASE WHEN for the labeling.
Skills tested: CASE with dynamic thresholds, percentile computation, CTE + CASE WHEN combination
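A sketch of the CTE-then-CASE shape this problem calls for. SQLite (used here via the sqlite3 stdlib module) has no PERCENTILE_CONT, so this version approximates the quartile thresholds with the NTILE(4) window function; the transactions data is made up, and engines with real percentile functions would compute the thresholds in the first CTE instead.

```python
import sqlite3

# Hypothetical transactions; eight users with evenly spaced spend totals.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)",
                 [(i, spend) for i, spend in enumerate([10, 20, 30, 40, 50, 60, 70, 80])])

# CTE 1 aggregates per user; CTE 2 assigns quartiles; the final SELECT
# labels quartile 1 as 'Low' (bottom 25%), quartile 4 as 'High' (top 25%),
# and everything between as 'Medium' (middle 50%).
rows = conn.execute("""
    WITH per_user AS (
        SELECT user_id, SUM(amount) AS total_spend
        FROM transactions GROUP BY user_id
    ),
    quartiles AS (
        SELECT user_id, total_spend,
               NTILE(4) OVER (ORDER BY total_spend) AS q
        FROM per_user
    )
    SELECT user_id,
           CASE WHEN q = 1 THEN 'Low'
                WHEN q = 4 THEN 'High'
                ELSE 'Medium' END AS bucket
    FROM quartiles ORDER BY user_id
""").fetchall()

print(rows)
```

With eight users this yields two 'Low', four 'Medium', and two 'High', matching the 25/50/25 split.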
SQL dominates data engineering interviews. These are the specific ways CASE WHEN shows up. Knowing the pattern lets you recognize the question type immediately.
The interviewer gives you a table and asks for multiple metrics in one query. "Show me the number of active users, churned users, and trial users per month." The expected answer uses SUM(CASE WHEN status = ... THEN 1 ELSE 0 END) for each status. If you reach for three separate queries or subqueries, you lose points.
"Replace NULL values in the discount column with 0." COALESCE is the better tool here, but CASE WHEN discount IS NULL THEN 0 ELSE discount END works too. Interviewers sometimes test whether you know both approaches and when COALESCE is insufficient (e.g., when the replacement logic is more complex than a single default).
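Both approaches side by side, as a sketch with a made-up orders table:

```python
import sqlite3

# Hypothetical orders table; row 2 has a NULL discount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, discount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 0.1), (2, None)])

rows = conn.execute("""
    SELECT id,
           COALESCE(discount, 0)                                  AS via_coalesce,
           CASE WHEN discount IS NULL THEN 0 ELSE discount END    AS via_case
    FROM orders ORDER BY id
""").fetchall()

print(rows)  # [(1, 0.1, 0.1), (2, 0, 0)]
```

COALESCE is the idiomatic choice for a simple default; the CASE form earns its keep only once the replacement depends on other columns or conditions.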
"Sort results so that priority = 'urgent' rows come first, then 'high', then 'normal'." The trick: ORDER BY CASE WHEN priority = 'urgent' THEN 1 WHEN priority = 'high' THEN 2 ELSE 3 END. This is a custom sort that does not depend on alphabetical order.
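The custom sort can be verified in a few lines; the tickets table here is made up for illustration.

```python
import sqlite3

# Hypothetical tickets table inserted in a deliberately scrambled order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, priority TEXT)")
conn.executemany("INSERT INTO tickets VALUES (?, ?)",
                 [(1, 'normal'), (2, 'urgent'), (3, 'high')])

# ORDER BY a CASE expression maps each priority to a sort rank.
order = [p for (p,) in conn.execute("""
    SELECT priority FROM tickets
    ORDER BY CASE WHEN priority = 'urgent' THEN 1
                  WHEN priority = 'high'   THEN 2
                  ELSE 3 END
""")]

print(order)  # ['urgent', 'high', 'normal']
```

Alphabetical order would put 'high' first, so the CASE ranking is the only way to express this business ordering without an auxiliary lookup table.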
"Apply a discount: 20% off for orders over $500 from VIP customers, 10% off for orders over $500 from regular customers, 5% off for all other orders over $100, no discount otherwise." This requires nested or multi-condition CASE WHEN. The interviewer watches whether you handle the edge cases and order conditions correctly.
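One way to express those tiers as a single searched CASE, sketched with a made-up orders table; the condition order does the work, so the VIP check only needs to appear in the first branch.

```python
import sqlite3

# Hypothetical orders: (id, amount, is_vip). One row per tier in the prompt.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, is_vip INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 600.0, 1),   # VIP over $500      -> 20% off
    (2, 600.0, 0),   # regular over $500  -> 10% off
    (3, 200.0, 1),   # over $100          -> 5% off (VIP status irrelevant here)
    (4,  50.0, 0),   # no discount
])

# First TRUE branch wins, so >500 AND VIP must come before the bare >500 check.
rows = conn.execute("""
    SELECT id,
           CASE WHEN amount > 500 AND is_vip = 1 THEN 0.20
                WHEN amount > 500 THEN 0.10
                WHEN amount > 100 THEN 0.05
                ELSE 0 END AS discount
    FROM orders ORDER BY id
""").fetchall()

print(rows)  # [(1, 0.2), (2, 0.1), (3, 0.05), (4, 0)]
```

The edge cases the interviewer watches for live in the boundaries: swapping the first two branches would give VIPs only 10%, and an order of exactly $500 or $100 falls through to the next tier.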
Given an orders table with region, order_date, and amount, produce a monthly summary showing total revenue and a per-region breakdown, all in one pass.
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(amount) AS total_revenue,
SUM(CASE WHEN region = 'US' THEN amount ELSE 0 END) AS us_revenue,
SUM(CASE WHEN region = 'EU' THEN amount ELSE 0 END) AS eu_revenue,
SUM(CASE WHEN region = 'APAC' THEN amount ELSE 0 END) AS apac_revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

Each SUM(CASE WHEN ...) acts as a filter inside the aggregate: it adds the amount only when the region matches, otherwise adds 0. This produces a pivot table in a single table scan with no self-joins or subqueries. The pattern works with COUNT, AVG, and MIN/MAX as well.
Expected output
month | total_revenue | us_revenue | eu_revenue | apac_revenue
------------+---------------+------------+------------+-------------
2024-01-01 | 84200.00 | 42100.00 | 28700.00 | 13400.00
2024-02-01 | 91500.00 | 45000.00 | 31200.00 | 15300.00

CASE WHEN is not hard to learn. Using it fluently inside aggregations and pivots under time pressure is the real skill. Practice with real tables and instant result comparison.