SQL · Interview concept

SQL CASE WHEN: Conditional Logic for Data Engineers

CASE WHEN is SQL's if/else. It rewrites values inline, buckets data into tiers, and powers conditional aggregation. Used in nearly every interview SQL problem above beginner level.

Try itCASE WHEN in action

Run this as-is to bucket orders into tiers. Then change the thresholds and re-run.

case_when.sql
Click Run to execute. Edit the code above to experiment.

Overview

What CASE WHEN is and why it matters

CASE WHEN is the only branching primitive SQL has. It returns the first matching branch or the ELSE value. You can place it almost anywhere an expression is allowed: SELECT, WHERE, GROUP BY, ORDER BY, HAVING, and inside aggregates like SUM and COUNT.

Interviewers use CASE WHEN as a filter. A candidate who can only write it in SELECT is a beginner. A candidate who uses it inside SUM to count conditional rows, or inside ORDER BY to build custom sort orders, signals mid-to-senior fluency.

Syntax

CASE WHEN reference

TokenMeaning
CASE WHEN <cond> THEN <val>Condition-style: evaluates each WHEN in order, returns the first matching THEN.
CASE <expr> WHEN <val1> THEN ...Simple form: compares a single expression against a list of values.
ELSE <default>Returned when no WHEN matched. If omitted, no-match returns NULL.
ENDRequired terminator. Omitting it is a common syntax error.

Patterns

How CASE WHEN shows up in practice

Bucketing into tiers

The canonical use: convert a continuous value into labeled buckets. Interviewers love this because it exercises ordered conditionals and GROUP BY.

1SELECT
2 CASE
3 WHEN amount < 50 THEN 'small'
4 WHEN amount < 500 THEN 'medium'
5 ELSE 'large'
6 END AS tier,
7 COUNT(*) AS order_count
8FROM orders
9GROUP BY 1

Interview noteThe branches must be ordered from most-specific to least-specific. The first matching WHEN wins.

Conditional aggregation (SUM of a CASE)

The single most-tested CASE WHEN pattern in interviews. Use SUM(CASE WHEN ... THEN 1 ELSE 0 END) to count rows matching a condition without filtering the query.

1SELECT
2 customer_id,
3 COUNT(*) AS total_orders,
4 SUM(
5 CASE
6 WHEN status = 'refunded' THEN 1
7 ELSE 0
8 END
9 ) AS refunds,
10 SUM(
11 CASE
12 WHEN amount > 500 THEN amount
13 ELSE 0
14 END
15 ) AS big_order_revenue
16FROM orders
17GROUP BY customer_id

Interview noteReplaces self-joins and correlated subqueries. Every senior DE should reach for this pattern automatically.

CASE WHEN with multiple conditions (AND / OR)

A WHEN clause can contain any boolean expression. Combine conditions with AND, OR, and parentheses.

1SELECT
2 order_id,
3 CASE
4 WHEN status = 'completed'
5 AND amount >= 1000 THEN 'vip_completed'
6 WHEN status = 'completed'
7 AND amount < 1000 THEN 'regular_completed'
8 WHEN status IN (
9 'refunded',
10 'disputed'
11 ) THEN 'problem'
12 ELSE 'other'
13 END AS segment
14FROM orders

CASE in ORDER BY for custom sort order

Alphabetical order is rarely what you want for status columns. CASE lets you define the order explicitly.

1SELECT
2 order_id,
3 status
4FROM orders
5ORDER BY CASE status
6 WHEN 'pending' THEN 1
7 WHEN 'completed' THEN 2
8 WHEN 'refunded' THEN 3
9 ELSE 4
10END

CASE with NULL handling

CASE treats NULL as a non-match in equality comparisons. Use IS NULL explicitly or COALESCE to handle it.

1SELECT
2 order_id,
3 CASE
4 WHEN status IS NULL THEN 'unknown'
5 WHEN status = 'completed' THEN 'done'
6 ELSE status
7 END AS clean_status
8FROM orders

Interview noteWHEN status = NULL never matches. It is the most common CASE bug and a frequent interview trap.

Interview questions

CASE WHEN interview questions

Q1.How would you count the number of refunded orders per customer without filtering the result set?

What they testWhether you reach for conditional aggregation (SUM of a CASE) instead of a subquery or self-join. This is the bar for mid-level SQL.

ApproachSELECT customer_id, SUM(CASE WHEN status = 'refunded' THEN 1 ELSE 0 END) AS refund_count FROM orders GROUP BY customer_id. Note SUM of 1/0 is preferred over COUNT because COUNT(NULL) does not count.

Q2.What is the difference between CASE WHEN col = 'x' THEN 1 END and CASE WHEN col = 'x' THEN 1 ELSE 0 END inside a SUM?

What they testNULL arithmetic knowledge. Without ELSE, non-matching rows return NULL. SUM ignores NULLs, so the result is the same, but COUNT would differ.

ApproachSUM(CASE WHEN x THEN 1 END) = SUM(CASE WHEN x THEN 1 ELSE 0 END). They are equivalent under SUM. Under COUNT, they differ because COUNT(NULL) = 0 while COUNT(0) = 1.

Q3.Why does WHEN col = NULL never match, and how do you fix it?

What they testThree-valued logic understanding. NULL compared with = always produces UNKNOWN, which CASE treats as no-match.

ApproachUse WHEN col IS NULL. Equality comparisons against NULL return UNKNOWN in SQL's three-valued logic. This is the single most common CASE mistake in interviews.

Q4.Write a query that buckets orders into small / medium / large / enterprise tiers by amount, and returns the revenue per tier.

What they testCombining CASE with GROUP BY. Watch for candidates who forget GROUP BY expression must repeat the CASE (in strict SQL).

ApproachSELECT the CASE as 'tier', aggregate SUM(amount). GROUP BY the same CASE expression or by position (GROUP BY 1). In Postgres/SQL Server you cannot reference the alias in GROUP BY.

Q5.Can you use CASE WHEN inside a WHERE clause? Should you?

What they testWhether you know CASE is an expression (returns a value) and WHERE needs a boolean. The answer is yes, but it is almost always a smell.

ApproachWHERE (CASE WHEN region = 'US' THEN amount > 100 ELSE amount > 50 END) works, but a plain WHERE (region = 'US' AND amount > 100) OR (region <> 'US' AND amount > 50) is clearer and more optimizer-friendly.

Q6.Your CASE WHEN returns unexpected NULLs in one branch. How do you debug it?

What they testDiagnosis skill. First check if the WHEN condition compares against NULL. Then check if the data type coerces unexpectedly (e.g., CASE returning INT in one branch and TEXT in ELSE).

ApproachIsolate each branch with WHERE col = <expected>. If rows match, the CASE works. If not, the predicate fails. Common causes: NULL comparison with =, trimming/whitespace mismatch, implicit type casts forcing NULL on error.

Common mistakes

What breaks in practice

Comparing to NULL with =

WHEN col = NULL always evaluates to UNKNOWN, never TRUE. The branch is silently skipped, and rows fall through to ELSE (or NULL if no ELSE).

Wrong

1CASE WHEN status = NULL THEN 'unknown' ELSE status END

Right

1CASE WHEN status IS NULL THEN 'unknown' ELSE status END

Forgetting ELSE, getting NULLs

If no WHEN matches and there is no ELSE, CASE returns NULL. In downstream calculations, NULL propagates, producing wrong aggregates.

Wrong

1SELECT
2 SUM(CASE
3 WHEN amount > 0 THEN amount
4END)
5FROM orders

Right

1SELECT
2 SUM(
3 CASE
4 WHEN amount > 0 THEN amount
5 ELSE 0
6 END
7 )
8FROM orders

Out-of-order WHEN branches

CASE returns the first matching branch. If broader conditions come before narrower ones, the narrow ones are unreachable.

Wrong

1CASE WHEN amount > 100 THEN 'medium' WHEN amount > 1000 THEN 'large' ELSE 'small' END

Right

1CASE WHEN amount > 1000 THEN 'large' WHEN amount > 100 THEN 'medium' ELSE 'small' END

FAQ

Common questions

What does CASE WHEN do in SQL?
CASE WHEN is SQL's conditional expression. It evaluates branches in order and returns the value from the first branch whose condition is TRUE. If no branch matches and there is no ELSE, it returns NULL.
Can you use CASE WHEN with multiple conditions?
Yes. Each WHEN clause can contain any boolean expression including AND, OR, IN, BETWEEN, and subqueries. Combine conditions with parentheses for clarity.
Why does WHEN column = NULL never match?
SQL uses three-valued logic. Any comparison with NULL using = returns UNKNOWN, which CASE treats as not-a-match. Use WHEN column IS NULL instead.
What is conditional aggregation?
Wrapping a CASE WHEN inside an aggregate like SUM or COUNT to compute metrics over a subset of rows without filtering the whole query. Example: SUM(CASE WHEN status = 'refunded' THEN 1 ELSE 0 END) counts refunded rows per group.
Is CASE WHEN slower than IF or IIF?
In most modern engines, CASE WHEN, IF (MySQL), and IIF (SQL Server) compile to equivalent execution plans. Performance differences come from the underlying predicate complexity, not the syntactic form.

Continue your prep

Data Engineer Interview Prep, explore the full guide

50+ guides covering every round, company, role, and technology in the data engineer interview loop. Grounded in 2,817 verified interview reports across 929 companies, collected from real candidates.