SQL NULL Handling Practice

NULL handling appears as a standalone topic in 1.7% of SQL questions, but NULLs cause silent bugs in JOINs, aggregations, and CASE expressions. Interviewers who test NULLs are testing whether you actually debug your own queries.

What this guide covers

NULL handling appears as a standalone topic in ~1.7% of SQL interview questions but silently breaks JOINs, aggregations, and CASE expressions across many more. Six behaviors cover almost every NULL bug: IS NULL syntax, COALESCE, NULLIF, three-valued logic, NULLs in JOINs, NULLs in aggregations. Each has at least one non-obvious edge case that interviewers specifically test.

NULL behavior you need to know

Knowing the syntax isn't enough. You need to know the edge cases.

IS NULL / IS NOT NULL

Syntax: WHERE column IS NULL. The only correct way to check for NULL. NULL = NULL evaluates to NULL (not TRUE), so WHERE column = NULL returns zero rows. This is the most common NULL mistake in interviews. Use cases: filter for missing values in optional fields; find users who never completed onboarding; identify rows that failed to join (LEFT JOIN + IS NULL); data quality checks on required fields. Interview tip: WHERE x = NULL is an immediate red flag. So is WHERE x != NULL — also returns nothing. Always IS NULL / IS NOT NULL.

COALESCE

Syntax: COALESCE(preferred_name, first_name, 'Unknown'). Returns the first non-NULL argument. Takes any number of arguments. The standard way to provide fallback values and handle missing data in calculations. Use cases: default values for display or calculation; merge columns from multiple sources (first non-NULL wins); prevent NULL propagation in arithmetic; handle LEFT JOIN results where the right side is NULL. Interview tip: COALESCE replaces IFNULL/ISNULL/NVL. Use COALESCE because it is ANSI SQL and works in every dialect. Interviewers notice when you use the portable version.

NULLIF

Syntax: NULLIF(column, 0). Returns NULL if the two arguments are equal, otherwise returns the first argument. The primary use case is preventing division by zero: revenue / NULLIF(impressions, 0). Use cases: prevent division by zero errors; convert sentinel values (0, -1, 'N/A') to NULL; clean up data where empty strings should be NULL; avoid misleading averages that include zero-value rows. Interview tip: NULLIF shows up in any question involving ratios and percentages. If you see a division, ask whether the denominator could be zero.

Three-valued logic

SQL uses three values: TRUE, FALSE, NULL (unknown). Any comparison with NULL returns NULL, not FALSE. So NOT (NULL) is also NULL. WHERE clauses only include rows where the condition is TRUE — NULL conditions filter the row out. Use cases: understanding why NOT IN with NULLs returns no rows; predicting WHERE behavior with nullable columns; debugging unexpected empty result sets; writing correct negation queries. Interview tip: WHERE id NOT IN (SELECT id FROM other_table) returns nothing if other_table has any NULL ids. Use NOT EXISTS instead. This question has ended many interviews early.

NULLs in JOINs

NULL values never satisfy a JOIN equality condition because NULL = NULL is NULL (not TRUE). Rows with NULL join keys are silently dropped in INNER JOINs and appear as non-matches in OUTER JOINs. Use cases: debugging missing rows after a JOIN; LEFT JOIN + IS NULL to find unmatched rows; handling optional foreign keys; anti-join patterns (rows in A not in B). Interview tip: when a JOIN produces fewer rows than expected, check for NULLs in the join columns first. The most common cause of 'my query is wrong but I don't know why' in interviews.

NULLs in aggregations

Aggregate functions (SUM, AVG, MIN, MAX) ignore NULL values. COUNT(*) counts rows. COUNT(column) counts non-NULL values. This distinction changes results when your data has gaps. Use cases: COUNT(*) vs COUNT(column) giving different numbers; AVG excluding NULLs (which may or may not be what you want); SUM returning NULL when all values are NULL; calculating completion rates with nullable fields. Interview tip: if asked 'what's the average score?' and some scores are NULL, clarify: should NULLs be zero or excluded? AVG ignores NULLs by default, which inflates the average. COALESCE(score, 0) if zeros are intended.

NULL traps that fail candidates

NULL bugs do not throw errors. They silently change your results. This makes them the hardest category of SQL bug to catch under interview pressure.

Trap 1: NOT IN with NULLs. WHERE id NOT IN (1, 2, NULL) returns zero rows for every possible value of id. The NULL in the list makes the entire NOT IN evaluate to NULL (unknown), and WHERE NULL filters out the row. The single most common NULL trap in interviews. Use NOT EXISTS instead.

Trap 2: Aggregating nullable columns. AVG(score) ignores NULLs entirely. If 8 out of 10 users have a score of 100 and 2 have NULL, the average is 100, not 80. Whether this is correct depends on the business question. Interviewers expect you to ask.

Trap 3: NULL in string concatenation. 'Hello' || NULL returns NULL in standard SQL. One NULL column wipes out the entire concatenated result. Use CONCAT() (treats NULLs as empty strings) or wrap each column in COALESCE.

Trap 4: COUNT(*) vs COUNT(column). COUNT(*) counts rows. COUNT(email) counts non-NULL emails. For the percentage of users with an email, it's COUNT(email) * 1.0 / COUNT(*). Getting this backwards is a silent bug.

Practice problems

Five problems testing NULL handling in realistic scenarios. Each one contains at least one NULL trap.

Medium

The NOT IN trap

employees (id, manager_id) with some manager_id = NULL for top-level employees. Find all employees who are NOT managers. Explain why NOT IN fails and provide a correct solution. Hint: SELECT * FROM employees WHERE id NOT IN (SELECT manager_id FROM employees) returns nothing because the subquery contains NULLs. Use NOT EXISTS, or add WHERE manager_id IS NOT NULL to the subquery.

Medium

Average with missing data

survey_responses (user_id, question_id, score). Not all users answered all questions (missing rows, not NULL scores). Compute average score per question, treating non-respondents as 0. Compare to the naive AVG. Hint: CROSS JOIN users with questions to get all possible pairs. LEFT JOIN to survey_responses. COALESCE(score, 0) inside AVG. The naive AVG on survey_responses alone ignores non-respondents entirely.

Easy

Safe division with NULLIF

ad_campaigns (campaign_id, impressions, clicks, spend). Compute CTR (clicks/impressions) and CPC (spend/clicks). Handle zero impressions or zero clicks without errors. Hint: clicks * 1.0 / NULLIF(impressions, 0) for CTR, spend / NULLIF(clicks, 0) for CPC. NULLIF converts zero to NULL, and dividing by NULL returns NULL instead of erroring.

Medium

Anti-join: users without orders

users (user_id, signup_date) and orders (order_id, user_id, order_date). Find users who signed up in January 2024 but never placed an order. Write it three ways: LEFT JOIN + IS NULL, NOT EXISTS, NOT IN. Explain which is safest. Hint: LEFT JOIN + IS NULL and NOT EXISTS handle NULLs correctly. NOT IN is only safe if orders.user_id has no NULLs, so NOT EXISTS is the safest default.

Hard

NULL-safe deduplication

contacts (id, email, phone, name). Find duplicate pairs where email matches OR phone matches, but either field could be NULL. Two NULLs should NOT count as a match. Return all duplicate pairs with their match reason. Hint: self-join with conditions like (a.email = b.email AND a.email IS NOT NULL) OR (a.phone = b.phone AND a.phone IS NOT NULL). The IS NOT NULL prevents two NULLs from matching. Add a.id < b.id to avoid duplicate pairs.

Worked example: users without orders using COALESCE + LEFT JOIN

SELECT
  u.username,
  u.signup_date,
  COALESCE(COUNT(oi.order_id), 0) AS order_count
FROM users u
LEFT JOIN order_items oi ON oi.user_id = u.user_id
WHERE u.signup_date >= '2024-01-01'
  AND u.signup_date <  '2025-01-01'
GROUP BY u.user_id, u.username, u.signup_date
HAVING COUNT(oi.order_id) = 0
ORDER BY u.signup_date;

LEFT JOIN preserves every user even when there is no matching order. COUNT(o.order_id) returns 0 (not NULL) when all joined values are NULL, because COUNT of a column skips NULLs and returns 0 for an empty set. The HAVING clause filters to users with zero orders. An alternative is LEFT JOIN + WHERE o.order_id IS NULL, which avoids GROUP BY entirely.

NULL handling FAQ

Why does NULL = NULL return NULL instead of TRUE?+
NULL means 'unknown value' in SQL, not 'empty' or 'zero'. Two unknown values cannot be known to be equal, so the result is also unknown (NULL). It is counterintuitive but consistent: if you don't know X and you don't know Y, you cannot say X = Y is true.
When should I use COALESCE vs CASE WHEN for NULL handling?+
COALESCE for simple fallbacks: COALESCE(x, 0) is cleaner than CASE WHEN x IS NULL THEN 0 ELSE x END. CASE WHEN for more complex logic: different replacements for different conditions, or when the NULL check is part of larger conditional logic. COALESCE compiles down to a CASE expression internally, so no performance difference.
How do NULLs affect GROUP BY?+
GROUP BY treats all NULL values as the same group: a single NULL group, not many. This is an exception to NULL != NULL. The SQL standard treats NULLs as 'not distinct' for grouping. Usually what you want, but be aware it's happening.
What's the safest way to write an anti-join?+
NOT EXISTS. Handles NULLs correctly and performs well on all major databases. LEFT JOIN + IS NULL is equally safe and sometimes faster. NOT IN is dangerous: if the subquery returns any NULL, the entire NOT IN evaluates to NULL and you get zero rows. If you must use NOT IN, add WHERE column IS NOT NULL to the subquery.
02 / Why practice

Catch NULL bugs before the interview

  1. 01

    Active recall beats re-reading by 50%

    Cognitive-science meta-reviews (Dunlosky et al., 2013) rank practice testing as a top-tier study technique, while re-reading and highlighting rank near the bottom

  2. 02

    76% of hiring managers reject on the coding task, not the resume

    From HackerRank's 2024 Developer Skills Report. Candidates who look strong on paper still fail the live screen if they haven't done timed, executable practice

  3. 03

    Five problem shapes cover 80% of data engineer loops

    Dedup, sessionization, top-N-per-group, slowly-changing dimensions, partition tricks. Writing the shapes by hand turns the unfamiliar into pattern recognition

Related guides