NULL Values: Beginner
In 2012, Knight Capital lost $440 million in 45 minutes when a deployment error left dead code active on one of eight servers, and that server silently routed millions of market orders without the risk checks applied by the other seven. No error was thrown; the system appeared healthy in every log while hemorrhaging money at roughly $10 million per minute. The incident, documented in SEC Order 34-70694 and Doug Seven's widely-read post-mortem, is a canonical example of a silent, non-crashing failure. NULL values create this same class of failure in SQL: they are not zero, not empty string, and they do not equal themselves, so a single missed IS NULL check silently drops rows or corrupts aggregations with no error and no warning. In this lesson you will learn exactly what NULL means, how to safely detect it, and how it changes the behavior of every function that touches it.
NULL vs zero vs empty string
Tell the difference between empty and missing
NULL is not a value; it is the absence of a value. Think of it as a placeholder that says "we don't know what goes here" or "this information wasn't provided." This is fundamentally different from zero, an empty string, or the word "null" as text.
Understanding NULL
- NULL: Unknown or not provided. Example: user never entered age
- 0: A known quantity of zero. Example: account balance is $0
- '' (empty string): An empty text value. Example: name field left blank
Imagine a survey asking for income. NULL means the person skipped the question entirely. Zero means they answered and said they have no income. An empty string would typically be an input error.
Purpose of NULL
Real-world data is incomplete. Not every customer provides a phone number. Not every product has a discount price. Not every employee has a manager. NULL lets databases store this reality without forcing fake values into columns.
Sarah Chen has NULL for manager_id because she is the CEO and has no manager. This is accurate information, not missing data.
> Complete this query to find orders with missing region data.
SELECT order_id, region FROM orders WHERE region
Understanding the distinction between NULL, zero, and empty string is foundational because each requires a different query pattern and represents a fundamentally different real-world situation.
NULL as a concept reflects an honest acknowledgment that some information is simply unavailable, and databases that store NULL correctly are more truthful than those that substitute zeros or placeholder text.
When you encounter unexpected results in a query involving optional columns, checking whether those columns contain NULL values is almost always the first diagnostic step to take.
Three-Valued Logic
Predict how NULLs change query results
Most programming uses two-valued logic: things are either true or false. SQL uses three-valued logic: true, false, or unknown. Any comparison involving NULL returns unknown, not true or false.
Three-Valued Logic
NULL is not equal to anything, including itself. This surprises many beginners but makes logical sense: if you don't know what two things are, you can't know if they're equal.WHERE Clause Behavior
In WHERE clauses, rows only pass through when the condition is true. Unknown conditions filter out the row just like false does.
Users with NULL ages are excluded because comparing NULL to any value yields unknown. They don't appear in the results even though they might be 25 years old.
IS NULL vs = NULL
Which query correctly finds rows where region is NULL?
Now practice with an aggregate function to see how NULL values are excluded from calculations.
> Complete the query to calculate the average discount, keeping in mind how missing values are handled.
SELECT () AS avg_discount FROM orders
Three-valued logic is not just an academic curiosity; it has practical consequences every time you write a WHERE clause that filters on a column that can be NULL, because UNKNOWN rows are silently excluded from the result.
The reason NULL does not equal itself is logically sound: if the value is unknown, you cannot make any truthful claim about whether two unknown values are the same or different.
Building the habit of asking "can this column be NULL?" before writing comparison conditions prevents an entire class of silent filtering errors that produce results that look correct but exclude important rows.
IS NULL and IS NOT NULL operators
Find or exclude missing values
Since you can't use = or != to check for NULL, SQL provides special operators: IS NULL and IS NOT NULL. These are the only correct ways to filter for NULL values.
Basic NULL Checks
Use IS NULL to find rows where a column has no value:
This returns only customers who have not provided a phone number. Using WHERE phone = NULL would return zero rows because that comparison is always UNKNOWN.
IS NOT NULL
Use IS NOT NULL to find rows where a column has any value:
This returns customers who have provided phone numbers, excluding anyone with NULL.
- Always use IS NULL and IS NOT NULL to test for NULL
- Use COALESCE to provide default values for NULL columns
- Account for NULL in every WHERE clause that filters optional data
- Don't use = NULL or != NULL because they always return unknown
- Don't assume NULL behaves like zero or empty string
- Don't forget that NULL propagates through arithmetic
Common Mistakes
The first query returns nothing because comparing NULL to itself doesn't yield true. The second query correctly identifies NULL values using IS NULL.
> Complete this query to find only customers who have provided a phone number.
SELECT name, phone FROM customers WHERE phone
IS NULL and IS NOT NULL are the only correct tools for testing whether a value is absent, and memorizing this as a rule rather than a guideline will save you from a common category of hard-to-debug errors.
IS NOT NULL is especially useful at the start of a WHERE clause to exclude incomplete records before applying further conditions, ensuring those conditions are evaluated against known values rather than unknowns.
The combination of IS NULL for finding missing data and IS NOT NULL for excluding it covers the vast majority of NULL-filtering scenarios you will encounter in practice.
COUNT(*) vs COUNT(column)
Count accurately when data is missing
Aggregate functions like COUNT, SUM, AVG, MIN, and MAX handle NULL in specific ways that can produce unexpected results if you don't understand the rules.
COUNT Behaviors
Consider this orders table with some NULL values in the discount column:
| order_id | amount | discount |
|---|---|---|
| ORD-001 | 100 | 10 |
| ORD-002 | 200 | NULL |
| ORD-003 | 150 | 15 |
| ORD-004 | 300 | NULL |
| ORD-005 | 250 | 25 |
COUNT Behaviors
COUNT has two distinct behaviors depending on what you count:
COUNT(*)counts all rows, including those withNULLvaluesCOUNT(column)counts only rows where that column isNOT NULL- This difference is intentional and very useful for data quality checks
> Complete this query to count only the valid discount values, skipping any missing entries.
SELECT () AS non_null_count FROM orders
The difference between COUNT(*) and COUNT(column) is one of the most practical distinctions in SQL, because using the wrong one can make a dataset appear more complete than it actually is.
Using both COUNT(*) and COUNT(column) together in a single query is a quick and readable data quality check that immediately shows how many rows have a value in the column of interest.
When reporting on data completeness to stakeholders, the gap between COUNT(*) and COUNT(column) is the number you want to highlight because it represents records that may need to be followed up on.
Aggregates with NULLs
Get correct totals despite missing values
Beyond COUNT, other aggregate functions also interact with NULL values in ways that can quietly affect your results.
Other Aggregate Behaviors
SUM and AVG skip NULL values entirely. They don't treat NULL as zero; they pretend those rows don't exist for the calculation.
The SUM is 10 + 15 + 25 = 50 (NULL values ignored). The AVG is 50 / 3 = 16.67 (dividing by 3 non-NULL values, not 5 total rows).
AVG to treat missing discounts as zero, you'll get wrong results. A 16.67 average looks healthy, but the true average across all orders (treating NULLs as 0) would be 50 / 5 = 10.MIN/MAX Skip NULLs
MIN and MAX also ignore NULL values when finding extremes in a column.
MIN and MAX only consider non-NULL values. This is usually the desired behavior, but be aware that it means "minimum discount applied" not "minimum discount overall."
All NULLs: Special Case
What happens when ALL values in a column are NULL?
When all values are NULL, SUM and AVG return NULL (not zero), while COUNT returns 0. This distinction can cause issues if you later do calculations with these results.
Use NULL checks to audit data completeness:
Incomplete Records
Finding Missing Data
Use the IS NULL operator to identify rows where a column has no value assigned.
> Complete this query to find how many orders are missing a discount value.
SELECT - AS missing_discounts FROM orders
The fact that SUM and AVG silently skip NULLs rather than treating them as zero means you must always be clear about whether your business definition of "average discount" includes or excludes orders that had no discount.
When aggregate functions return NULL because all input values are NULL, you can protect downstream calculations by wrapping the aggregate with COALESCE to substitute a sensible default.
Data audits using COUNT(*) minus COUNT(column) are one of the fastest ways to quantify data quality issues in a table and communicate the scope of missing data to the team responsible for fixing it.
> You are a data analyst at HubSpot preparing a contact completeness report before a CRM data quality sprint. Marketing needs to know which fields are missing so the team can prioritize outreach campaigns on records that are actually usable.
IS NULL pinpoints contacts missing phone or email, surfacing exactly which records block outreach campaigns.WHERE phone = NULL silently drops rows that should appear in your report.COUNT(*) minus COUNT(column) quantifies incomplete records per field, giving the team a prioritized gap list.SUM and AVG silently skip NULLs, so aggregate completeness scores only reflect contacts with actual values.NULL means "unknown" or "not provided"; it is not the same as zero or empty string= NULL or != NULL; always use IS NULL or IS NOT NULLNULL returns UNKNOWN, which is treated as FALSE in WHERECOUNT(*) counts all rows including NULLs; COUNT(column) counts only non-NULL valuesSUM, AVG, MIN, MAX all ignore NULL values; they operate only on non-NULL valuesNULL in arithmetic: any operation with NULL returns NULLNULL lets databases represent this truthNULL prevents silent data loss and incorrect calculationsNULL: the billion dollar mistake
- Category
- SQL
- Difficulty
- beginner
- Duration
- 25 minutes
- Challenges
- 3 hands-on challenges
Topics covered: NULL vs zero vs empty string, Three-Valued Logic, IS NULL and IS NOT NULL operators, COUNT(*) vs COUNT(column), Aggregates with NULLs
Lesson Sections
- NULL vs zero vs empty string (concepts: sqlNullPropagation)
Understanding NULL These three concepts are often confused, but they represent completely different things: Purpose of NULL
- Three-Valued Logic (concepts: sqlThreeValuedLogic)
Three-Valued Logic This is why the following comparisons don't work as you might expect: WHERE Clause Behavior IS NULL vs = NULL
- IS NULL and IS NOT NULL operators (concepts: sqlIsNull)
Basic NULL Checks IS NOT NULL Common Mistakes This is one of the most common SQL mistakes:
- COUNT(*) vs COUNT(column)
COUNT Behaviors COUNT Behaviors
- Aggregates with NULLs
Other Aggregate Behaviors MIN/MAX Skip NULLs All NULLs: Special Case This quickly shows how complete your data is for different columns. Incomplete Records Finding records with missing data is essential for data quality monitoring. This finds pending orders that can't ship because they're missing an address. Finding Missing Data