SQL PIVOT: Rows to Columns for Data Engineer Interviews
The PIVOT keyword arrived in T-SQL with SQL Server 2005, borrowing the shape of Oracle's model clause that had shipped the year prior. Before that, rotating rows into columns meant hand-rolled CASE WHEN sums stacked inside a GROUP BY, sometimes twenty lines deep for a quarterly report. Two decades later, Postgres still doesn't ship a PIVOT keyword, and most interviewers still ask you to write the CASE WHEN version anyway.
What PIVOT Does
Before pivot (tall)
product | quarter | revenue ---------+---------+-------- Widget A | Q1 | 12400 Widget A | Q2 | 15200 Widget A | Q3 | 9800 Widget A | Q4 | 11500 Widget B | Q1 | 8100 Widget B | Q2 | 6300 Widget B | Q3 | 11300 Widget B | Q4 | 9400
After pivot (wide)
product | Q1 | Q2 | Q3 | Q4 ---------+-------+-------+-------+------ Widget A | 12400 | 15200 | 9800 | 11500 Widget B | 8100 | 6300 | 11300 | 9400
Every problem comes from a real interview report. Run code in your browser.
PIVOT Techniques by Engine
PostgreSQL: CASE WHEN + GROUP BY
PostgreSQL, MySQL, Redshift
PostgreSQL has no PIVOT keyword. The standard approach is conditional aggregation: wrap a CASE expression inside an aggregate function for each target column, then GROUP BY the row identifier. This pattern works on every SQL engine, which is why interviewers default to it.
SELECT
product,
SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS q1,
SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS q2,
SUM(CASE WHEN quarter = 'Q3' THEN revenue ELSE 0 END) AS q3,
SUM(CASE WHEN quarter = 'Q4' THEN revenue ELSE 0 END) AS q4
FROM sales
GROUP BY product;PostgreSQL also provides a crosstab() function from the tablefunc extension. It's useful for production reports but rarely expected in interviews.
-- Enable the extension first
CREATE EXTENSION IF NOT EXISTS tablefunc;
SELECT * FROM crosstab(
'SELECT product, quarter, revenue
FROM sales ORDER BY 1, 2',
$$VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4')$$
) AS ct(product TEXT, q1 INT, q2 INT, q3 INT, q4 INT);SQL Server: Native PIVOT Syntax
SQL Server, Oracle
SQL Server introduced PIVOT in SQL Server 2005. You specify the aggregate function, the column whose values become headers, and the list of values to pivot on. The syntax is concise but rigid: every pivot value must be listed explicitly. If a new quarter appears in the data, the query won't pick it up automatically.
SELECT product, [Q1], [Q2], [Q3], [Q4]
FROM (
SELECT product, quarter, revenue
FROM sales
) AS src
PIVOT (
SUM(revenue)
FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS pvt;Watch out: Watch out: The inner subquery controls which columns appear in the output. If you SELECT * from the source table and it has extra columns, those become implicit GROUP BY keys, which produces unexpected duplicate rows. Always limit the inner subquery to exactly the columns you need: the row ID, the pivot column, and the value column.
BigQuery: PIVOT Keyword (2022+)
BigQuery
BigQuery added PIVOT support in 2022. The syntax is similar to SQL Server but uses BigQuery's function set. Before 2022, you had to use CASE WHEN. Many tutorials and interview prep materials still show the old approach, so know both.
SELECT *
FROM (
SELECT product, quarter, revenue
FROM sales
)
PIVOT (
SUM(revenue)
FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
);Snowflake: PIVOT with Subquery Support
Snowflake
Snowflake supports PIVOT and goes one step further: the IN clause accepts a subquery. This is the closest any major engine gets to true dynamic pivot in pure SQL, without building a query string at runtime.
-- Static pivot
SELECT *
FROM sales
PIVOT (
SUM(revenue) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
) AS p;
-- Dynamic pivot (Snowflake-only)
SELECT *
FROM sales
PIVOT (
SUM(revenue)
FOR quarter IN (SELECT DISTINCT quarter FROM sales)
) AS p;Interview tip: If the interviewer says 'Snowflake,' mention the subquery-in-PIVOT capability. It shows you know engine-specific features beyond the textbook.
Dynamic PIVOT: Handling Unknown Column Values
Strategy 1: Generate SQL in Application Code
Query the distinct values first. Build the CASE WHEN clauses as a string in Python, Java, or whatever runs your pipeline. Execute the generated query. This is how most production ETL handles it.
# Python example: dynamic pivot generation
import pandas as pd
from sqlalchemy import text
engine = create_engine(conn_string)
# Step 1: get distinct values
categories = pd.read_sql(
"SELECT DISTINCT category FROM products ORDER BY 1",
engine
)["category"].tolist()
# Step 2: build the pivot query
cases = ", ".join([
f"SUM(CASE WHEN category = '{c}' "
f"THEN amount ELSE 0 END) AS {c.lower().replace(' ', '_')}"
for c in categories
])
query = f"SELECT region, {cases} FROM sales GROUP BY region"
# Step 3: execute
result = pd.read_sql(query, engine)Strategy 2: Dynamic SQL in a Stored Procedure (SQL Server)
DECLARE @cols NVARCHAR(MAX);
DECLARE @sql NVARCHAR(MAX);
-- Collect distinct values
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ', ')
FROM (SELECT DISTINCT quarter FROM sales) AS q;
-- Build and execute the pivot query
SET @sql = N'
SELECT product, ' + @cols + N'
FROM (SELECT product, quarter, revenue FROM sales) AS src
PIVOT (SUM(revenue) FOR quarter IN (' + @cols + N')) AS pvt';
EXEC sp_executesql @sql;Strategy 3: dbt Jinja Macros
In dbt, you can call run_query() during compilation to fetch distinct values, then loop over them in Jinja to generate the CASE WHEN columns. The compiled SQL is static, but the template adapts to whatever's in the table.
{# dbt Jinja macro for dynamic pivot #}
{% set categories_query %}
SELECT DISTINCT category FROM {{ ref('products') }}
{% endset %}
{% set categories = run_query(categories_query).columns[0].values() %}
SELECT
region,
{% for cat in categories %}
SUM(CASE WHEN category = '{{ cat }}'
THEN amount ELSE 0 END) AS {{ cat | lower | replace(' ', '_') }}
{% if not loop.last %},{% endif %}
{% endfor %}
FROM {{ ref('sales') }}
GROUP BY regionStrategy 4: Skip the Pivot, Reshape Downstream
Return the data in its normalized form (key-value rows) and let the BI tool, Pandas, or Spark do the reshaping. Looker, Tableau, and Metabase all support column pivoting at render time. This keeps your SQL clean and the pipeline resilient to new values.
UNPIVOT: The Reverse Operation
Input: wide table
product | q1 | q2 | q3 | q4 ---------+------+------+------+------ Widget A | 12400| 15200| 9800| 11500 Widget B | 8100| 6300| 11300| 9400
Output: normalized rows
product | quarter | revenue ---------+---------+-------- Widget A | q1 | 12400 Widget A | q2 | 15200 Widget A | q3 | 9800 Widget A | q4 | 11500 Widget B | q1 | 8100 ...
SELECT product, quarter, revenue
FROM quarterly_sales
UNPIVOT (
revenue FOR quarter IN (q1, q2, q3, q4)
) AS u;
-- Gotcha: all source columns must share the same data type
-- NULLs are dropped by default; use UNION ALL to preserve themSELECT product, 'Q1' AS quarter, q1 AS revenue
FROM quarterly_sales
UNION ALL
SELECT product, 'Q2', q2 FROM quarterly_sales
UNION ALL
SELECT product, 'Q3', q3 FROM quarterly_sales
UNION ALL
SELECT product, 'Q4', q4 FROM quarterly_sales
ORDER BY product, quarter;SELECT
product,
f.value:"quarter"::STRING AS quarter,
f.value:"revenue"::INT AS revenue
FROM quarterly_sales,
LATERAL FLATTEN(
input => ARRAY_CONSTRUCT(
OBJECT_CONSTRUCT('quarter','Q1','revenue',q1),
OBJECT_CONSTRUCT('quarter','Q2','revenue',q2),
OBJECT_CONSTRUCT('quarter','Q3','revenue',q3),
OBJECT_CONSTRUCT('quarter','Q4','revenue',q4)
)
) f;Pivot Interview Questions
Q1: Given a key-value table (user_id, attribute_name, attribute_value), pivot it so each attribute becomes its own column.
What they test: The most common pivot question. They want CASE WHEN + MAX + GROUP BY. If you use native PIVOT syntax, expect a follow-up: 'Now do it without the PIVOT keyword.' Approach: MAX(CASE WHEN attribute_name = 'email' THEN attribute_value END) for each attribute, grouped by user_id. MAX works because each user has at most one value per attribute. If duplicates exist, clarify the dedup strategy before writing.
Q2: A metrics table stores (metric_name, month, value). Create a report with one row per metric and columns for Jan through Dec.
What they test: Scale of the pivot. Twelve CASE WHEN branches is tedious but correct. The interviewer checks if you stay organized or make typos under pressure. Approach: SUM(CASE WHEN month = 'Jan' THEN value ELSE 0 END) AS jan, repeated for each month. GROUP BY metric_name. Use ELSE 0 so missing months show zero, not NULL.
Q3: You have a wide table with day-of-week columns (mon_hours, tue_hours, ...). Normalize it into (employee_id, day_name, hours).
What they test: UNPIVOT recognition. Many candidates only practice row-to-column pivoting and freeze when the direction reverses. Approach: UNPIVOT if the engine supports it, otherwise UNION ALL with a literal label per day. Mention that all source columns must share the same type.
Q4: Write a query that counts orders by status (pending, shipped, delivered, returned) as separate columns in a single row.
What they test: Conditional aggregation with COUNT. This is a pivot disguised as simple aggregation. The trap: COUNT(CASE WHEN status = 'pending' THEN 1 END) counts non-NULL entries. SUM with ELSE 0 totals values. Both work, but they behave differently with edge cases. Approach: COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending_count. Repeat per status. No GROUP BY needed for a single summary row.
Q5: The product team adds categories frequently. How do you pivot when the values aren't known at query time?
What they test: Dynamic pivot awareness. They don't expect you to write dynamic SQL on a whiteboard. They want to hear that standard SQL can't do it, and that you know how production systems handle it. Approach: Acknowledge the static schema limitation. Propose generating the query in application code or a dbt Jinja macro. Mention Snowflake's subquery-in-PIVOT if the engine is relevant. Alternatively, return key-value rows and let the BI tool pivot at render time.
Q6: Pivot a table that needs two different aggregations: both SUM of revenue and COUNT of orders per category.
What they test: Multi-aggregate pivot. Native PIVOT only supports one aggregate per PIVOT clause. Candidates who only know the keyword get stuck here. Approach: Use CASE WHEN for both: SUM(CASE WHEN ...) for revenue and COUNT(CASE WHEN ...) for orders, all in the same SELECT with a single GROUP BY. This is where CASE WHEN beats native PIVOT syntax, since PIVOT only allows one aggregate function at a time.
Common PIVOT Mistakes
Forgetting ELSE 0 in a SUM Pivot
SUM(CASE WHEN region = 'US' THEN revenue END) returns NULL when no US rows exist for a group. Add ELSE 0 to get a zero instead. NULLs break downstream math and produce confusing report output.
-- Bad: returns NULL
SUM(CASE WHEN region = 'US'
THEN revenue END)
-- Good: returns 0
SUM(CASE WHEN region = 'US'
THEN revenue ELSE 0 END)Extra Columns in PIVOT's Inner Subquery
In SQL Server and Snowflake, any column in the PIVOT subquery that isn't the pivot column or the value column becomes an implicit GROUP BY key. If your inner query selects *, a timestamp or ID column forces one row per original row instead of one row per group.
-- Bad: extra columns = extra groups
SELECT * FROM sales
PIVOT (...) AS p;
-- Good: limit to needed columns
SELECT * FROM (
SELECT product, quarter, revenue
FROM sales
) AS src
PIVOT (...) AS p;Missing GROUP BY After Adding CASE WHEN Columns
When you add SUM(CASE WHEN ...) columns, every non-aggregated column in SELECT must be in GROUP BY. Under pressure, candidates add the pivot columns but forget to update the GROUP BY clause. The error message is clear, but the fix costs you time and confidence.
Assuming PIVOT Is Portable
PIVOT works in SQL Server, Snowflake, Oracle, and (since 2022) BigQuery. It does not exist in PostgreSQL, MySQL, SQLite, or Redshift. If the interviewer says 'PostgreSQL,' don't reach for PIVOT. Go straight to CASE WHEN. If they don't specify an engine, CASE WHEN is always the safe choice.
Frequently asked questions
What does SQL PIVOT do?+
Which databases support the PIVOT keyword?+
How do I pivot in PostgreSQL without PIVOT?+
What is the difference between PIVOT and UNPIVOT?+
Can I PIVOT on values that aren't known at query time?+
Twenty years of PIVOT, one hour to prove it
The keyword turned twenty last year. The CASE WHEN pattern it replaced is older still and hasn't lost a step. Write both against a real engine and you'll stop confusing their histories at 2 a.m. the night before an interview.