SQL Practice

SQL PIVOT and UNPIVOT for Data Engineer Interviews

Pivoting is the bridge between how data is stored (normalized rows) and how humans read it (cross-tab columns). Interviewers test it because it combines conditional logic, aggregation, and engine awareness in a single question. You need to know native PIVOT syntax, the portable CASE WHEN approach, UNPIVOT for the reverse direction, and when dynamic pivot is the right answer.

Four Ways to Pivot (and Unpivot) Data

Each approach has tradeoffs. Know all four so you can pick the right one for the engine and constraints the interviewer gives you.

Native PIVOT (SQL Server / Snowflake)

SQL Server, Snowflake, Oracle

SQL Server and Snowflake support a dedicated PIVOT operator. You specify the aggregate function, the column whose values become new columns, and the list of values to pivot on. The syntax is concise but rigid: you must list every pivot value at write time. If a new category appears in the data, your query does not pick it up automatically.

SELECT *
FROM monthly_sales
PIVOT (
  SUM(revenue)
  FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
) AS p;

Limitation: Pivot values must be hardcoded. Dynamic pivot requires dynamic SQL.

CASE WHEN Pivot (All Engines)

All SQL engines

The portable approach. Wrap CASE WHEN inside an aggregate function (SUM, MAX, COUNT) for each target column. This works in PostgreSQL, MySQL, BigQuery, Redshift, and every other engine. It is more verbose than native PIVOT, but interviewers prefer it because it proves you understand the mechanics. You are not hiding behind syntax; you are showing how a pivot actually works.

SELECT
  product_id,
  SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS q1_revenue,
  SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS q2_revenue,
  SUM(CASE WHEN quarter = 'Q3' THEN revenue ELSE 0 END) AS q3_revenue,
  SUM(CASE WHEN quarter = 'Q4' THEN revenue ELSE 0 END) AS q4_revenue
FROM monthly_sales
GROUP BY product_id;

Limitation: Same hardcoding issue: you list each value manually. But every SQL engine supports it.

UNPIVOT (Columns to Rows)

SQL Server, Snowflake, Oracle

UNPIVOT is the reverse: it takes multiple columns and folds them back into rows. Given a table with q1_revenue, q2_revenue, q3_revenue, q4_revenue columns, UNPIVOT produces one row per quarter per product. This is useful for normalizing wide denormalized tables back into a format suitable for aggregation or visualization.

SELECT product_id, quarter, revenue
FROM quarterly_sales
UNPIVOT (
  revenue FOR quarter IN (q1_revenue, q2_revenue, q3_revenue, q4_revenue)
) AS u;

Limitation: All source columns must share the same data type. NULLs are dropped by default.

UNION ALL Unpivot (Portable)

All SQL engines

When your engine lacks UNPIVOT, use UNION ALL. Select one column at a time with a literal label for the new category column. This is verbose but works everywhere and makes the transformation explicit. Each SELECT grabs one column and labels it; the UNION ALL stacks them vertically.

SELECT product_id, 'Q1' AS quarter, q1_revenue AS revenue FROM quarterly_sales
UNION ALL
SELECT product_id, 'Q2', q2_revenue FROM quarterly_sales
UNION ALL
SELECT product_id, 'Q3', q3_revenue FROM quarterly_sales
UNION ALL
SELECT product_id, 'Q4', q4_revenue FROM quarterly_sales;

Limitation: Each UNION ALL branch scans the table again. Performance degrades with many columns.

Worked Example: Pivot Monthly Revenue by Product

Given a sales table with product, month, and revenue, produce a report with one row per product and one column per month.

SELECT
  product,
  COALESCE(SUM(CASE WHEN month = 'Jan' THEN revenue END), 0) AS jan,
  COALESCE(SUM(CASE WHEN month = 'Feb' THEN revenue END), 0) AS feb,
  COALESCE(SUM(CASE WHEN month = 'Mar' THEN revenue END), 0) AS mar,
  SUM(revenue) AS total
FROM sales
GROUP BY product
ORDER BY total DESC;

COALESCE wraps each pivot column so products with no sales in a given month show 0 instead of NULL. The total column uses a plain SUM for the full-year figure.

Expected output

 product   | jan    | feb    | mar    | total
-----------+--------+--------+--------+--------
 Widget A  | 12400  | 15200  |  9800  | 37400
 Widget B  |  8100  |     0  | 11300  | 19400
 Widget C  |     0  |  6700  |  7200  | 13900

Worked Example: Unpivot a Wide Table

You receive a denormalized export with columns for each quarter. Normalize it into (product, quarter, sales) rows for loading into your data warehouse.

SELECT product, 'Q1' AS quarter, q1_sales AS sales FROM wide_sales
UNION ALL
SELECT product, 'Q2', q2_sales FROM wide_sales
UNION ALL
SELECT product, 'Q3', q3_sales FROM wide_sales
UNION ALL
SELECT product, 'Q4', q4_sales FROM wide_sales
ORDER BY product, quarter;

Each SELECT grabs one column and labels it. This is the pattern for normalizing Excel exports or CSV files with one column per time period.

Dynamic Pivot: The Question That Catches Candidates

Standard PIVOT and CASE WHEN both require you to enumerate every value at query time. When the values are unknown or change frequently, you need a different approach. Interviewers ask about dynamic pivot to test whether you understand SQL's static column model.

SQL result sets have fixed schemas. Column names must be known before the query executes. This means a pivot query with FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4') will break if the data contains 'Q5'. The correct interview answer is not a clever workaround. It is acknowledging the limitation and explaining how production systems handle it.

Three strategies: (1) Generate the SQL dynamically in application code or a stored procedure: query the distinct values, build the CASE WHEN expressions as a string, then execute. (2) Use a tool like dbt with Jinja templates. (3) Skip the pivot in SQL and return key-value rows, letting the BI tool or pandas handle the reshaping.

-- Snowflake supports a subquery in the IN clause
SELECT *
FROM monthly_sales
PIVOT (
  SUM(revenue)
  FOR quarter IN (SELECT DISTINCT quarter FROM monthly_sales)
) AS p;

-- SQL Server: build the query dynamically
DECLARE @cols NVARCHAR(MAX);
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ', ')
FROM (SELECT DISTINCT quarter FROM monthly_sales) AS q;

DECLARE @sql NVARCHAR(MAX) = '
SELECT product_id, ' + @cols + '
FROM monthly_sales
PIVOT (SUM(revenue) FOR quarter IN (' + @cols + ')) AS p';
EXEC sp_executesql @sql;

7 Pivot Interview Questions

These span key-value pivots, cross-tabs, dynamic pivots, and unpivot normalization. The approach notes show what the interviewer expects.

Q1: Given a key-value table (user_id, attribute_name, attribute_value), pivot it so each attribute becomes its own column.

What they test:

This is the most common pivot question. The interviewer wants the CASE WHEN + MAX/GROUP BY pattern. If you use native PIVOT syntax, they may ask you to rewrite it without PIVOT to prove you understand the mechanics.

Approach:

Use MAX(CASE WHEN attribute_name = 'email' THEN attribute_value END) for each attribute, grouped by user_id. MAX works because each user has at most one value per attribute. If duplicates are possible, clarify the dedup strategy first.

Q2: A table stores monthly metrics as rows: (metric_name, month, value). Create a report with one row per metric and columns for Jan through Dec.

What they test:

Scale of the pivot. Twelve CASE WHEN branches is tedious but correct. The interviewer checks whether you stay organized or make typos under pressure. Some candidates try to be clever with dynamic SQL; unless they ask for it, keep it simple.

Approach:

SUM(CASE WHEN month = 'Jan' THEN value ELSE 0 END) AS jan, repeated for each month. GROUP BY metric_name. Use ELSE 0 so missing months show zero, not NULL.

Q3: You have a wide table with columns for each day of the week (mon_hours, tue_hours, ...). Normalize it into (employee_id, day_name, hours).

What they test:

UNPIVOT recognition. The interviewer wants to see if you can go from columns to rows. Many candidates only practice row-to-column pivoting and freeze when the direction reverses.

Approach:

UNPIVOT if available, otherwise UNION ALL with a literal label for each day. Mention that you would validate all source columns share the same type before unpivoting.

Q4: Write a query that counts orders by status (pending, shipped, delivered, returned) and shows each as a column in a single row.

What they test:

Conditional aggregation with COUNT. This is a pivot disguised as a simple aggregation. The trap is using COUNT(CASE WHEN status = 'pending' THEN 1 END) rather than SUM(CASE WHEN ... THEN 1 ELSE 0 END). Both work, but COUNT ignores NULLs while SUM counts zeros.

Approach:

COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending_count. Repeat for each status. No GROUP BY needed if you want a single summary row.

Q5: The product team adds new categories frequently. How would you handle a pivot where the list of values is unknown at query time?

What they test:

Dynamic pivot awareness. The interviewer does not necessarily expect you to write dynamic SQL on the spot. They want to hear that you understand the limitation of static pivots and know the tradeoffs: dynamic SQL in stored procedures, generating the query in application code (Python, dbt macros), or using a BI tool that handles pivoting at render time.

Approach:

Acknowledge that standard SQL cannot pivot on unknown values. Propose generating the query dynamically: query DISTINCT values first, then build the CASE WHEN branches programmatically. In Snowflake, mention that PIVOT supports subqueries for the IN clause.

Q6: Given a survey_responses table (respondent_id, question_id, answer), create a cross-tab showing each respondent as a row and each question as a column.

What they test:

This is the key-value pivot again, but with potentially hundreds of questions. The interviewer wants you to discuss feasibility: at what point does a SQL pivot become impractical and you should use a different tool (pandas pivot_table, Spark, a BI layer)?

Approach:

For a bounded number of questions (under 20), use the CASE WHEN + MAX pattern. For unbounded questions, recommend pivoting in application code or a downstream tool.

Q7: Convert a pivot query into an UNPIVOT and explain when you would use each direction in a real pipeline.

What they test:

Bidirectional fluency. Many candidates learn pivot but not unpivot. The interviewer wants to hear real use cases: pivot for reporting (humans read wide tables), unpivot for ingestion into a normalized data model (machines process long tables).

Approach:

Show both directions. Pivot: row-to-column for dashboards. Unpivot: column-to-row for loading wide CSV exports into a normalized schema. Mention that ELT pipelines often unpivot raw data during the transform step.

Common Pivot Mistakes

These trip up candidates in live interviews.

Forgetting ELSE 0 in a SUM pivot

SUM(CASE WHEN region = 'US' THEN revenue END) returns NULL for groups with no US rows. SUM(CASE WHEN region = 'US' THEN revenue ELSE 0 END) returns 0. The NULL version can break downstream calculations or produce confusing reports.

Using COUNT when you mean SUM

COUNT(CASE WHEN ... THEN value END) counts non-NULL values, which equals the number of matching rows. SUM(CASE WHEN ... THEN value ELSE 0 END) totals the actual values. If you want a sum, use SUM. If you want a count, use COUNT(CASE WHEN ... THEN 1 END).

Missing GROUP BY after adding CASE WHEN aggregations

When you add SUM(CASE WHEN ...) columns, every non-aggregated column in SELECT must appear in GROUP BY. Under interview pressure, candidates add the CASE WHEN columns but forget to update GROUP BY.

Assuming PIVOT is portable across engines

PIVOT works in SQL Server, Snowflake, and Oracle. It does not exist in PostgreSQL, MySQL, or BigQuery. If the interviewer specifies PostgreSQL, go straight to CASE WHEN.

SQL PIVOT FAQ

What is SQL PIVOT?+
PIVOT transforms unique values from one column into multiple columns in the output, aggregating data in the process. If you have rows for Q1, Q2, Q3, Q4, PIVOT turns those into four separate columns with aggregated values. It is rotating rows into columns.
Which databases support PIVOT natively?+
SQL Server, Oracle, and Snowflake support the PIVOT keyword. PostgreSQL, MySQL, BigQuery, and Redshift do not. For engines without native PIVOT, use CASE WHEN inside aggregate functions for the same result.
When should I use PIVOT vs CASE WHEN?+
If your interviewer specifies SQL Server or Snowflake, native PIVOT is cleaner. If they specify PostgreSQL or do not specify an engine, use CASE WHEN. In interviews, CASE WHEN is generally preferred because it demonstrates that you understand the mechanics.
What is the difference between PIVOT and UNPIVOT?+
PIVOT converts rows into columns (making the table wider). UNPIVOT converts columns into rows (making the table longer). PIVOT is for creating cross-tab reports. UNPIVOT is for normalizing denormalized data. They are inverse operations.
Can I PIVOT on multiple columns?+
Not with a single PIVOT clause in most engines. To pivot on two dimensions, concatenate the values into a composite key first, then pivot on the composite. The CASE WHEN approach is more flexible for multi-dimensional pivots.

Pivot Under Pressure

Reading pivot syntax is easy. Writing a six-column CASE WHEN pivot with correct GROUP BY under time pressure is the real test. Practice with real SQL execution and instant feedback.