SQL PIVOT: Rows to Columns for Data Engineer Interviews

The PIVOT keyword arrived in T-SQL with SQL Server 2005, borrowing the shape of Oracle's model clause that had shipped the year prior. Before that, rotating rows into columns meant hand-rolled CASE WHEN sums stacked inside a GROUP BY, sometimes twenty lines deep for a quarterly report. Two decades later, Postgres still doesn't ship a PIVOT keyword, and most interviewers still ask you to write the CASE WHEN version anyway.

2005
PIVOT lands in T-SQL
1999
SQL:1999 adds CASE
0
Postgres PIVOT keywords
41%
Interviews that are SQL
Source: DataDriven analysis of 1,042 verified data engineering interview rounds.

What PIVOT Does

A pivot takes distinct values from one column and turns each value into its own column, filling cells with an aggregated result. The input is long (many rows, few columns). The output is wide (fewer rows, many columns). That's the entire concept. The canonical example hasn't changed since Ralph Kimball was writing about star schemas in the 90s: a sales table with one row per product per quarter. Pivoting collapses four rows per product into a single row with four revenue columns. The aggregate (SUM, COUNT, AVG) decides what lands in each cell.

Before pivot (tall)

product  | quarter | revenue
---------+---------+--------
Widget A | Q1      | 12400
Widget A | Q2      | 15200
Widget A | Q3      |  9800
Widget A | Q4      | 11500
Widget B | Q1      |  8100
Widget B | Q2      |  6300
Widget B | Q3      | 11300
Widget B | Q4      |  9400

After pivot (wide)

product  | Q1    | Q2    | Q3    | Q4
---------+-------+-------+-------+------
Widget A | 12400 | 15200 |  9800 | 11500
Widget B |  8100 |  6300 | 11300 |  9400
Free forever
Practice pivot queries problems from real interviews.

Every problem comes from a real interview report. Run code in your browser.

Interview note: When an interviewer says 'pivot this table,' they almost always want the CASE WHEN + GROUP BY pattern, not the PIVOT keyword. CASE WHEN proves you understand the mechanics. The keyword just hides them.

PIVOT Techniques by Engine

Not every SQL engine supports the PIVOT keyword. Here's how to write a pivot on each major platform, with real code you can run.

PostgreSQL: CASE WHEN + GROUP BY

PostgreSQL, MySQL, Redshift

PostgreSQL has no PIVOT keyword. The standard approach is conditional aggregation: wrap a CASE expression inside an aggregate function for each target column, then GROUP BY the row identifier. This pattern works on every SQL engine, which is why interviewers default to it.

SELECT
  product,
  SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS q1,
  SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS q2,
  SUM(CASE WHEN quarter = 'Q3' THEN revenue ELSE 0 END) AS q3,
  SUM(CASE WHEN quarter = 'Q4' THEN revenue ELSE 0 END) AS q4
FROM sales
GROUP BY product;

PostgreSQL also provides a crosstab() function from the tablefunc extension. It's useful for production reports but rarely expected in interviews.

-- Enable the extension first
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT * FROM crosstab(
  'SELECT product, quarter, revenue
   FROM sales ORDER BY 1, 2',
  $$VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4')$$
) AS ct(product TEXT, q1 INT, q2 INT, q3 INT, q4 INT);

SQL Server: Native PIVOT Syntax

SQL Server, Oracle

SQL Server introduced PIVOT in SQL Server 2005. You specify the aggregate function, the column whose values become headers, and the list of values to pivot on. The syntax is concise but rigid: every pivot value must be listed explicitly. If a new quarter appears in the data, the query won't pick it up automatically.

SELECT product, [Q1], [Q2], [Q3], [Q4]
FROM (
  SELECT product, quarter, revenue
  FROM sales
) AS src
PIVOT (
  SUM(revenue)
  FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS pvt;

Watch out: Watch out: The inner subquery controls which columns appear in the output. If you SELECT * from the source table and it has extra columns, those become implicit GROUP BY keys, which produces unexpected duplicate rows. Always limit the inner subquery to exactly the columns you need: the row ID, the pivot column, and the value column.

BigQuery: PIVOT Keyword (2022+)

BigQuery

BigQuery added PIVOT support in 2022. The syntax is similar to SQL Server but uses BigQuery's function set. Before 2022, you had to use CASE WHEN. Many tutorials and interview prep materials still show the old approach, so know both.

SELECT *
FROM (
  SELECT product, quarter, revenue
  FROM sales
)
PIVOT (
  SUM(revenue)
  FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
);

Snowflake: PIVOT with Subquery Support

Snowflake

Snowflake supports PIVOT and goes one step further: the IN clause accepts a subquery. This is the closest any major engine gets to true dynamic pivot in pure SQL, without building a query string at runtime.

-- Static pivot
SELECT *
FROM sales
PIVOT (
  SUM(revenue) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
) AS p;

-- Dynamic pivot (Snowflake-only)
SELECT *
FROM sales
PIVOT (
  SUM(revenue)
  FOR quarter IN (SELECT DISTINCT quarter FROM sales)
) AS p;

Interview tip: If the interviewer says 'Snowflake,' mention the subquery-in-PIVOT capability. It shows you know engine-specific features beyond the textbook.

Dynamic PIVOT: Handling Unknown Column Values

Both PIVOT and CASE WHEN require you to type out every value that becomes a column. That works for quarters (you know there are four) but falls apart when the values are unpredictable. SQL result sets have fixed schemas. The column names and count are determined at parse time, before any data is read. This is a fundamental language constraint, not a missing feature. Interviewers ask about dynamic pivot to test whether you understand this. The correct answer is not a single magic query. It's acknowledging the limitation and explaining how real pipelines handle it.

Strategy 1: Generate SQL in Application Code

Query the distinct values first. Build the CASE WHEN clauses as a string in Python, Java, or whatever runs your pipeline. Execute the generated query. This is how most production ETL handles it.

# Python example: dynamic pivot generation
import pandas as pd
from sqlalchemy import text

engine = create_engine(conn_string)

# Step 1: get distinct values
categories = pd.read_sql(
    "SELECT DISTINCT category FROM products ORDER BY 1",
    engine
)["category"].tolist()

# Step 2: build the pivot query
cases = ", ".join([
    f"SUM(CASE WHEN category = '{c}' "
    f"THEN amount ELSE 0 END) AS {c.lower().replace(' ', '_')}"
    for c in categories
])
query = f"SELECT region, {cases} FROM sales GROUP BY region"

# Step 3: execute
result = pd.read_sql(query, engine)

Strategy 2: Dynamic SQL in a Stored Procedure (SQL Server)

DECLARE @cols NVARCHAR(MAX);
DECLARE @sql  NVARCHAR(MAX);

-- Collect distinct values
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ', ')
FROM (SELECT DISTINCT quarter FROM sales) AS q;

-- Build and execute the pivot query
SET @sql = N'
  SELECT product, ' + @cols + N'
  FROM (SELECT product, quarter, revenue FROM sales) AS src
  PIVOT (SUM(revenue) FOR quarter IN (' + @cols + N')) AS pvt';

EXEC sp_executesql @sql;

Strategy 3: dbt Jinja Macros

In dbt, you can call run_query() during compilation to fetch distinct values, then loop over them in Jinja to generate the CASE WHEN columns. The compiled SQL is static, but the template adapts to whatever's in the table.

{# dbt Jinja macro for dynamic pivot #}
{% set categories_query %}
  SELECT DISTINCT category FROM {{ ref('products') }}
{% endset %}
{% set categories = run_query(categories_query).columns[0].values() %}

SELECT
  region,
  {% for cat in categories %}
  SUM(CASE WHEN category = '{{ cat }}'
      THEN amount ELSE 0 END) AS {{ cat | lower | replace(' ', '_') }}
  {% if not loop.last %},{% endif %}
  {% endfor %}
FROM {{ ref('sales') }}
GROUP BY region

Strategy 4: Skip the Pivot, Reshape Downstream

Return the data in its normalized form (key-value rows) and let the BI tool, Pandas, or Spark do the reshaping. Looker, Tableau, and Metabase all support column pivoting at render time. This keeps your SQL clean and the pipeline resilient to new values.

Interview note: Interviewers don't expect you to write dynamic SQL on a whiteboard. They want to hear that you understand why a static pivot can't adapt, and that you know at least two strategies for handling it in production.

UNPIVOT: The Reverse Operation

UNPIVOT takes a wide table and folds its columns back into rows. Given a table with q1_revenue, q2_revenue, q3_revenue, q4_revenue columns, UNPIVOT produces one row per quarter per product. This comes up constantly in real pipelines: someone sends you an Excel export with 52 weekly columns, and you need to normalize it for your warehouse. Use case pattern: pivot for humans (dashboards want wide tables), unpivot for machines (normalized tables are easier to join, filter, and aggregate).

Input: wide table

product  | q1   | q2   | q3   | q4
---------+------+------+------+------
Widget A | 12400| 15200|  9800| 11500
Widget B |  8100|  6300| 11300|  9400

Output: normalized rows

product  | quarter | revenue
---------+---------+--------
Widget A | q1      | 12400
Widget A | q2      | 15200
Widget A | q3      |  9800
Widget A | q4      | 11500
Widget B | q1      |  8100
...
Native UNPIVOT (SQL Server, Snowflake, Oracle):
SELECT product, quarter, revenue
FROM quarterly_sales
UNPIVOT (
  revenue FOR quarter IN (q1, q2, q3, q4)
) AS u;

-- Gotcha: all source columns must share the same data type
-- NULLs are dropped by default; use UNION ALL to preserve them
UNION ALL unpivot works on all engines and preserves NULLs:
SELECT product, 'Q1' AS quarter, q1 AS revenue
FROM quarterly_sales
UNION ALL
SELECT product, 'Q2', q2 FROM quarterly_sales
UNION ALL
SELECT product, 'Q3', q3 FROM quarterly_sales
UNION ALL
SELECT product, 'Q4', q4 FROM quarterly_sales
ORDER BY product, quarter;
Snowflake LATERAL FLATTEN alternative for large column counts:
SELECT
  product,
  f.value:"quarter"::STRING AS quarter,
  f.value:"revenue"::INT    AS revenue
FROM quarterly_sales,
LATERAL FLATTEN(
  input => ARRAY_CONSTRUCT(
    OBJECT_CONSTRUCT('quarter','Q1','revenue',q1),
    OBJECT_CONSTRUCT('quarter','Q2','revenue',q2),
    OBJECT_CONSTRUCT('quarter','Q3','revenue',q3),
    OBJECT_CONSTRUCT('quarter','Q4','revenue',q4)
  )
) f;

Pivot Interview Questions

Q1: Given a key-value table (user_id, attribute_name, attribute_value), pivot it so each attribute becomes its own column.

What they test: The most common pivot question. They want CASE WHEN + MAX + GROUP BY. If you use native PIVOT syntax, expect a follow-up: 'Now do it without the PIVOT keyword.' Approach: MAX(CASE WHEN attribute_name = 'email' THEN attribute_value END) for each attribute, grouped by user_id. MAX works because each user has at most one value per attribute. If duplicates exist, clarify the dedup strategy before writing.

Q2: A metrics table stores (metric_name, month, value). Create a report with one row per metric and columns for Jan through Dec.

What they test: Scale of the pivot. Twelve CASE WHEN branches is tedious but correct. The interviewer checks if you stay organized or make typos under pressure. Approach: SUM(CASE WHEN month = 'Jan' THEN value ELSE 0 END) AS jan, repeated for each month. GROUP BY metric_name. Use ELSE 0 so missing months show zero, not NULL.

Q3: You have a wide table with day-of-week columns (mon_hours, tue_hours, ...). Normalize it into (employee_id, day_name, hours).

What they test: UNPIVOT recognition. Many candidates only practice row-to-column pivoting and freeze when the direction reverses. Approach: UNPIVOT if the engine supports it, otherwise UNION ALL with a literal label per day. Mention that all source columns must share the same type.

Q4: Write a query that counts orders by status (pending, shipped, delivered, returned) as separate columns in a single row.

What they test: Conditional aggregation with COUNT. This is a pivot disguised as simple aggregation. The trap: COUNT(CASE WHEN status = 'pending' THEN 1 END) counts non-NULL entries. SUM with ELSE 0 totals values. Both work, but they behave differently with edge cases. Approach: COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending_count. Repeat per status. No GROUP BY needed for a single summary row.

Q5: The product team adds categories frequently. How do you pivot when the values aren't known at query time?

What they test: Dynamic pivot awareness. They don't expect you to write dynamic SQL on a whiteboard. They want to hear that standard SQL can't do it, and that you know how production systems handle it. Approach: Acknowledge the static schema limitation. Propose generating the query in application code or a dbt Jinja macro. Mention Snowflake's subquery-in-PIVOT if the engine is relevant. Alternatively, return key-value rows and let the BI tool pivot at render time.

Q6: Pivot a table that needs two different aggregations: both SUM of revenue and COUNT of orders per category.

What they test: Multi-aggregate pivot. Native PIVOT only supports one aggregate per PIVOT clause. Candidates who only know the keyword get stuck here. Approach: Use CASE WHEN for both: SUM(CASE WHEN ...) for revenue and COUNT(CASE WHEN ...) for orders, all in the same SELECT with a single GROUP BY. This is where CASE WHEN beats native PIVOT syntax, since PIVOT only allows one aggregate function at a time.

Common PIVOT Mistakes

These trip up candidates in live interviews. Each one is easy to avoid if you know it's coming.

Forgetting ELSE 0 in a SUM Pivot

SUM(CASE WHEN region = 'US' THEN revenue END) returns NULL when no US rows exist for a group. Add ELSE 0 to get a zero instead. NULLs break downstream math and produce confusing report output.

-- Bad: returns NULL
SUM(CASE WHEN region = 'US'
    THEN revenue END)

-- Good: returns 0
SUM(CASE WHEN region = 'US'
    THEN revenue ELSE 0 END)

Extra Columns in PIVOT's Inner Subquery

In SQL Server and Snowflake, any column in the PIVOT subquery that isn't the pivot column or the value column becomes an implicit GROUP BY key. If your inner query selects *, a timestamp or ID column forces one row per original row instead of one row per group.

-- Bad: extra columns = extra groups
SELECT * FROM sales
PIVOT (...) AS p;

-- Good: limit to needed columns
SELECT * FROM (
  SELECT product, quarter, revenue
  FROM sales
) AS src
PIVOT (...) AS p;

Missing GROUP BY After Adding CASE WHEN Columns

When you add SUM(CASE WHEN ...) columns, every non-aggregated column in SELECT must be in GROUP BY. Under pressure, candidates add the pivot columns but forget to update the GROUP BY clause. The error message is clear, but the fix costs you time and confidence.

Assuming PIVOT Is Portable

PIVOT works in SQL Server, Snowflake, Oracle, and (since 2022) BigQuery. It does not exist in PostgreSQL, MySQL, SQLite, or Redshift. If the interviewer says 'PostgreSQL,' don't reach for PIVOT. Go straight to CASE WHEN. If they don't specify an engine, CASE WHEN is always the safe choice.

Frequently asked questions

What does SQL PIVOT do?+
PIVOT rotates unique values from one column into multiple output columns, applying an aggregate function in the process. If your table has rows for Q1, Q2, Q3, Q4, PIVOT produces four separate columns with aggregated values. It turns a tall, narrow result into a wide, human-readable cross-tab.
Which databases support the PIVOT keyword?+
SQL Server, Oracle, and Snowflake support PIVOT as a keyword. PostgreSQL, MySQL, SQLite, BigQuery, and Redshift do not. For those engines, you write the same transformation manually using CASE WHEN inside aggregate functions grouped by the row identifier.
How do I pivot in PostgreSQL without PIVOT?+
Use conditional aggregation: SUM(CASE WHEN category = 'X' THEN value ELSE 0 END) AS x_total, repeated for each category. GROUP BY the row identifier. PostgreSQL also offers the crosstab() function from the tablefunc extension, but most interviewers expect the CASE WHEN approach because it works on every engine.
What is the difference between PIVOT and UNPIVOT?+
PIVOT converts rows into columns (wide output). UNPIVOT converts columns into rows (tall output). They're inverse operations. PIVOT is for dashboards and reports. UNPIVOT is for normalizing wide CSV exports or denormalized tables back into a shape suitable for joins and aggregation.
Can I PIVOT on values that aren't known at query time?+
Standard SQL requires column names to be fixed before execution, so you can't write a single static query that adapts to new values. The workaround is dynamic SQL: query the distinct values first, build the CASE WHEN expressions programmatically, then execute the generated query. Snowflake supports a subquery inside the PIVOT IN clause, which is the closest to true dynamic pivot in pure SQL.

Twenty years of PIVOT, one hour to prove it

The keyword turned twenty last year. The CASE WHEN pattern it replaced is older still and hasn't lost a step. Write both against a real engine and you'll stop confusing their histories at 2 a.m. the night before an interview.

More reading