SQL Practice

SQL GROUP BY and HAVING Practice

Aggregation is the single most tested SQL category, appearing in roughly 1 out of every 4 SQL interview questions. GROUP BY is the most-tested individual concept across all data engineering interviews. Five problems covering GROUP BY, HAVING, CASE WHEN inside aggregates, and multi-column grouping at interview difficulty.

#1
Most-Tested SQL Category
25%
Of SQL Interview Questions
5
Practice Problems Below

What You Need to Know

GROUP BY is the foundation of every analytical query. Based on DataDriven's analysis of verified interview data, aggregation questions outnumber JOINs, window functions, and every other SQL category. These four areas cover what interviewers test.

GROUP BY Basics

Core

GROUP BY collapses rows that share the same value(s) into a single output row. Every column in your SELECT must either appear in GROUP BY or be wrapped in an aggregate function (COUNT, SUM, AVG, MIN, MAX). This rule is enforced by every SQL engine and is the #1 source of GROUP BY errors.

SELECT department, COUNT(*) AS headcount FROM employees GROUP BY department

HAVING Filters

Core

HAVING filters groups after aggregation. WHERE filters rows before aggregation. This distinction is fundamental. If you need to find departments with more than 10 employees, you cannot put COUNT(*) > 10 in WHERE because the count has not been computed yet. HAVING exists for exactly this purpose.

SELECT department, AVG(salary) FROM employees GROUP BY department HAVING AVG(salary) > 80000

Conditional Aggregation with CASE WHEN

Core

CASE WHEN inside an aggregate function lets you count or sum only rows that meet a condition. This replaces multiple self-joins or subqueries. It is one of the most useful SQL patterns and appears in nearly every analytical interview. The classic form: SUM(CASE WHEN condition THEN 1 ELSE 0 END).

SELECT department, SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) AS active_count, SUM(CASE WHEN status = 'churned' THEN 1 ELSE 0 END) AS churned_count FROM users GROUP BY department

Multi-Column Grouping

Core

GROUP BY multiple columns creates a group for each unique combination. GROUP BY year, quarter gives you one row per quarter per year. The order of columns in GROUP BY does not affect the result, but it affects readability. Put the broader grouping first (year before quarter, country before city).

SELECT year, quarter, region, SUM(revenue) FROM sales GROUP BY year, quarter, region

5 GROUP BY Practice Problems

Each problem adds a layer of difficulty. By problem 5, you are combining conditional aggregation with year-over-year calculations.

1

Revenue Per Category

Warm-up

Given an orders table with product_category and amount, calculate total revenue and order count per category. Sort by revenue descending. This is the baseline GROUP BY problem. If you cannot write this in under a minute, spend more time on fundamentals.

Skills tested: Basic GROUP BY, SUM, COUNT, ORDER BY

2

Departments Above Threshold

Intermediate

Find departments where the average salary exceeds $75,000 AND the headcount is at least 5. Return department name, average salary, and headcount. This tests whether you know the difference between WHERE and HAVING. Both conditions must go in HAVING because they reference aggregates.

Skills tested: HAVING with multiple conditions, AVG, COUNT

3

Monthly Active vs Churned Users

Intermediate

Given a user_events table with user_id, event_date, and event_type, calculate for each month: total unique users, users who had a purchase event, and users who had zero purchase events. Use conditional aggregation. One query, no subqueries, no self-joins.

Skills tested: COUNT(DISTINCT), conditional aggregation with CASE WHEN, date truncation

4

Conversion Funnel by Source

Advanced

Given a funnel_events table with user_id, event_stage (visit, signup, trial, paid), and source (organic, paid, referral), build a funnel showing the count at each stage per source. Use conditional aggregation to pivot stages into columns. Calculate the conversion rate from one stage to the next.

Skills tested: Multi-column GROUP BY, conditional aggregation, calculated columns, pivoting

5

Year-over-Year Growth by Segment

Advanced

Given a revenue table with date, segment, and amount, calculate for each segment: current year revenue, prior year revenue, and YoY growth percentage. Do this in a single query using conditional aggregation on the year. Handle segments that exist in one year but not the other.

Skills tested: Conditional aggregation across years, COALESCE for missing data, percentage calculations

Aggregation Mistakes That Cost You the Interview

These four errors are responsible for most GROUP BY failures in technical screens. Each one is easy to fix once you know the pattern.

Trap #1

Selecting a non-aggregated column not in GROUP BY

SELECT name, department, COUNT(*) FROM employees GROUP BY department. Which name does SQL pick when there are 50 employees per department? MySQL silently picks an arbitrary one. PostgreSQL, Trino, and BigQuery reject the query. Always aggregate or group every column in SELECT.

Trap #2

Putting aggregate conditions in WHERE instead of HAVING

WHERE COUNT(*) > 5 does not work. COUNT has not been computed when WHERE runs. WHERE filters individual rows before grouping. HAVING filters groups after aggregation. This comes up in interviews as a quick filter question.

Trap #3

Double-counting with JOINs before GROUP BY

If you join orders to order_items before grouping by customer, each order gets multiplied by its item count. SUM(amount) is now wrong. Either aggregate first then join, or use DISTINCT inside the aggregate: COUNT(DISTINCT order_id).

Trap #4

Forgetting NULLs form their own group

GROUP BY treats all NULLs as the same group. If your department column has NULLs, you get a row for the NULL group. This is correct SQL behavior, but it surprises people. Use COALESCE(department, 'Unknown') if you want a readable label.

Worked Example: Departments Above a Salary and Headcount Threshold

Find departments where the average salary exceeds $75,000 and the headcount is at least 5. Return the department name, average salary, and employee count.

SELECT
  d.name                          AS department,
  COUNT(*)                        AS headcount,
  ROUND(AVG(e.salary), 2)         AS avg_salary
FROM employees e
JOIN departments d ON d.id = e.department_id
GROUP BY d.id, d.name
HAVING COUNT(*) >= 5
   AND AVG(e.salary) > 75000
ORDER BY avg_salary DESC;

Both conditions go in HAVING because they reference aggregate results. Putting COUNT(*) >= 5 in WHERE would cause a syntax error since the count has not been computed at the WHERE stage. The GROUP BY includes d.id to handle departments that share a name.

Expected output

 department  | headcount | avg_salary
-------------+-----------+-----------
 Engineering  |        12 |  105833.33
 Product      |         7 |   92142.86
 Data Science |         5 |   88400.00

GROUP BY FAQ

What is the difference between WHERE and HAVING in SQL?+
WHERE filters individual rows before grouping. HAVING filters groups after aggregation. Use WHERE for conditions on raw column values (e.g., WHERE status = 'active'). Use HAVING for conditions on aggregate results (e.g., HAVING COUNT(*) > 10). You can use both in the same query: WHERE runs first, then GROUP BY, then HAVING.
Can I use column aliases in GROUP BY or HAVING?+
It depends on the database. MySQL and BigQuery allow column aliases in GROUP BY. PostgreSQL, Trino, and SQL Server do not. HAVING follows the same rules. For maximum portability, repeat the expression instead of using the alias. In interviews, repeating the expression is always safe.
How does GROUP BY handle NULL values?+
All NULL values are grouped together into a single group. This follows the SQL standard. If you GROUP BY a nullable column, you will get one row where that column is NULL, representing all rows with NULL values. Use COALESCE to replace NULLs with a default value before grouping if needed.
What is conditional aggregation?+
Conditional aggregation uses CASE WHEN inside an aggregate function to count or sum only rows meeting a condition. Example: SUM(CASE WHEN region = 'US' THEN revenue ELSE 0 END) AS us_revenue. This lets you pivot data and compute multiple filtered metrics in a single pass, avoiding multiple subqueries or self-joins.

Aggregate Under Interview Pressure

GROUP BY looks simple until you add HAVING, conditional aggregation, and multi-table joins. 32.7% of SQL interview rounds happen at the phone screen stage, where aggregation is the most common topic.