Imagine you're a data analyst at a retail company, and your manager asks for a single report showing total sales broken down by region — but instead of separate rows for each region, she wants East, West, and Central as their own columns, side by side, so she can compare them at a glance. Your data lives in a long, transactional table: one row per sale, with a region column that says "East," "West," or "Central." Your first instinct might be to export to Excel and pivot it there, or search for some exotic SQL function. But there's a cleaner way — and it lives entirely inside a standard SQL query.
Conditional aggregation is the technique of combining a CASE WHEN expression inside an aggregate function like SUM() or COUNT(). The result: you can create pivot-style column layouts, compute segmented metrics, and answer multi-part business questions in a single query — without restructuring your database, without exporting to a spreadsheet, and without using vendor-specific pivot syntax. Once you understand how this works, you'll reach for it constantly.
By the end of this lesson, you will be able to read and write conditional aggregation queries with confidence. You'll understand the mechanics from first principles, know the common traps to avoid, and have a set of realistic patterns you can adapt immediately.
What you'll learn:
CASE WHEN expressions work as a filtering mechanism inside aggregationsSUM(CASE WHEN ...) and COUNT(CASE WHEN ...) to segment data into columnsPIVOT syntax or reshaping your dataGROUP BY for multi-dimensional summariesYou should be comfortable with:
SELECT, FROM, WHERE queriesSUM, COUNT, AVG, MAX, MIN)GROUP BY clauses — you know how to group rows and summarize themYou do not need to know anything about CASE WHEN yet. We'll build that up from scratch.
Before we aggregate anything, let's understand the tool itself.
A CASE WHEN expression is SQL's way of writing conditional logic — essentially an if/then/else statement that you can embed directly in a query. The syntax looks like this:
CASE
WHEN condition THEN result
WHEN other_condition THEN other_result
ELSE fallback_result
END
Let's make this concrete. Suppose you have a sales table with the following structure:
sale_id | rep_name | region | product | amount
--------|------------|---------|------------|--------
1001 | Alice | East | Widget A | 450.00
1002 | Bob | West | Widget B | 320.00
1003 | Alice | East | Widget C | 210.00
1004 | Carol | Central | Widget A | 580.00
1005 | Bob | West | Widget A | 415.00
1006 | Carol | Central | Widget B | 130.00
1007 | Alice | East | Widget B | 300.00
A simple CASE WHEN on this table could label each sale by size:
SELECT
sale_id,
amount,
CASE
WHEN amount >= 500 THEN 'Large'
WHEN amount >= 300 THEN 'Medium'
ELSE 'Small'
END AS sale_size
FROM sales;
This returns a new column called sale_size computed row by row. The CASE WHEN fires once for each row, evaluates the conditions in order, and returns the first match.
That's useful on its own — but notice what happens when you nest this inside an aggregate function. That's where the real power unlocks.
Here's the insight that makes conditional aggregation tick: aggregate functions ignore NULL values.
When SUM() encounters a NULL, it skips it. When COUNT() encounters a NULL, it skips it too. This isn't a quirk to work around — it's the mechanism we're going to exploit on purpose.
The trick is this: write a CASE WHEN inside a SUM() or COUNT() that returns the value you care about when the condition matches, and returns NULL when it doesn't. The aggregate function will naturally add up (or count) only the rows that matched.
Here's the pattern:
SUM(CASE WHEN condition THEN value ELSE NULL END)
Because ELSE NULL is the default when you omit the ELSE clause entirely, you can also write:
SUM(CASE WHEN condition THEN value END)
Both are equivalent. Let's use this on our sales table to get total sales by region — but as columns instead of rows.
Without conditional aggregation, a standard GROUP BY gives you this:
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;
Result:
region | total_sales
---------|------------
East | 960.00
West | 735.00
Central | 710.00
Three rows, one per region. That's fine, but your manager wanted columns. Here's the conditional aggregation version:
SELECT
SUM(CASE WHEN region = 'East' THEN amount END) AS east_sales,
SUM(CASE WHEN region = 'West' THEN amount END) AS west_sales,
SUM(CASE WHEN region = 'Central' THEN amount END) AS central_sales
FROM sales;
Result:
east_sales | west_sales | central_sales
-----------|------------|---------------
960.00 | 735.00 | 710.00
One row, three columns. Let's trace through exactly what happens for east_sales:
SUM() adds up: 450 + NULL + 210 + NULL + NULL + NULL + 300 = 960. The NULLs are silently skipped. The math works out exactly the same as filtering with WHERE region = 'East' — but you're doing all three regions simultaneously in a single pass over the data.
Why this matters for performance: A single scan of your table is almost always faster than running three separate filtered queries. On large datasets, this difference can be substantial.
The technique gets even more powerful when you combine it with GROUP BY. Now you can add a second dimension to your analysis.
Let's say you want sales by region and by product — still as a wide, column-oriented report:
SELECT
product,
SUM(CASE WHEN region = 'East' THEN amount END) AS east_sales,
SUM(CASE WHEN region = 'West' THEN amount END) AS west_sales,
SUM(CASE WHEN region = 'Central' THEN amount END) AS central_sales,
SUM(amount) AS total_sales
FROM sales
GROUP BY product
ORDER BY product;
Result:
product | east_sales | west_sales | central_sales | total_sales
---------|------------|------------|---------------|------------
Widget A | 450.00 | 415.00 | 580.00 | 1445.00
Widget B | 300.00 | 320.00 | 130.00 | 750.00
Widget C | 210.00 | NULL | NULL | 210.00
Now you have a genuine pivot table: products as rows, regions as columns. Notice that Widget C has NULL for West and Central — that's because no Widget C sales occurred in those regions. This is honest: the data really isn't there.
If you'd prefer to show 0 instead of NULL (which is often better for readability), wrap the whole expression in COALESCE():
COALESCE(SUM(CASE WHEN region = 'East' THEN amount END), 0) AS east_sales
COALESCE() returns its first non-NULL argument. So if the SUM() produces NULL (because no rows matched), it returns 0 instead.
Tip: Use
0for financial and quantity columns where "no data" genuinely means zero transactions. LeaveNULLwhen you need to distinguish "no data" from "the value was actually zero." These are different things, and collapsing them withCOALESCEdiscards information.
Not every question is about totaling a dollar amount. Sometimes you want to count how many things happened — and COUNT works the same way.
Let's count how many sales each rep made in each region:
SELECT
rep_name,
COUNT(CASE WHEN region = 'East' THEN 1 END) AS east_count,
COUNT(CASE WHEN region = 'West' THEN 1 END) AS west_count,
COUNT(CASE WHEN region = 'Central' THEN 1 END) AS central_count,
COUNT(*) AS total_count
FROM sales
GROUP BY rep_name
ORDER BY rep_name;
Result:
rep_name | east_count | west_count | central_count | total_count
---------|------------|------------|---------------|------------
Alice | 3 | 0 | 0 | 3
Bob | 0 | 2 | 0 | 2
Carol | 0 | 0 | 2 | 2
When the condition matches, we return 1. When it doesn't, CASE WHEN returns NULL. COUNT() counts non-NULL values, so it only counts the matching rows.
Important distinction:
COUNT(expression)counts non-NULL values.COUNT(*)counts all rows. When you writeCOUNT(CASE WHEN ... THEN 1 END), you're usingCOUNT(expression)— which gives you conditional counting. If you wroteCOUNT(CASE WHEN ... THEN 1 ELSE 0 END), you'd accidentally count every single row because0is not NULL. This is one of the most common mistakes in conditional aggregation.
Let's move beyond a simple pivot and see how conditional aggregation handles a genuinely complex business question.
Suppose you work at a subscription company and you have an orders table:
order_id | customer_id | order_date | status | revenue
---------|-------------|-------------|-----------|--------
...
Your task: for each customer, show their total revenue, how much came from completed orders versus refunded orders, and how many orders they placed this year versus last year.
SELECT
customer_id,
COUNT(*) AS total_orders,
SUM(revenue) AS total_revenue,
SUM(CASE WHEN status = 'completed' THEN revenue END) AS completed_revenue,
SUM(CASE WHEN status = 'refunded' THEN revenue END) AS refunded_revenue,
COUNT(CASE WHEN YEAR(order_date) = 2024 THEN 1 END) AS orders_2024,
COUNT(CASE WHEN YEAR(order_date) = 2023 THEN 1 END) AS orders_2023,
AVG(CASE WHEN status = 'completed' THEN revenue END) AS avg_completed_order
FROM orders
GROUP BY customer_id
ORDER BY total_revenue DESC;
This query would take six separate filtered queries to replicate otherwise. Here, it's one. Every column is a different lens on the same dataset, computed in a single pass.
Notice AVG(CASE WHEN ...) — you can use conditional aggregation with any aggregate function, not just SUM and COUNT. AVG will compute the average only over rows where the condition matched, because it ignores NULLs in the denominator too.
Warning:
AVGignores NULLs entirely — both in the numerator and the denominator.AVG(CASE WHEN status = 'completed' THEN revenue END)is the average revenue among completed orders only, not the average across all orders with non-completed ones counted as zero. Make sure that's what you mean.
The biggest source of bugs in conditional aggregation is mishandling NULL.
Consider this scenario: you're trying to count customers who are "active" (have placed an order in the last 30 days) versus "inactive":
-- WRONG: this counts every row for active, and every row for inactive
SELECT
COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 ELSE 0 END) AS active_customers,
COUNT(CASE WHEN days_since_last_order > 30 THEN 1 ELSE 0 END) AS inactive_customers
FROM customers;
Both columns will return the total number of rows because you used ELSE 0, and COUNT counts any non-NULL value including 0.
-- CORRECT: return NULL (not 0) for the non-matching case
SELECT
COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 END) AS active_customers,
COUNT(CASE WHEN days_since_last_order > 30 THEN 1 END) AS inactive_customers
FROM customers;
Or equivalently:
COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 ELSE NULL END) AS active_customers
Both work. The key is: don't put a non-NULL value in the ELSE branch of a COUNT.
For SUM, this matters less — ELSE 0 in a SUM is mathematically fine because adding zeros doesn't change the total. But it's still cleaner to omit the ELSE and let it default to NULL.
Work through this exercise using any SQL environment you have access to — PostgreSQL, MySQL, SQLite, or even an online playground like db-fiddle.com.
Setup: Create and populate this table:
CREATE TABLE support_tickets (
ticket_id INT,
agent_name VARCHAR(50),
category VARCHAR(50),
priority VARCHAR(20),
resolved BOOLEAN,
handle_time INT -- minutes to resolution
);
INSERT INTO support_tickets VALUES
(1, 'Dana', 'Billing', 'High', true, 12),
(2, 'Dana', 'Technical', 'Low', false, 45),
(3, 'Evan', 'Billing', 'Medium', true, 20),
(4, 'Evan', 'Technical', 'High', true, 35),
(5, 'Dana', 'Technical', 'High', true, 28),
(6, 'Fiona', 'Billing', 'Low', true, 15),
(7, 'Fiona', 'Technical', 'Medium', false, 60),
(8, 'Evan', 'Billing', 'High', true, 10),
(9, 'Dana', 'Billing', 'Medium', false, 50),
(10, 'Fiona', 'Technical', 'High', true, 22);
Your tasks:
Write a query that shows, for each agent_name: total tickets handled, number of Billing tickets, number of Technical tickets, and overall resolution rate as a percentage (resolved tickets / total tickets * 100).
Extend that query to also show average handle time for High priority tickets vs. Low/Medium priority tickets separately.
Add a column that shows the count of unresolved High priority tickets — the ones your manager should worry about most.
Expected output structure (not values):
agent_name | total | billing | technical | resolve_pct | avg_high_time | avg_other_time | unresolved_high
Take your time. The first query is straightforward. The second and third require thinking carefully about which conditions to use and whether to use COUNT or SUM or AVG. Check your NULL handling.
Mistake 1: Using ELSE 0 inside COUNT
As covered above, this makes COUNT count every row instead of just the matching ones. Only use ELSE 0 inside SUM. Inside COUNT, omit the ELSE or use ELSE NULL.
Mistake 2: Forgetting GROUP BY when you need it
If you want per-agent or per-customer results, you need a GROUP BY. Without it, your conditional aggregations collapse to a single row (the grand total). This is correct behavior — just not what you usually want.
Mistake 3: Confusing NULL in the data with NULL from non-matching CASE WHEN
If your source data already contains NULLs in the column you're aggregating, SUM(CASE WHEN ... THEN amount END) will skip those rows even when the condition matches (because the value being summed is NULL). Watch for this when dealing with nullable columns.
Mistake 4: Overlapping conditions producing double-counting If your CASE WHEN conditions aren't mutually exclusive, a single row can contribute to multiple columns. For example, if "High" and "Urgent" both match rows with priority = 'High', and a row could satisfy both, you'll count it twice. Make sure your segment definitions don't overlap.
Mistake 5: Wrong aggregate for the question
SUM(CASE WHEN resolved = true THEN 1 END) and COUNT(CASE WHEN resolved = true THEN 1 END) both return the count of resolved tickets — but they look different, and people sometimes mix up which one to use when. For counting conditional rows, both work. For summing values conditionally, only SUM makes sense. Pick the one that matches your mental model and be consistent.
Conditional aggregation with CASE WHEN is one of the highest-leverage techniques in SQL. You've learned:
CASE WHEN is a row-level conditional expression that returns different values depending on which condition matchesCASE WHEN inside SUM(), COUNT(), or AVG() lets you selectively aggregate only the rows that match a conditionGROUP BY creates genuine pivot-table-style reports entirely in SQLCOALESCE() wraps the outer result to replace NULL with zero when appropriateELSE 0 inside COUNT() is a common bug — omit the ELSE or use ELSE NULL insteadThe pattern you've learned here scales from simple two-column pivots to deeply complex analytical queries across millions of rows. Every time you find yourself thinking "I need to filter, then aggregate, then combine three queries," conditional aggregation is probably the better path.
Where to go next:
SUM(amount) FILTER (WHERE region = 'East') — that does the same thing with less visual noise.CASE WHEN approach.Learning Path: Advanced SQL Queries