Conditional Aggregation with CASE WHEN: Pivoting Logic Without Reshaping Your Data

Introduction

Imagine you're a data analyst at a retail company, and your manager asks for a single report showing total sales broken down by region — but instead of separate rows for each region, she wants East, West, and Central as their own columns, side by side, so she can compare them at a glance. Your data lives in a long, transactional table: one row per sale, with a region column that says "East," "West," or "Central." Your first instinct might be to export to Excel and pivot it there, or search for some exotic SQL function. But there's a cleaner way — and it lives entirely inside a standard SQL query.

Conditional aggregation is the technique of combining a CASE WHEN expression inside an aggregate function like SUM() or COUNT(). The result: you can create pivot-style column layouts, compute segmented metrics, and answer multi-part business questions in a single query — without restructuring your database, without exporting to a spreadsheet, and without using vendor-specific pivot syntax. Once you understand how this works, you'll reach for it constantly.

By the end of this lesson, you will be able to read and write conditional aggregation queries with confidence. You'll understand the mechanics from first principles, know the common traps to avoid, and have a set of realistic patterns you can adapt immediately.

What you'll learn:

How CASE WHEN expressions work as a filtering mechanism inside aggregations
How to use SUM(CASE WHEN ...) and COUNT(CASE WHEN ...) to segment data into columns
How to build pivot-style reports without PIVOT syntax or reshaping your data
How to handle NULLs properly so your numbers don't silently disappear
How to combine conditional aggregation with GROUP BY for multi-dimensional summaries

Prerequisites

You should be comfortable with:

Basic SELECT, FROM, WHERE queries
The concept of aggregate functions (SUM, COUNT, AVG, MAX, MIN)
GROUP BY clauses — you know how to group rows and summarize them

You do not need to know anything about CASE WHEN yet. We'll build that up from scratch.

Understanding CASE WHEN: SQL's Built-In If/Then Logic

Before we aggregate anything, let's understand the tool itself.

A CASE WHEN expression is SQL's way of writing conditional logic — essentially an if/then/else statement that you can embed directly in a query. The syntax looks like this:

CASE
    WHEN condition THEN result
    WHEN other_condition THEN other_result
    ELSE fallback_result
END

Let's make this concrete. Suppose you have a sales table with the following structure:

sale_id | rep_name   | region  | product    | amount
--------|------------|---------|------------|--------
1001    | Alice      | East    | Widget A   | 450.00
1002    | Bob        | West    | Widget B   | 320.00
1003    | Alice      | East    | Widget C   | 210.00
1004    | Carol      | Central | Widget A   | 580.00
1005    | Bob        | West    | Widget A   | 415.00
1006    | Carol      | Central | Widget B   | 130.00
1007    | Alice      | East    | Widget B   | 300.00

A simple CASE WHEN on this table could label each sale by size:

SELECT
    sale_id,
    amount,
    CASE
        WHEN amount >= 500 THEN 'Large'
        WHEN amount >= 300 THEN 'Medium'
        ELSE 'Small'
    END AS sale_size
FROM sales;

This returns a new column called sale_size computed row by row. The CASE WHEN fires once for each row, evaluates the conditions in order, and returns the first match.

That's useful on its own — but notice what happens when you nest this inside an aggregate function. That's where the real power unlocks.

The Core Idea: Filtering Inside an Aggregation

Here's the insight that makes conditional aggregation tick: aggregate functions ignore NULL values.

When SUM() encounters a NULL, it skips it. When COUNT() encounters a NULL, it skips it too. This isn't a quirk to work around — it's the mechanism we're going to exploit on purpose.

The trick is this: write a CASE WHEN inside a SUM() or COUNT() that returns the value you care about when the condition matches, and returns NULL when it doesn't. The aggregate function will naturally add up (or count) only the rows that matched.

Here's the pattern:

SUM(CASE WHEN condition THEN value ELSE NULL END)

Because ELSE NULL is the default when you omit the ELSE clause entirely, you can also write:

SUM(CASE WHEN condition THEN value END)

Both are equivalent. Let's use this on our sales table to get total sales by region — but as columns instead of rows.

Your First Pivot: Total Sales by Region as Columns

Without conditional aggregation, a standard GROUP BY gives you this:

SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;

Result:

region   | total_sales
---------|------------
East     | 960.00
West     | 735.00
Central  | 710.00

Three rows, one per region. That's fine, but your manager wanted columns. Here's the conditional aggregation version:

SELECT
    SUM(CASE WHEN region = 'East'    THEN amount END) AS east_sales,
    SUM(CASE WHEN region = 'West'    THEN amount END) AS west_sales,
    SUM(CASE WHEN region = 'Central' THEN amount END) AS central_sales
FROM sales;

Result:

east_sales | west_sales | central_sales
-----------|------------|---------------
960.00     | 735.00     | 710.00

One row, three columns. Let's trace through exactly what happens for east_sales:

For sale 1001 (East, $450): condition is true → returns 450
For sale 1002 (West, $320): condition is false → returns NULL
For sale 1003 (East, $210): condition is true → returns 210
...and so on

SUM() adds up: 450 + NULL + 210 + NULL + NULL + NULL + 300 = 960. The NULLs are silently skipped. The math works out exactly the same as filtering with WHERE region = 'East' — but you're doing all three regions simultaneously in a single pass over the data.

Why this matters for performance: A single scan of your table is almost always faster than running three separate filtered queries. On large datasets, this difference can be substantial.

Adding GROUP BY: Multi-Dimensional Summaries

The technique gets even more powerful when you combine it with GROUP BY. Now you can add a second dimension to your analysis.

Let's say you want sales by region and by product — still as a wide, column-oriented report:

SELECT
    product,
    SUM(CASE WHEN region = 'East'    THEN amount END) AS east_sales,
    SUM(CASE WHEN region = 'West'    THEN amount END) AS west_sales,
    SUM(CASE WHEN region = 'Central' THEN amount END) AS central_sales,
    SUM(amount)                                        AS total_sales
FROM sales
GROUP BY product
ORDER BY product;

Result:

product  | east_sales | west_sales | central_sales | total_sales
---------|------------|------------|---------------|------------
Widget A | 450.00     | 415.00     | 580.00        | 1445.00
Widget B | 300.00     | 320.00     | 130.00        | 750.00
Widget C | 210.00     | NULL       | NULL          | 210.00

Now you have a genuine pivot table: products as rows, regions as columns. Notice that Widget C has NULL for West and Central — that's because no Widget C sales occurred in those regions. This is honest: the data really isn't there.

If you'd prefer to show 0 instead of NULL (which is often better for readability), wrap the whole expression in COALESCE():

COALESCE(SUM(CASE WHEN region = 'East' THEN amount END), 0) AS east_sales

COALESCE() returns its first non-NULL argument. So if the SUM() produces NULL (because no rows matched), it returns 0 instead.

Tip: Use 0 for financial and quantity columns where "no data" genuinely means zero transactions. Leave NULL when you need to distinguish "no data" from "the value was actually zero." These are different things, and collapsing them with COALESCE discards information.

Using COUNT Instead of SUM

Not every question is about totaling a dollar amount. Sometimes you want to count how many things happened — and COUNT works the same way.

Let's count how many sales each rep made in each region:

SELECT
    rep_name,
    COUNT(CASE WHEN region = 'East'    THEN 1 END) AS east_count,
    COUNT(CASE WHEN region = 'West'    THEN 1 END) AS west_count,
    COUNT(CASE WHEN region = 'Central' THEN 1 END) AS central_count,
    COUNT(*)                                        AS total_count
FROM sales
GROUP BY rep_name
ORDER BY rep_name;

Result:

rep_name | east_count | west_count | central_count | total_count
---------|------------|------------|---------------|------------
Alice    | 3          | 0          | 0             | 3
Bob      | 0          | 2          | 0             | 2
Carol    | 0          | 0          | 2             | 2

When the condition matches, we return 1. When it doesn't, CASE WHEN returns NULL. COUNT() counts non-NULL values, so it only counts the matching rows.

Important distinction: COUNT(expression) counts non-NULL values. COUNT(*) counts all rows. When you write COUNT(CASE WHEN ... THEN 1 END), you're using COUNT(expression) — which gives you conditional counting. If you wrote COUNT(CASE WHEN ... THEN 1 ELSE 0 END), you'd accidentally count every single row because 0 is not NULL. This is one of the most common mistakes in conditional aggregation.

A More Realistic Pattern: Categorizing Customer Behavior

Let's move beyond a simple pivot and see how conditional aggregation handles a genuinely complex business question.

Suppose you work at a subscription company and you have an orders table:

order_id | customer_id | order_date  | status    | revenue
---------|-------------|-------------|-----------|--------
...

Your task: for each customer, show their total revenue, how much came from completed orders versus refunded orders, and how many orders they placed this year versus last year.

SELECT
    customer_id,
    COUNT(*)                                                        AS total_orders,
    SUM(revenue)                                                    AS total_revenue,
    SUM(CASE WHEN status = 'completed' THEN revenue END)            AS completed_revenue,
    SUM(CASE WHEN status = 'refunded'  THEN revenue END)            AS refunded_revenue,
    COUNT(CASE WHEN YEAR(order_date) = 2024 THEN 1 END)             AS orders_2024,
    COUNT(CASE WHEN YEAR(order_date) = 2023 THEN 1 END)             AS orders_2023,
    AVG(CASE WHEN status = 'completed' THEN revenue END)            AS avg_completed_order
FROM orders
GROUP BY customer_id
ORDER BY total_revenue DESC;

This query would take six separate filtered queries to replicate otherwise. Here, it's one. Every column is a different lens on the same dataset, computed in a single pass.

Notice AVG(CASE WHEN ...) — you can use conditional aggregation with any aggregate function, not just SUM and COUNT. AVG will compute the average only over rows where the condition matched, because it ignores NULLs in the denominator too.

Warning: AVG ignores NULLs entirely — both in the numerator and the denominator. AVG(CASE WHEN status = 'completed' THEN revenue END) is the average revenue among completed orders only, not the average across all orders with non-completed ones counted as zero. Make sure that's what you mean.

The NULL Trap: When Your Numbers Go Silent

The biggest source of bugs in conditional aggregation is mishandling NULL.

Consider this scenario: you're trying to count customers who are "active" (have placed an order in the last 30 days) versus "inactive":

-- WRONG: this counts every row for active, and every row for inactive
SELECT
    COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 ELSE 0 END) AS active_customers,
    COUNT(CASE WHEN days_since_last_order > 30  THEN 1 ELSE 0 END) AS inactive_customers
FROM customers;

Both columns will return the total number of rows because you used ELSE 0, and COUNT counts any non-NULL value including 0.

-- CORRECT: return NULL (not 0) for the non-matching case
SELECT
    COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 END) AS active_customers,
    COUNT(CASE WHEN days_since_last_order > 30  THEN 1 END) AS inactive_customers
FROM customers;

Or equivalently:

    COUNT(CASE WHEN days_since_last_order <= 30 THEN 1 ELSE NULL END) AS active_customers

Both work. The key is: don't put a non-NULL value in the ELSE branch of a COUNT.

For SUM, this matters less — ELSE 0 in a SUM is mathematically fine because adding zeros doesn't change the total. But it's still cleaner to omit the ELSE and let it default to NULL.

Hands-On Exercise

Work through this exercise using any SQL environment you have access to — PostgreSQL, MySQL, SQLite, or even an online playground like db-fiddle.com.

Setup: Create and populate this table:

CREATE TABLE support_tickets (
    ticket_id   INT,
    agent_name  VARCHAR(50),
    category    VARCHAR(50),
    priority    VARCHAR(20),
    resolved    BOOLEAN,
    handle_time INT  -- minutes to resolution
);

INSERT INTO support_tickets VALUES
(1,  'Dana',  'Billing',   'High',   true,  12),
(2,  'Dana',  'Technical', 'Low',    false, 45),
(3,  'Evan',  'Billing',   'Medium', true,  20),
(4,  'Evan',  'Technical', 'High',   true,  35),
(5,  'Dana',  'Technical', 'High',   true,  28),
(6,  'Fiona', 'Billing',   'Low',    true,  15),
(7,  'Fiona', 'Technical', 'Medium', false, 60),
(8,  'Evan',  'Billing',   'High',   true,  10),
(9,  'Dana',  'Billing',   'Medium', false, 50),
(10, 'Fiona', 'Technical', 'High',   true,  22);

Your tasks:

Write a query that shows, for each agent_name: total tickets handled, number of Billing tickets, number of Technical tickets, and overall resolution rate as a percentage (resolved tickets / total tickets * 100).
Extend that query to also show average handle time for High priority tickets vs. Low/Medium priority tickets separately.
Add a column that shows the count of unresolved High priority tickets — the ones your manager should worry about most.

Expected output structure (not values):

agent_name | total | billing | technical | resolve_pct | avg_high_time | avg_other_time | unresolved_high

Take your time. The first query is straightforward. The second and third require thinking carefully about which conditions to use and whether to use COUNT or SUM or AVG. Check your NULL handling.

Common Mistakes & Troubleshooting

Mistake 1: Using ELSE 0 inside COUNT As covered above, this makes COUNT count every row instead of just the matching ones. Only use ELSE 0 inside SUM. Inside COUNT, omit the ELSE or use ELSE NULL.

Mistake 2: Forgetting GROUP BY when you need it If you want per-agent or per-customer results, you need a GROUP BY. Without it, your conditional aggregations collapse to a single row (the grand total). This is correct behavior — just not what you usually want.

Mistake 3: Confusing NULL in the data with NULL from non-matching CASE WHEN If your source data already contains NULLs in the column you're aggregating, SUM(CASE WHEN ... THEN amount END) will skip those rows even when the condition matches (because the value being summed is NULL). Watch for this when dealing with nullable columns.

Mistake 4: Overlapping conditions producing double-counting If your CASE WHEN conditions aren't mutually exclusive, a single row can contribute to multiple columns. For example, if "High" and "Urgent" both match rows with priority = 'High', and a row could satisfy both, you'll count it twice. Make sure your segment definitions don't overlap.

Mistake 5: Wrong aggregate for the question SUM(CASE WHEN resolved = true THEN 1 END) and COUNT(CASE WHEN resolved = true THEN 1 END) both return the count of resolved tickets — but they look different, and people sometimes mix up which one to use when. For counting conditional rows, both work. For summing values conditionally, only SUM makes sense. Pick the one that matches your mental model and be consistent.

Summary & Next Steps

Conditional aggregation with CASE WHEN is one of the highest-leverage techniques in SQL. You've learned:

CASE WHEN is a row-level conditional expression that returns different values depending on which condition matches
Embedding CASE WHEN inside SUM(), COUNT(), or AVG() lets you selectively aggregate only the rows that match a condition
Aggregate functions ignore NULL, and that's the mechanism that makes conditional aggregation work
Combining this with GROUP BY creates genuine pivot-table-style reports entirely in SQL
COALESCE() wraps the outer result to replace NULL with zero when appropriate
Using ELSE 0 inside COUNT() is a common bug — omit the ELSE or use ELSE NULL instead

The pattern you've learned here scales from simple two-column pivots to deeply complex analytical queries across millions of rows. Every time you find yourself thinking "I need to filter, then aggregate, then combine three queries," conditional aggregation is probably the better path.

Where to go next:

Window Functions: The next major technique for analytical SQL. Window functions let you compute running totals, rankings, and lag/lead comparisons without GROUP BY collapsing your rows.
FILTER clause (PostgreSQL/SQLite): A cleaner syntax for conditional aggregation — SUM(amount) FILTER (WHERE region = 'East') — that does the same thing with less visual noise.
Pivot/Unpivot syntax: Now that you understand what pivoting means conceptually, explore your database's native PIVOT operator (available in SQL Server and some others) and understand when it's better or worse than the manual CASE WHEN approach.
Subqueries and CTEs: Learn how to use conditional aggregation as a building block inside larger, multi-step queries using Common Table Expressions.

Conditional Aggregation with CASE WHEN: Pivoting Logic Without Reshaping Your Data

Conditional Aggregation with CASE WHEN: Pivoting Logic Without Reshaping Your Data

Introduction

Prerequisites

Understanding CASE WHEN: SQL's Built-In If/Then Logic

The Core Idea: Filtering Inside an Aggregation

Your First Pivot: Total Sales by Region as Columns

Adding GROUP BY: Multi-Dimensional Summaries

Using COUNT Instead of SUM

A More Realistic Pattern: Categorizing Customer Behavior

The NULL Trap: When Your Numbers Go Silent

Hands-On Exercise

Common Mistakes & Troubleshooting

Summary & Next Steps

Related Articles

Mastering SQL Set Operations: UNION, INTERSECT, and EXCEPT for Complex Data Reconciliation and Deduplication

Temporal Data Mastery: Writing Queries for Time-Series, Date Ranges, and Slowly Changing Dimensions

Subqueries and Correlated Subqueries: Writing Queries Within Queries

Related Articles

SQL🔥 Expert
Mastering SQL Set Operations: UNION, INTERSECT, and EXCEPT for Complex Data Reconciliation and Deduplication
28 min

SQL⚡ Practitioner
Temporal Data Mastery: Writing Queries for Time-Series, Date Ranges, and Slowly Changing Dimensions
21 min

SQL🌱 Foundation
Subqueries and Correlated Subqueries: Writing Queries Within Queries
15 min