Using the HAVING clause to filter aggregate columns

Published: {ts '2014-09-02 00:00:00'}
Author: Steven Neiland
Site Url: http://www.neiland.net/article/using-the-having-clause-to-filter-aggregate-columns/

Sometimes I am amazed at how little some developers know when it comes to writing SQL. With the advent of ORM I suspect the problem is only going to get worse but I digress.

Take for example this problem I came across in a real application recently:
(I have simplified the code for clarity)

Problem: Aggregate and filter by sales totals

Example: List all departments along with their sales totals that have done more than $1000 in sales excluding the software development department.

In looking at the source code I saw that the developer had tried all the classic approaches.

Attempt 1: Fail

Obviously this does not work


SELECT 
	departmentName
	, SUM(amount) AS totalSales
FROM 
	sales
WHERE
	departmentName != 'development'
	AND totalSales > 1000
GROUP BY 
	departmentName

Attempt 2: Fail

Likewise this did not work


SELECT 
	departmentName
	, SUM(amount) AS totalSales
FROM 
	sales
WHERE
	departmentName != 'development'
	AND SUM(amount) > 1000
GROUP BY 
	departmentName

Attempt 3: Success

Finally they hit on this solution that worked and left it at that.


SELECT
	x1.*
FROM (
	SELECT 
		departmentName
		, SUM(amount) AS totalSales
	FROM 
		sales
	WHERE
		departmentName != 'development'
	GROUP BY 
		departmentName
) AS x1
WHERE
	x1.totalSales > 1000

While this does indeed work I really dislike wrapping the SELECT in another one just to filter on an aggregate column. Fortunately there is a better way.

Using HAVING

The better way to achieve this is to use the "HAVING" clause in place of the "WHERE" clause for filtering aggregate columns (You still use the "WHERE" clause to filter non aggregates). Use by placing after the "GROUP BY" clause like this:


SELECT 
	departmentName
	, SUM(amount) AS "Total sales"
FROM 
	sales
WHERE
	departmentName != 'development'
GROUP BY 
	departmentName
HAVING 
	SUM(amount) > 1000

In terms of performance there was no measurable difference between the two on this database but the code is immediately more readable and maintainable.