Most of the queries that we have been working on so far have been focused on one row at a time and looking at individual records. However, the database works using sets of data rather than individual records. There are a lot of questions that can be answered with a database by aggregating the data together as if it were a single unit. This type of processing is done using aggregate functions. Although different databases may have some specialized aggregate functions, there are many common ones that will be available in most databases. These aggregate functions take a collection of rows and reduce the resulting set to one row.
SQL can perform various mathematical summaries for us within the database to help answer questions. For example, how many invoices were there in January? Of course, we could count as we have done in the past, but sometimes counting is not viable to do. One might want to know what was the smallest invoice total in the prior year or the maximum invoice total of all time? Or perhaps the average quantity ordered across all invoices.
Aggregate functions can help us answer these questions. Here is a list of the most common aggregate functions. We will get into more specifics on each in other tutorials.
The AVG function calculates the average of non-null values.
As an example, if we were looking for the average of the total items purchased across all invoices, we could run:
The COUNT function returns the number of rows in a group including those that have NULL values.
If we wanted to find out how many customers lived in the country USA, we would run:
WHERE country = 'USA';
The MAX function returns the largest of the non-null values.
If we wanted to find the largest order made between January 1, 2009 and January 1, 2010, we would run:
WHERE invoice_date between '2009-01-01' AND '2010-01-01';
The MIN function returns the smallest of the non-null values.
If we wanted to find the smallest total prior to January 1, 2011, we would run:
WHERE invoice_date < '2011-01-01';
The SUM function returns the summation of all of the non-null values.<be></be>
If we wanted to find out the number tracks ordered for all time, we would run:
In PostgreSQL, there are other unique aggregate functions that can be used including:
|BOOL_AND||This returns true if all of the input values are true, otherwise, it returns false.|
|BOOL_OR||This returns true if one of the input values is true, otherwise, it returns false.|
|STDDEV||This function returns the standard deviation based on the non-null values.|
|VARIANCE||This function returns the variance of the non-null values.|
|RANK||This function returns the rank of the row based on the value. If a row has the same value as the prior row, it will return the same rank.|
There are many others that can be used, but this will be a good starting point to build from.