Use Sophia to knock out your gen-ed requirements quickly and affordably. Learn more
×

GROUP BY to Combine Data

Author: Sophia

what's covered
This tutorial explores the GROUP BY clause in a SELECT statement to divide rows into groups in three parts:
  1. Using the GROUP BY Clause
  2. Examples
  3. Multiple GROUP BY Quantities

1. Using the GROUP BY Clause

The GROUP BY clause in the SELECT statement can help to divide the rows that are being returned from the SELECT statement into specific groups. We can then apply an aggregate function to each group. So far, we have worked with aggregate functions that return a single aggregate value across the result set. Using the GROUP BY clause, we can calculate aggregates within a group of rows.

The structure of a statement that uses the GROUP BY clause looks like this:


SELECT <column1>…, <aggregate function>
FROM <tablename>
GROUP BY <column1>…;

We have the columns that we want to group in both the SELECT clause and the GROUP BY clause.

2. Examples

For example, if we wanted to know the number of customers that we have in each country, we can use the GROUP BY clause like this:


SELECT country, COUNT(*) 
FROM customer
GROUP BY country;

GROUP BY Clause Example

It is important that the column is listed in both the SELECT and GROUP BY clause. If we do not include it in the GROUP BY clause:


SELECT country, COUNT(*) 
FROM customer;

We will get an error:

Query Failure Message

If we did not have the column in the SELECT clause:


SELECT COUNT(*) 
FROM customer
GROUP BY country;

There will not be an error. But the result set isn’t useful at all, because it doesn’t indicate what the aggregate function is in relation to:

GROUP BY Clause Example

As you see in the result set above, even though we grouped based on country, we do not know which country each of the counts is for.

We can run a similar query to find the total amount of orders by country from the invoice table:


SELECT billing_country, SUM(total) 
FROM invoice
GROUP BY billing_country;

Total Amount of Orders Query Using the GROUP BY Clause

3. Multiple GROUP BY Quantities

We can also add more complexity by grouping by multiple columns and returning more than one aggregate value. For example, if we wanted to find the total and average of each invoice based on the state and country, we can do the following:


SELECT billing_country, billing_state, SUM(total), ROUND(AVG(total),2) 
FROM invoice
GROUP BY billing_country, billing_state
ORDER BY billing_country, billing_state;

Notice above, we have added the ORDER BY clause so that the results are sorted in an order that makes logical sense:

Adding ORDER BY to Query Example

The possibilities are endless for us to group the data and run aggregate functions on those subsets of data.

Video Transcript

try it
Your turn! Open the SQL tool by clicking on the LAUNCH DATABASE button below. Then enter in one of the examples above and see how it works. Next, try your own choices for which columns you want the query to provide.

summary
The GROUP BY clause allows us to divide rows into groups and then apply aggregate functions to each of the individual groups.

Source: Authored by Vincent Tran