SQL GROUP BY Tutorial With Examples & Practice Exercises

The SQL aggregate functions like , , and from the previous lesson process an entire column to come up with an answer. But what if you want to aggregate only part of a table?

For example, instead of finding the average price all Amazon products orders, what if you want to find the average spent by category (i.e. average spent on appliances vs. electronics?).

For Excel users, the"pivot table" comes to mind, and the SQL equivalent to that concept is the command which tells the database to separate your data into groups, which can then be aggregated independently.

Here's a example that groups Amazon product sales by category:

The output of that query yields this result:

category	sum
electronics	1007.54
appliance	1135.22

Try playing around with yourself by copy-pasting the earlier code and running it against the data in this Amazon SQL interview question:

GROUP BY vs. Excel Pivot Table

Both in SQL and Pivot Tables in Excel are tools designed to aggregate and summarize data. To demonstrate their similarities, let's first create a Pivot Table for a data set of FAANG stocks.

This pivot table finds the average open price for each stock, and the max close price for each stock: Pivot Table in Google Sheets Stock Price Example

You can notice that the "rows" dialog box in the spreadsheet tool is essentially asking you what column you want to group by ("ticker"), and the values box is asking you what column values you want to display ("open", "close"), and how they'll be aggregated (average, max).

This is the equivalent to the SQL query:

Notice how the SQL query output matches up to the pivot table!

GROUP BY Two Columns

You can GROUP BY two columns, even multiple columns, in SQL. To group by multiple categories, just separate column names with commas (just like the syntax in the !).

For an example of grouping by multiple columns, here's the average stock open price grouped by each ticker symbol and group by each year:

Here's a sample of that output:

ticker	year	avg_open
NFLX	2023	364.91
META	2023	220.35
AMZN	2023	109.27
MSFT	2023	291.96
AAPL	2023	171.21
GOOG	2023	108.10
MSFT	2022	276.78
AAPL	2022	151.56
META	2022	193.06
...	...	...

And again, just like with syntax for multiple columns, for shorthand, you can use numbers instead of typing out the full column names, to make your query more concise. Here's the exact same query in the format:

GROUP BY Practice Exercise #1

Given FAANG stock prices data, can you write a SQL query which uses GROUP BY to find the lowest price each stock ever opened at? Order your results by price, in descending order. Your output should look like this:

ticker	min
NFLX	176.49
MSFT	153.00
META	94.33
AMZN	85.46
AAPL	61.63
GOOG	56.10

Click the below to practice this GROUP BY exercise question in the DataLemur interactive SQL editor:

Can GROUP BY be used to remove duplicates?

While isn't exactly used for the purpose of finding duplicates, the command does allow you to collapse multiple rows with the same values into a single row. So, in a way, it could be considered as a way to get rid of duplicates for certain columns.

To demonstrate this, imagine you were analyzing this website traffic data from a Google Analytics report:

browser	visits
Chrome	6
Safari	2
Safari	4
Edge	3
Safari	5
Chrome	3
Chrome	4

To get rid of duplicate browser data, and aggregate the visits information, you could use the following GROUP BY query:

This would eliminate the duplicate browser information, and output the following result:

browser	visits
Chrome	13
Safari	11
Edge	3

GROUP BY Practice Exercise #2

Suppose you are given a table of Data Science candidates, and their technical skills:

Sample Input:

candidate_id	skill
123	Python
234	R
234	Python
234	SQL Server
345	Python
...	...

How many candidates possess each of the different skills?

Sort your answers based on the count of candidates, from highest to lowest. Here's what the expected output looks like:

skill	count
Python	3
R	1
SQL Server	1

Click below to solve this GROUP BY practice question in the DataLemur interactive SQL editor:

What's the difference between GROUP BY and ORDER BY?

People sometimes get confused between and because both commands have in them, but their actual function is quite different!

ORDER BY helps you output your rows in a specific order, such as alphabetically on some text column, or from smallest to biggest, for some text column.

, as you saw earlier, is all about grouping your data into categories! Because they are quite different commands, it's absolutely possible for a query to have both GROUP BY and ORDER BY, with GROUP BY coming first!

Here's a query that uses both GROUP BY and ORDER BY:

It takes the FAANG stock data, and finds the average stock open price for each ticker symbol. Then, the clause orders the average stock open price, from highest to lowest: GROUP BY vs. ORDER BY SQL Query Example

Next Lesson: GROUP BY with HAVING

Your now ready for the next lesson on HAVING which is 99.9% of the time used in conjunction with the GROUP BY clause. Curious about this dynamic duo?

Click on over to the next tutorial!

Next Lesson

SQL HAVING 🤏

SQL Tutorial