In the previous lesson, you learned how to use the GROUP BY clause to aggregate stats from the FAANG stocks dataset, like find the average price for each FAANG stock.
But suppose it's not enough to know the aggregated stats for each FAANG stock. Instead, say we wanted to display just the stocks whose average open price was greater than $200 per share.
In your head, you may think that the clause would help. Sadly, in this scenario, you'll get hit with the error message "aggregate functions are not allowed in WHERE".
But, have no fear, the HAVING clause can come to our rescue! In SQL, allows you to filter data based on values from aggregate functions.
Here's an example query that finds all FAANG stocks with an average share open price of more than $200:
The above query would yield the following result:
The difference between WHERE vs. HAVING is a common conceptual SQL interview question, so we figured we'd cover it a bit more explicitly: WHERE filters on values in individual rows, versus HAVING filters values aggregated from groups of rows.
Here's a summary table on the difference between WHERE & HAVING:
|When It Filters||Values BEFORE Grouping||Values AFTER Grouping|
|Operates On Data From||Individual Rows||Aggregated Values from Groups of Rows|
|Example||SELECT username, followers FROM instagram_data WHERE followers > 1000;||SELECT country FROM instagram_data GROUP BY country HAVING AVG(followers) > 100;|
Try using the clause in a SQL query to output only the FAANG stocks whose minimum open share price is greater than $100. Your output should look something like this:
In SQL, can be used to filter aggregated data using multiple conditions, just in the same way can filter data with multiple conditions.
Here's an example that finds all FAANG stocks where the average open price is greater than $200 per share, and the minimum price the stock opened at is greater than 100 per share:
The above multi-column query will yield the following result:
Given a table of candidates and their technical skills, write a SQL query that uses to list only the candidate IDs of candidates who have more than 2 technical skills.
Here's the sample input data:
For the above sample data, we'd only output candidate_id 123 because that's the only person with more than two technical skills (candidate 123 knows Python, Tableau, and PostgreSQL).
Make sure you solved the above exercise correctly, because it's an easier version of a real LinkedIn SQL interview question that we'll tackle together in the next section.
To challenge yourself with , try solving this real LinkedIn SQL interview problem asked in a real LinkedIn Product Data Science job interview.
Hint: to solve this, besides using the and commands, you'll also want to use the SQL operator to filter down the input candidates dataset to just those candidates that possess Python, Tableau, PostgreSQL skills.
It's technically possible to have without in SQL, and in this case operates the same as . However, practically speaking, it's super weird to use in this context. For the most part, you should just use a clause if trying to filter on non-aggregated values.
It's CRUCIAL to write clauses in the correct order, otherwise, your SQL query won't run!
Here's the order for the commands you've learned so far:
More details can be found in our lesson on the SQL Order of Execution.
I know you've been HAVING fun learning about , but I have a strong, DISTINCT feeling you'll enjoy the next SQL tutorial on DISTINCT.
SQL DISTINCT ⭐