Back to questions
Write an SQL query to find the best-selling product in each product category. If there are two or more products with the same sales quantity, go by whichever product which has the higher review rating.
Return the category name and product name in alphabetical order of the category.
Column Name | Type |
---|---|
product_id | integer |
product_name | varchar |
category_name | varchar |
product_id | product_name | category_name |
---|---|---|
3690 | Game of Thrones | Books |
5520 | Refrigerator | Home Appliances |
5952 | Dishwasher | Home Appliances |
3561 | IKGAI | Books |
Column Name | Type |
---|---|
product_id | integer |
sales_quantity | integer |
rating | decimal (1.0 - 5.0) |
product_id | sales_quantity | rating |
---|---|---|
3690 | 300 | 4.9 |
5520 | 70 | 3.8 |
5952 | 70 | 4.0 |
3561 | 290 | 4.5 |
category_name | product_name |
---|---|
Books | Game of Thrones |
Home Appliances | Dishwasher |
The dataset you are querying against may have different input & output - this is just an example!
First, we join the products table on the corresponding product id with the product_sales table to operate on category name, product name, rating and sales quantity fields together.
The best-selling product is the product with the highest quantity of units sold. So, we rank the products in each category by their sales quantity. This can be achieved by the usage of window function.
Read here and here to learn more about and window function, respectively.
You will get the (3 rows in random order) of output below:
product_rank | category_name | product_name | sales_quantity | rating |
---|---|---|---|---|
1 | Home Appliances | Refrigerator | 70 | 3.8 |
1 | Home Appliances | Dishwasher | 70 | 4.0 |
2 | Home Appliances | Washing Machine | 30 | 3.4 |
If you look at the output, you will notice products Refrigerator and Dishwasher are both ranked first in Home Appliances category. It's because the same quantity of units (70 units) for both products were sold.
In such cases, we must obtain the product with the higher rating, according to the question guidelines. It can be done by adding a rating column in the clause and specify to get the higher rating first.
The output will look like this, and you will be able to see what we just talked about.
product_rank | category_name | product_name | sales_quantity | rating |
---|---|---|---|---|
1 | Home Appliances | Dishwasher | 70 | 4.0 |
2 | Home Appliances | Refrigerator | 70 | 3.8 |
3 | Home Appliances | Washing Machine | 30 | 3.4 |
Then, we will convert the query into a subquery and get the products with first ranking in the clause. At the end, add an clause to sort the result based on category name in alphabetical order.
Results:
category_name | product_name |
---|---|
Amazon Kindle | Kindle Oasis |
Books | Ace the Data Science Interview |
Home Appliances | Dishwasher |
Tech Gadgets | Gaming Keyboard |
Solution #1: Using Subquery
Solution #2: Using CTE