Back to questions
Given the reviews table, write a query to retrieve the average star rating for each product, grouped by month. The output should display the month as a numerical value, product ID, and average star rating rounded to two decimal places. Sort the output first by month and then by product ID.
P.S. If you've read the Ace the Data Science Interview, and liked it, consider writing us a review?
Column Name | Type |
---|---|
review_id | integer |
user_id | integer |
submit_date | datetime |
product_id | integer |
stars | integer (1-5) |
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 06/08/2022 00:00:00 | 50001 | 4 |
7802 | 265 | 06/10/2022 00:00:00 | 69852 | 4 |
5293 | 362 | 06/18/2022 00:00:00 | 50001 | 3 |
6352 | 192 | 07/26/2022 00:00:00 | 69852 | 3 |
4517 | 981 | 07/05/2022 00:00:00 | 69852 | 2 |
mth | product | avg_stars |
---|---|---|
6 | 50001 | 3.50 |
6 | 69852 | 4.00 |
7 | 69852 | 2.50 |
Product 50001 received two ratings of 4 and 3 in the month of June (6th month), resulting in an average star rating of 3.5.
The dataset you are querying against may have different input & output - this is just an example!
Here's some more Amazon SQL Interview Questions:
As observed, the table does not have a separate column for month. Therefore, we need to extract the month from the column using the function, which returns the month in numerical format.
Refer to this tutorial for more explanation on the function.
To calculate the average star ratings per month for each product, we can use the aggregate function to calculate the mean of the stars column and the function to round the result to two decimal places for accuracy.
The query would be as follows:
In SQL, the order of execution is important to understand. In the given solution's query, the sequence of execution is as follows:
It's important to note that the clause is executed before the statement. Therefore, we cannot use the mth alias in the clause, as the mth column is created after the statement is executed. However, we can use the mth alias in the ORDER BY clause, as it is executed after the statement, and the mth column has been created by then.
Understanding the order of SQL execution is crucial, as it is a common topic in technical interviews. It's recommended to familiarize yourself with the sequence of execution in SQL for better query writing and debugging.