At Chewy, SQL is used quite frequently for analyzing pet-related sales trends and optimizing inventory management based on online orders. Because of this, Chewy LOVES to ask SQL query questions during interviews for Data Science and Analytics jobs at Chewy.
As such, to help you ace the Chewy SQL interview, this blog covers 8 Chewy SQL interview questions – how many can you solve?
Provided we have two tables - which represents all orders placed at Chewy and which has information about users. Power users are defined as the users who have placed more than 10 orders and have spent over $500 in the past month.
order_id | user_id | order_date | total_amount |
---|---|---|---|
6485 | 101 | 08/10/2022 | 40 |
8927 | 202 | 08/15/2022 | 100 |
7389 | 101 | 08/16/2022 | 60 |
5903 | 303 | 08/18/2022 | 25 |
7392 | 202 | 08/20/2022 | 100 |
7839 | 101 | 08/21/2022 | 45 |
7648 | 303 | 08/23/2022 | 30 |
9012 | 101 | 08/25/2022 | 50 |
8120 | 202 | 08/26/2022 | 120 |
9480 | 101 | 08/27/2022 | 50 |
9901 | 101 | 08/29/2022 | 60 |
4720 | 303 | 08/31/2022 | 25 |
user_id | user_name | registration_date |
---|---|---|
101 | John | 01/05/2018 |
202 | Lisa | 20/07/2020 |
303 | Bill | 15/03/2021 |
We want to find the power users at Chewy, therefore we want to identify users who have placed more than 10 orders and have spent over $500 in the past month.
In this query, we join the and table on . We then filter for orders placed in the last month. We group by and so we can get counts and sums per user. Lastly using the HAVING clause, we filter for users with more than 10 orders and total spent greater than $500, defining our power users.
To solve a related super-user data analysis question on DataLemur's free interactive coding environment, try this recently asked Microsoft SQL interview question:
You are a data analyst at Chewy, an American online retailer of pet food and other pet-related products. You have access to a table that logs every review submitted to the website. Each row in this table represents a unique review and has the following columns:
Your task is to write a SQL query that computes the average star rating for each product per month. The result should include the month in which the reviews were submitted, the product_id, and the average star rating for that product in that month.
We follow a convention that months are represented as numbers: 1 for January, 2 for February, etc.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 | 50001 | 4 |
7802 | 265 | 2022-06-10 | 69852 | 4 |
5293 | 362 | 2022-06-18 | 50001 | 3 |
6352 | 192 | 2022-07-26 | 69852 | 3 |
4517 | 981 | 2022-07-05 | 69852 | 2 |
mth | product | avg_stars |
---|---|---|
6 | 50001 | 3.50 |
6 | 69852 | 4.00 |
7 | 69852 | 2.50 |
In this query, the function is used to get the month part of the date of each review. Then, for each product in each month, we compute the average of the stars awarded in the reviews using the function. The results are displayed in ascending order of month and product_id.
To practice a related window function SQL problem on DataLemur's free online SQL code editor, try this Google SQL Interview Question:
Think of SQL constraints like the rules of a game. Just like a game needs rules to keep things fair and fun, a database needs constraints to keep things organized and accurate.
There are several types of SQL constraints like:
NOT NULL: This constraint is like a bouncer at a nightclub - it won't let anything NULL through the door. UNIQUE: This constraint is like a VIP list - only special, one-of-a-kind values get in. PRIMARY KEY: This constraint is like an elected official - it's made up of NOT NULL and UNIQUE values and helps identify each row in the table. FOREIGN KEY: This constraint is like a diplomatic ambassador - it helps establish relationships between tables. CHECK: This constraint is like a referee - it makes sure everything follows the rules. DEFAULT: This constraint is like a backup plan - it provides a default value if no other value is specified.
So, whether you're playing a game or organizing a database, constraints are an important part of the process!
The company Chewy, a pet supply e-commerce business, wants to calculate the average monthly spend per customer. They have a table storing all transactions made by each customer and a table storing all product prices. The table includes columns for , , , , and . The table includes columns for , , and .
Design a SQL query that calculates the average monthly spend per customer. Assume all prices are in USD.
transaction_id | customer_id | product_id | transaction_date | quantity |
---|---|---|---|---|
1 | 100 | 300 | 01/05/2022 | 2 |
2 | 200 | 200 | 02/10/2022 | 1 |
3 | 100 | 100 | 03/15/2022 | 1 |
4 | 300 | 100 | 04/20/2022 | 2 |
5 | 200 | 400 | 05/25/2022 | 1 |
product_id | product_name | product_price |
---|---|---|
100 | Dog Food | $20.00 |
200 | Cat Food | $15.00 |
300 | Bird Seed | $5.00 |
400 | Fish Food | $10.00 |
This query first joins the and tables on the column. It then groups the data by and the month of the , and calculates the average spend in each group by multiplying the of each transaction with the corresponding .
An inner join only includes rows from both tables that satisfy the join condition, whereas a full outer join includes all rows from both tables, regardless of whether they match the join condition or not.
For example, suppose you had a table of Chewy orders and Chewy customers.
INNER JOIN: When there is a match in the shared key or keys, rows from both tables are retrieved. In this example, an between the Orders and Customers tables would retrieve rows where the in the Orders table matches the in the Customers table.
FULL OUTER JOIN: A retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.
As a data analyst at Chewy, a leading pet supply eCommerce business, your task is to analyze the reviews given by the customers for each product every month. Write a SQL query that will help you determine the average rating of each product for each month.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2021-06-08 | 50001 | 4 |
7802 | 265 | 2021-06-10 | 69852 | 4 |
5293 | 362 | 2021-06-18 | 50001 | 3 |
6352 | 192 | 2021-07-26 | 69852 | 3 |
4517 | 981 | 2021-07-05 | 69852 | 2 |
The SQL query above uses the GROUP BY clause along with the AVG aggregate function to get the average rating of each product per month. The EXTRACT function is used to get the month from the submit_date column. The results are then grouped by the month and the product_id. The result is a list of months, product_ids, and their respective average ratings, ordered by month and product_id.
A non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query. On the other hand, a correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query.
Here is an example of a non-correlated sub-query:
The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.
Here is an example of a correlated sub-query:
This query selects the and total sales of all Chewy customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().
Non-correlated sub-queries are typically faster, as they only have to be executed once, wheareas correlated sub-queries are slower since they have to be re-evaluated for each row of the outer query.
Given a list of reviews for products, and a list of products with their weight, write a SQL query that can calculate the average rating of each product, and adjust it by taking the square root of the product weight and rounding it to the nearest whole integer.
For the sake of this problem, let's assume that higher weight indicates higher quality, so the weight-adjusted rating is calculated as such: .
Consider the following data:
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
1 | 123 | 2022-05-01 | 101 | 4 |
2 | 124 | 2022-05-02 | 102 | 3 |
3 | 125 | 2022-05-02 | 102 | 5 |
4 | 126 | 2022-05-03 | 103 | 2 |
5 | 127 | 2022-05-03 | 103 | 3 |
product_id | weight |
---|---|
101 | 4 |
102 | 9 |
103 | 16 |
Here is the PostgreSQL query that would answer this question:
This query first performs an inner join on the and tables using the . Then, it calculates the average rating per and adds the square root of the product's weight (rounded to the nearest whole integer). The result is grouped by product ID and weight. This ensures that each product’s average rating is calculated separately, then adjusted by its appropriate weight factor.
To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for averaging product ratings or this Alibaba Compressed Mean Question which is similar for rounding calculations.
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. In addition to solving the earlier Chewy SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Google, Microsoft and Silicon Valley startups.
Each problem on DataLemur has hints to guide you, full answers and crucially, there is an online SQL code editor so you can instantly run your query and have it checked.
To prep for the Chewy SQL interview you can also be helpful to practice interview questions from other tech companies like:
In case your SQL coding skills are weak, forget about jumping right into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers topics including grouping by multiple columns and rank window functions – both of these come up routinely during Chewy SQL interviews.
In addition to SQL query questions, the other topics to prepare for the Chewy Data Science Interview are:
To prepare for Chewy Data Science interviews read the book Ace the Data Science Interview because it's got: