Back to questions

Instacart Reorders – Case Study Checkpoint #2 SQL Interview Question

Instacart Reorders – Case Study Checkpoint #2

SQL Interview Question

For this checkpoint, we want to investigate products that previously had low reorders, but currently have high reorders. Feel free to approach this by measuring counts, percentages, or both.

Write a query to find products previously reordered fewer than 10 times, and currently reordered 10 or more times. Alternatively, write a query to find which products had the biggest and smallest percent changes in products ordered.

Don't worry if your submitted answer is marked wrong – this is merely an open-ended exercise!

Table Schema

Here are the schemas for all 5 tables in the Instacart market data.

ic_order_products_curr, ic_order_products_prior

These files specify which products were purchased in each Instacart order. ic_order_products_prior contains previous order contents for all customers, and ic_order_products_curr contains current orders; the table fields are the same.

The 'reordered' field indicates that the customer has a previous order that contains the product. Other fields should be self-explanatory.

Column Name	Type
order_id	integer
product_id	integer
add_to_cart_order	integer
reordered	integer boolean (1 or 0)

Notes

Some orders will have no reordered items
None of these fields are unique to this table, but the combination of order_id and product is unique

ic_products

Column Name	Type
product_id	integer
product_name	string
aisle_id	integer
department_id	integer

ic_departments

Column Name	Type
department_id	integer
department	string

ic_aisles

Column Name	Type
aisle_id	integer
aisle	string

Data Source

Solution

This query joins the prior and current orders tables to the product, department, and aisles tables. The crux of the query lies in the SUMs, which allow us to compare product reorders across the prior and current orders tables. We select a lot of other fields so that we can use them to aggregate later as needed.

Finally, in the HAVING clause, we filter on products that were previously reordered fewer than 10 times, and are currently reordered 10 or more times. 10 is a nice round number to start with but a different number (or a measure of percent change) would work for this methodology too.

Difficulty

Medium

Input

Output