Back to questions
For this checkpoint, we want to investigate products that previously had low reorders, but currently have high reorders. Feel free to approach this by measuring counts, percentages, or both.
Write a query to find products previously reordered fewer than 10 times, and currently reordered 10 or more times. Alternatively, write a query to find which products had the biggest and smallest percent changes in products ordered.
Don't worry if your submitted answer is marked wrong – this is merely an open-ended exercise!
Here are the schemas for all 5 tables in the Instacart market data.
These files specify which products were purchased in each Instacart order. ic_order_products_prior contains previous order contents for all customers, and ic_order_products_curr contains current orders; the table fields are the same.
The 'reordered' field indicates that the customer has a previous order that contains the product. Other fields should be self-explanatory.
Column Name | Type |
---|---|
order_id | integer |
product_id | integer |
add_to_cart_order | integer |
reordered | integer boolean (1 or 0) |
Notes
Column Name | Type |
---|---|
product_id | integer |
product_name | string |
aisle_id | integer |
department_id | integer |
Column Name | Type |
---|---|
department_id | integer |
department | string |
Column Name | Type |
---|---|
aisle_id | integer |
aisle | string |
This query joins the prior and current orders tables to the product, department, and aisles tables. The crux of the query lies in the SUMs, which allow us to compare product reorders across the prior and current orders tables. We select a lot of other fields so that we can use them to aggregate later as needed.
Finally, in the HAVING clause, we filter on products that were previously reordered fewer than 10 times, and are currently reordered 10 or more times. 10 is a nice round number to start with but a different number (or a measure of percent change) would work for this methodology too.