Repeat Purchases on Multiple Days
This is the same question as problem #7 in the SQL Chapter of Ace the Data Science Interview!
Assume you are given the table below containing information on user purchases. Write a query to obtain the number of users who purchased the same product on two or more different days. Output the number of unique users.
We can’t simply perform a count since, by definition, the purchases must have been made on different days (and for the same products).
To address this issue, we use the window function while partitioning by user id and product id and then sort the result by the field (not time), using the function to convert the to a type. We are sorting by date and not timestamp so that we do not count multiple purchases of the same product on the same day.
From this inner subquery, we then obtain the distinct count of the user ids for which purchase number is 2 (denoted as ). Note that we don’t need count above 2 (denoted as ) since any purchase number above 2 denotes multiple products.
Solution #1: Using Subquery
Solution #2: Using CTE
Solution #3: Using Self-Join