This is the same question as problem #9 in the SQL Chapter of Ace the Data Science Interview!
Assume you are given the table below on user transactions. Write a query to obtain the list of customers whose first transaction was valued at $50 or more. Output the number of users.
Explanation: Only user 156 has a first transaction valued over $50.
To get the ordering of the customer purchases, we can use the window function to get the ordering of customer purchases.
Then, we can use the common table expression (CTE) or subquery to filter customers whose first purchase (denoted as ) and is valued at 50 dollars or more (denoted as ). Note that this would require the CTE or subquery to include spend also.
CTE vs. Subquery
A CTE is a temporary data set to be used as part of a query and it exists during the entire query session. A subquery is a nested query. It’s a query within a query and unlike CTE, it can be used within that query only. Read here and here for more understanding.
Both methods give the same output and perform fairly similarly. Differences are CTE is reusable during the entire session and more readable, whereas subquery can be used in FROM and WHERE clauses and can act as a column with a single value. We share more resources here (1, 2, 3 on their use cases.
Below are two methods of solving the question using CTE and subquery.
Solution 1: Using CTE
Solution 2: Using subquery