Back to questions

User's Third Transaction Uber SQL Interview Question

User's Third Transaction

Uber SQL Interview Question

This is the same question as problem #11 in the SQL Chapter of Ace the Data Science Interview!

Assume you are given the table below on Uber transactions made by users. Write a query to obtain the third transaction of every user. Output the user id, spend and transaction date.

Table:

Column Name	Type
user_id	integer
spend	decimal
transaction_date	timestamp

Example Input:

user_id	spend	transaction_date
111	100.50	01/08/2022 12:00:00
111	55.00	01/10/2022 12:00:00
121	36.00	01/18/2022 12:00:00
145	24.99	01/26/2022 12:00:00
111	89.60	02/05/2022 12:00:00

Example Output:

user_id	spend	transaction_date
111	89.60	02/05/2022 12:00:00

The dataset you are querying against may have different input & output - this is just an example!

p.s. for more Uber SQL interview tips & problems, check out the Uber SQL Interview Guide

Solution

First, we obtain the order of transaction numbers for each user. We can do this by using the window function where we the all transactions by and the transaction_date.

Here's how the first 5 rows of output looks like:

user_id	spend	transaction_date	row_num
111	100.50	01/08/2022 12:00:00	1
111	55.00	01/10/2022 12:00:00	2
111	89.60	02/05/2022 12:00:00	3
121	36.00	01/18/2022 12:00:00	1
121	22.20	04/01/2022 12:00:00	2

From there on, we can simply convert the query into a subquery and filter for the users' third transaction which is their third transaction sorted by the transaction date (and is denoted as ).

Results:

user_id	spend	transaction_date
111	89.60	02/05/2022 12:00:00
121	67.90	04/03/2022 12:00:00

Apart from using subquery to solve this question, you can also use a CTE. Do you know the differences between a subquery and a CTE?

A CTE is a temporary data set to be used as part of a query and it exists during the entire query session. A subquery is a nested query. It’s a query within a query and unlike CTE, it can be used within that query only. Read here and here for more understanding.

Both methods give the same output and perform fairly similarly. Differences are CTE is reusable during the entire session and more readable, whereas subquery can be used in FROM and WHERE clauses and can act as a column with a single value. We share more resources here (1, 2, 3) on their use cases.

Solution #1: Using Subquery

Solution #2: Using CTE

Sourced from

Uber

Difficulty

Medium

Input

Output