2nd Ride Delay
As a data analyst at Uber, it's your job to report the latest metrics for specific groups of Uber users. Some riders create their Uber account on the same day that they book their first ride; the rider engagement team calls them "in-the-moment" users.
Uber wants to know the average delay between the day these users signed up and the day of their 2nd ride. Write a query to pull the average 2nd ride delay for "in-the-moment" Uber users.
Tip You don't need to use date operations to get the answer! You're welcome to, but it's not necessary.
Step 1 Before we calculate any numbers, let's first identify users who can be classified as "in-the-moment"– These are the users who create their account on the same day as their first ride. We can join the table with on the common identifier ; we'll only pull results where the registration date and ride date are equal.
Since there might be users who had more than 1 ride on the first date, simply use to obtain unique users. In addition, wrap the query into a CTE, so we can reuse the output later.
The query will look something like this:
Step 2 The question asks for the average delay between the 1st and 2nd ride. That means we need to identify a trip number for each user's rides. To do so, let's create a new column with that enumerates trips per each user.
Now, we need to set this up so that we can compare each ride date to the previous ride date using row-wise subtraction. To accomplish this, we'll use the function to pull the date of the previous trip in a new column.
You can create a lagged ride date column by using the function on ride date column while partitioning by users:
|1||08/16/2022 00:00:00||2||08/15/2022 00:00:00|
|4||05/17/2022 00:00:00||2||05/16/2022 00:00:00|
|4||05/18/2022 00:00:00||3||05/17/2022 00:00:00|
Step 3 We're almost there – all that's left to do is the row-wise subtraction and the average calculation! Simply subtract the from to obtain the delay for each ride. Then wrap this subtraction in an function to obtain average of all ride delays.
Keep in mind that the task asked for the 2nd ride delay, so make sure to filter on results where the is 2. Also, don't forget to round the result to 2 decimals ;)