User Session Activity
This is the same question as problem #24 in the SQL Chapter of Ace the Data Science Interview!
Assume you are given the table below containing information on user session activity. Write a query that ranks users according to their total session durations (in minutes) by descending order for each session type between the start date (2022-01-01) and end date (2022-02-01). Output the user id, session type, and the ranking of the total session duration.
|session_type||string ("like", "reply", "retweet")|
|duration||integer (in minutes)|
Explanation: User 333 is listed on the top due to the highest duration of 15 minutes. The ranking resets on 3rd row as the session type changes.
First, we can perform a CTE or subquery to obtain the total session duration by user and session type between the start and end dates. Then, we can use RANK to obtain the rank, making sure to partition by session type and then order by duration as in the query below:
Below are two methods of solving the question using CTE and subquery.
CTE is a temporary data set to be used as part of a query and it exists during the entire query session. A subquery is a nested query. It’s a query within a query and unlike CTE, it can be used within that query only.
Both methods give the same output and perform fairly similarly. Differences are CTE is reusable during the entire session and more readable, whereas subquery can be used in FROM and WHERE clauses and can act as a column with a single value.
Solution 1: Using CTE
Solution 2: Using Subquery