Back to questions
This is the same question as problem #1 in the SQL Chapter of Ace the Data Science Interview!
Assume you have an events table on Facebook app analytics. Write a query to calculate the click-through rate (CTR) for the app in 2022 and round the results to 2 decimal places.
Definition and note:
Column Name | Type |
---|---|
app_id | integer |
event_type | string |
timestamp | datetime |
app_id | event_type | timestamp |
---|---|---|
123 | impression | 07/18/2022 11:36:12 |
123 | impression | 07/18/2022 11:37:12 |
123 | click | 07/18/2022 11:37:42 |
234 | impression | 07/18/2022 14:15:12 |
234 | click | 07/18/2022 14:16:12 |
app_id | ctr |
---|---|
123 | 50.00 |
234 | 100.00 |
Let's consider an example of App 123. This app has a click-through rate (CTR) of 50.00% because out of the 2 impressions it received, it got 1 click.
To calculate the CTR, we divide the number of clicks by the number of impressions, and then multiply the result by 100.0 to express it as a percentage. In this case, 1 divided by 2 equals 0.5, and when multiplied by 100.0, it becomes 50.00%. So, the CTR of App 123 is 50.00%.
The dataset you are querying against may have different input & output - this is just an example!
First, we filter for analytics events from the year 2022 using the clause with appropriate comparison operators:
Next, find the number of clicks and impressions using the statement to assign a value of 1 for 'click' events and 0 for other events:
Here's the first 5 rows of output:
app_id | clicks | impressions |
---|---|---|
123 | 0 | 1 |
123 | 0 | 1 |
123 | 1 | 0 |
234 | 0 | 1 |
234 | 1 | 0 |
Then, we add up the clicks and impressions by wrapping the statements with a aggregate function and group the results by .
app_id | clicks | impressions |
---|---|---|
123 | 2 | 3 |
234 | 1 | 3 |
Finally, calculate the percentage of click-through rate (CTR) by dividing the number of clicks by the number of impressions and multiplying by 100.0, rounded to 2 decimal places using the function.
Percentage of click-through rate = 100.0 * Number of clicks / Number of impressions
Solution #2: Using COUNT(CASE ...)
Solution #3: Using SUM() FILTER ()