Back to questions
Assume you're given tables with information about TikTok user sign-ups and confirmations through email and text. New users on TikTok sign up using their email addresses, and upon sign-up, each user receives a text message confirmation to activate their account.
Write a query to display the user IDs of those who did not confirm their sign-up on the first day, but confirmed on the second day.
Definition:
Column Name | Type |
---|---|
email_id | integer |
user_id | integer |
signup_date | datetime |
email_id | user_id | signup_date |
---|---|---|
125 | 7771 | 06/14/2022 00:00:00 |
433 | 1052 | 07/09/2022 00:00:00 |
Column Name | Type |
---|---|
text_id | integer |
email_id | integer |
signup_action | string ('Confirmed', 'Not confirmed') |
action_date | datetime |
text_id | email_id | signup_action | action_date |
---|---|---|---|
6878 | 125 | Confirmed | 06/14/2022 00:00:00 |
6997 | 433 | Not Confirmed | 07/09/2022 00:00:00 |
7000 | 433 | Confirmed | 07/10/2022 00:00:00 |
user_id |
---|
1052 |
Only User 1052 confirmed their sign-up on the second day.
The dataset you are querying against may have different input & output - this is just an example!
Credits to Deepak K @ Linkedin for sharing this easy and concise solution that fulfills two conditions:
To begin, we join the and tables on the matching field. Feel free to skip this step if you wish as our intention is to clarify the definition of condition no. 1 for you.
Output with selected rows:
email_id | user_id | signup_date | text_id | email_id | signup_action | action_date |
---|---|---|---|---|---|---|
433 | 1052 | 07/09/2022 00:00:00 | 6997 | 433 | Not confirmed | 07/09/2022 00:00:00 |
433 | 1052 | 07/09/2022 00:00:00 | 7000 | 433 | Confirmed | 07/10/2022 00:00:00 |
236 | 6950 | 07/01/2022 00:00:00 | 9841 | 236 | Confirmed | 07/01/2022 00:00:00 |
450 | 8963 | 08/02/2022 00:00:00 | 6800 | 450 | Not confirmed | 08/03/2022 00:00:00 |
555 | 8963 | 08/09/2022 00:00:00 | 1255 | 555 | Not confirmed | 08/09/2022 00:00:00 |
555 | 8963 | 08/09/2022 00:00:00 | 2660 | 555 | Not confirmed | 08/10/2022 00:00:00 |
555 | 8963 | 08/09/2022 00:00:00 | 2800 | 555 | Confirmed | 08/11/2022 00:00:00 |
Next, we interpret the output together:
Now that you understand how to fulfill these conditions, let's incorporate them into the solution.
Condition #1: Users who confirmed on the second day
The condition in the clause means we only want users who confirmed on the second day after their signup, as reflected in the field. We achieve this by taking and adding an interval of 1 day. Refer to this tutorial for a refresher on .
email_id | user_id | signup_date | text_id | signup_action | action_date |
---|---|---|---|---|---|
433 | 1052 | 07/09/2022 00:00:00 | 7000 | Confirmed | 07/10/2022 00:00:00 |
450 | 8963 | 08/02/2022 00:00:00 | 6800 | Not confirmed | 08/03/2022 00:00:00 |
555 | 8963 | 08/09/2022 00:00:00 | 2660 | Not confirmed | 08/10/2022 00:00:00 |
741 | 1235 | 07/25/2022 00:00:00 | 1568 | Confirmed | 07/26/2022 00:00:00 |
As you can see, the is 1 day after the , fulfilling the first condition. Now let's move on to the second condition.
Condition #2: The texts received must say 'Confirmed'
In addition to the first condition, we add the condition ' in the WHERE clause to ensure that the texts received must say 'Confirmed'.
email_id | user_id | signup_date | text_id | signup_action | action_date |
---|---|---|---|---|---|
433 | 1052 | 07/09/2022 00:00:00 | 7000 | Confirmed | 07/10/2022 00:00:00 |
741 | 1235 | 07/25/2022 00:00:00 | 1568 | Confirmed | 07/26/2022 00:00:00 |