Back to questions
As a Data Analyst on Snowflake's Marketing Analytics team, your objective is to analyze customer relationship management (CRM) data and identify contacts that satisfy two conditions:
Marketing touches, also known as touch points, represent the interactions or points of contact between a brand and its customers.
Your goal is to generate a list of email addresses for these contacts.
Column Name | Type |
---|---|
event_id | integer |
contact_id | integer |
event_type | string ('webinar', 'conference_registration', 'trial_request') |
event_date | date |
event_id | contact_id | event_type | event_date |
---|---|---|---|
1 | 1 | webinar | 4/17/2022 |
2 | 1 | trial_request | 4/23/2022 |
3 | 1 | whitepaper_download | 4/30/2022 |
4 | 2 | handson_lab | 4/19/2022 |
5 | 2 | trial_request | 4/23/2022 |
6 | 2 | conference_registration | 4/24/2022 |
7 | 3 | whitepaper_download | 4/30/2022 |
8 | 4 | trial_request | 4/30/2022 |
9 | 4 | webinar | 5/14/2022 |
Column Name | Type |
---|---|
contact_id | integer |
string |
contact_id | |
---|---|
1 | andy.markus@att.net |
2 | rajan.bhatt@capitalone.com |
3 | lissa_rogers@jetblue.com |
4 | kevinliu@square.com |
andy.markus@att.net |
Among the contacts, only Contact ID 1 (andy.markus@att.net) satisfies both conditions specified in the problem. Contact ID 1 had a marketing touch with an event type of 'trial_request' and the marketing touch points occurred consecutively over a period of 3 weeks. This meets both conditions of having a marketing touch for three or more consecutive weeks.
On the other hand, Contact ID 2 (rajan.bhatt@capitalone.com) is not included in the generated list. Although they had a marketing touch with the event type 'trial_request', their touch points took place within the same week. Consequently, this does not meet the requirement of consecutive marketing touches for 3 weeks.
The dataset you are querying against may have different input & output - this is just an example!
To analyze the consecutive marketing touch points, we first need to convert the event dates to a week-on-week format.
This can be done using the function in PostgreSQL, which truncates the date to the specified part (in this case, the week). Read more about the function here.
The table below shows the complete data with the truncated field as for contact ID 1.
event_id | contact_id | event_type | event_date | event_week |
---|---|---|---|---|
1 | 1 | webinar | 04/17/2022 00:00:00 | 04/11/2022 00:00:00 |
2 | 1 | trial_request | 04/23/2022 00:00:00 | 04/18/2022 00:00:00 |
3 | 1 | whitepaper_download | 04/30/2022 00:00:00 | 04/25/2022 00:00:00 |
If you're unsure of how the week's date is being assigned, you can visualize it by referring to a calendar for April 2022. Take a look at the 04/17/2022. You'll notice that it belongs to the 04/11/2022, which represents the first day of that week.
To set up the conditions for verifying consecutive weeks, we can use the and window functions. The function retrieves the previous value, while the function retrieves the next value. Read more about the function here and the function here.
In our case, we apply these functions to the column.
The table below shows the results of the query with the additional columns and .
event_id | contact_id | event_type | lag_event_week | event_week | lead_event_week |
---|---|---|---|---|---|
1 | 1 | webinar | 04/11/2022 00:00:00 | 04/18/2022 00:00:00 | |
2 | 1 | trial_request | 04/11/2022 00:00:00 | 04/18/2022 00:00:00 | 04/25/2022 00:00:00 |
3 | 1 | whitepaper_download | 04/18/2022 00:00:00 | 04/25/2022 00:00:00 |
We understand that the output might appear complex, but don't worry, we'll simplify it for you.
Let's break down the interpretation of the output:
The key thing to remember is that the LAG function retrieves the previous value, while the LEAD function retrieves the next value, aligning with their respective namesakes.
To filter contacts that had marketing touch points for 3 or more consecutive weeks, we compare the current week with the previous and next weeks.
To determine if two events are consecutive, we can use the following formula: If (event_week - 1 week) = previous_event_week and (event_week + 1 week) = next_event_week, then the events are consecutive.
Let's refer to the table below for a clearer understanding.
event_id | contact_id | event_type | lag_event_week | event_week | lead_event_week |
---|---|---|---|---|---|
1 | 1 | webinar | 04/11/2022 00:00:00 | 04/18/2022 00:00:00 | |
2 | 1 | trial_request | 04/11/2022 00:00:00 | 04/18/2022 00:00:00 | 04/25/2022 00:00:00 |
3 | 1 | whitepaper_download | 04/18/2022 00:00:00 | 04/25/2022 00:00:00 |
Since we need to include only contacts that had at least one touch point of type 'trial_request', we add an additional filtering step. To achieve this, we use a subquery with the keyword to select the contact IDs that have 'trial_request' as the event type.
This query ensures that only contacts meeting the consecutive week condition and having at least one 'trial_request' event type are included.
To complete the task, we need to obtain the email addresses of the selected contacts. This requires joining the table with the table and selecting only the email field.
Incredible job – you made it!