10 Snap SQL Interview Questions

Updated on

December 21, 2023

At Snapchat, SQL does the heavy lifting for analyzing in-app behavior for product improvements and to help advertisers analyze their Return-on-ad-spend (ROAS) for Snap ad campaigns. Because of this, Snap LOVES to ask SQL coding questions during interviews for Data Science, Data Engineering and Data Analytics jobs. That's why we've curated 9 Snap SQL interview questions to practice, which are similar to recently asked questions at Snap.

Snapchat SQL Interview Questions

9 Snap SQL Interview Questions

SQL Question 1: Sending vs. Opening Snaps

Assume you're given tables with information on Snapchat users, including their ages and time spent sending and opening snaps.

Write a query to obtain a breakdown of the time spent sending vs. opening snaps as a percentage of total time spent on these activities grouped by age group. Round the percentage to 2 decimal places in the output.

Notes:

  • Calculate the following percentages:
    • time spent sending / (Time spent sending + Time spent opening)
    • Time spent opening / (Time spent sending + Time spent opening)
  • To avoid integer division in percentages, multiply by 100.0 and not 100.

Interactively explore the dataset, and write code directly in the browser to solve this Snapchat Join SQL question:

Snapchat JOIN SQL interview question

Table
Column NameType
activity_idinteger
user_idinteger
activity_typestring ('send', 'open', 'chat')
time_spentfloat
activity_datedatetime
Example Input
activity_iduser_idactivity_typetime_spentactivity_date
7274123open4.5006/22/2022 12:00:00
2425123send3.5006/22/2022 12:00:00
1413456send5.6706/23/2022 12:00:00
1414789chat11.0006/25/2022 12:00:00
2536456open3.0006/25/2022 12:00:00
Table
Column NameType
user_idinteger
age_bucketstring ('21-25', '26-30', '31-25')
Example Input
user_idage_bucket
12331-35
45626-30
78921-25
Example Output
age_bucketsend_percopen_perc
26-3065.4034.60
31-3543.7556.25
Example Output Explanation

Using the age bucket 26-30 as example, the time spent sending snaps was 5.67 and the time spent opening snaps was 3.

To calculate the percentage of time spent sending snaps, we divide the time spent sending snaps by the total time spent on sending and opening snaps, which is 5.67 + 3 = 8.67.

So, the percentage of time spent sending snaps is 5.67 / (5.67 + 3) = 65.4%, and the percentage of time spent opening snaps is 3 / (5.67 + 3) = 34.6%.

Answer:


To see a step-by-step explanation of this Snapchat problem, click here.

SQL Question 2: Identify the Top Active Snap Users

Snap (the parent company of Snapchat) wants to identify its VIP users - the users who are most active on the platform. They decide to define a VIP user as someone who sends more than 100 snaps per week. As an interviewee, your task is to write a SQL query that will generate a list of VIP users for the past 4 weeks.

Teenage Girl Taking a Selfie for Snapchat

Example Input:

snap_iduser_idsnap_date
1035672022-09-01
2065032022-09-02
3175672022-09-02
4017842022-09-03
5625672022-09-03
618782022-09-03
7255032022-09-04
8517842022-09-05
9985672022-09-05

Assume today's date is '2022-09-06'.

Answer:


This query starts by selecting the user_id and count of snaps from the snaps table for each user who sent a snap in the past 4 weeks. Then, it checks the count and filters out any users who sent less than or equal to 400 snaps in that time (since 4 weeks equals nearly four we are considering). The result is a list of only those users who sent more than 400 snaps in the past 4 weeks, essentially our VIP users according to Snap's definition.

In case this marketing problem was interesting, checkout how this blog on how marketing analysts use SQL.

SQL Question 3: Analyzing Daily Snap Views

Snap Inc. has a table, , that records the daily number of views each snap (an ephemeral photo or video capture) receives. This table's columns are (which identifies the snap), (the date the snap was viewed), and (the number of views that day).

Your task is to write an SQL query that determines the running total of views each snap has received, as well as the number of views each snap received in the past week (7 days).

Example Input:
snap_iddateviews
1012022-08-015000
1012022-08-024500
1012022-08-035500
1022022-08-016000
1022022-08-026500
Example Output:
snap_iddatetotal_views_to_dateweekly_views
1012022-08-0150005000
1012022-08-0295009500
1012022-08-031500015000
1022022-08-0160006000
1022022-08-021250012500

Answer:


This query first calculates the running total of the views for each snap using the function over a window that's defined by the and ordered by . The window function ensures that the sum is computed for each snap individually, and the clause ensures the sum is cumulative up to the current date.

The query then calculates the number of views each snap received in the past week. It uses a similar approach but specifies a different window frame: the current row and the 6 preceding rows. This results in a 7-day window of views, which is equivalent to the past week.

Note: This solution assumes that the table contains an entry for each day for each snap. If a snap doesn't receive any views on a particular day, and thus has no row for that day in the table, the calculation may include data from more than 7 days ago. You may need to preprocess the data to include zero-view rows for such days, or adjust the window frame criteria to suit your data.

p.s. Window functions show up super often during Snap SQL interviews, so practice the 27+ window function questions on DataLemur

DataLemur SQL Questions

SQL QUESTION 4: What does do?

combines the results from two or more SELECT statements, but only returns the rows that are exactly the same between the two sets.

For example, say you were doing an HR Analytics project for Snap, and had access to Snap's contractors and employees data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables. You could use operator to find all employees who also show up in the contractors table:


SQL Question 5: Filtering active users on Snap

Snap wants to understand user engagement, specifically they want to identify users who have been active in the last 30 days (as of 2021-08-31), have more than 5 friends, and have sent at least 20 messages.

Example Input:
user_idlast_active_datefriend_countmessage_count
1232021-08-301030
2652021-08-01440
3622021-07-011050
1922021-08-01619
9812021-08-30620
Example Output:
user_id
123
981

Answer:

Here's the SQL query which selects users based on the conditions:


In this query, we have filtered out the users based on three conditions:

  • - This checks whether the user has been active in the last 30 days.
  • - This condition confirms that the user has more than 5 friends.
  • - This confirms that the user has sent at least 20 messages.

The users satisfying all three conditions are selected.

SQL QUESTION 6: What's the major difference between and ?

Before we address vs. question, let's address the clause which is used in a statement to group rows into a set of summary rows. It's typically used in conjunction with aggregate functions such as:

  • `MIN

The clause is used to filter the groups created by the clause. It's similar to the clause, but it is used to specify conditions on the groups created by the clause, rather than on the individual rows of the table.

For example:


This query retrieves the total salary for each department and groups the rows by department. The clause then filters the groups to include only Snap departments where the total salary is greater than $1 million

Note that the clause must be used after the clause and can only reference columns listed in the clause or aggregated columns selected by the statement.

SQL Question 7: Click-Through-Rate (CTR) for Ads on Snap

To track the performance of Snapchat ad campaigns, it is vital to analyze the click-through-rate (CTR). The click-through-rate is calculated by dividing the number of users who clicked on an ad by the number of total impressions (times the ad was shown).

Given the following data about Snap's ads, can you calculate the click-through-rate for each campaign?

Sample Input:
camp_idcamp_name
1Summer Sale
2Winter Wonderland
3Back to School
Sample Input:
impression_iduser_idtimestampcamp_id
1001450106/15/2022 17:20:001
1012450206/16/2022 15:30:001
1023450306/17/2022 10:25:002
1034450406/18/2022 12:45:002
1045450506/19/2022 23:33:003
Sample Input:
click_iduser_idtimestampcamp_id
2011450106/15/2022 17:21:001
2022450206/17/2022 15:31:301
2033450306/17/2022 10:26:002

Answer:


This SQL query joins the table with the table and table, based on the campaign id. It then calculates the CTR as the total count of distinct click ids divided by the total count of distinct impression ids for each campaign. The function is used to convert the number of clicks to a floating point number to allow for decimal division.

p.s. if you happen to be a Machine Learning nerd, you might like this related article from Snap's engineering blog on how they do ML for Ad Rankings which includes Ad CTR as a ranking signal in terms of ad relevancy

Snapchat ML Ad Ranking Algorithm

SQL QUESTION 8: What's a primary key?

The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.

For example, say you had stored some Snapchat ad campaign data:


The CampaignID column is used to uniquely identify each row in the table, and the constraint ensures that there are no duplicate CampaignID values. This helps to maintain the accuracy of the data by preventing duplicate rows. The primary key is also an important part of the table because it enables you to easily identify and reference specific campaigns in your Snap Ad data. You can use it to join to other tables in the database, such as a table with data on the results of the campaigns.

SQL Question 9: Analyzing User Engagement on Snap

Snap, the parent company of Snapchat, has numerous users who engage with different activities on the app platform. They would like to have an analysis of user engagement in terms of the total number of hours spent on the app by month and by activity.

Consider the following data available in the table "user_activity":

Example Input:
activity_iduser_idactivity_dateactivity_namehours_spent
101101006/08/2022Videochat1.5
102205006/10/2022Snap Map2.0
103101006/18/2022Stories0.5
104415007/26/2022Snap Map2.5
105205007/05/2022Videochat3.0

The question is to form a SQL query that will provide the total hours spent on each activity per month.

Answer:


This SQL query will group the data by the month in which the activity was performed and by the name of the activity, summing the total number of hours spent on each activity. The result will provide an overview of which activities are most engaging for users within each month.

Example Output:
monactivity_namesum
6Videochat1.5
6Snap Map2.0
6Stories0.5
7Snap Map2.5
7Videochat3.0

Please note: The SQL syntax for extracting the month might be different depending on the SQL dialect used. The above SQL is based on PostgreSQL.

SQL Question 10: Join and Aggregate User and Friendship Data

As a data analyst at Snap, you are given access to two tables - and . The table contains information about the users such as , , and . The table, on the other hand, keeps track of all friendships, specifying the who sent the friend request, who received it and the of the friend request.

Write a SQL query to find all users who have signed up in the last 30 days and have sent at least one friend request which has been accepted.

Example Input:
user_iduser_namesign_up_datelast_log_in_date
1John Doe2022-09-01 00:00:002022-09-14 00:00:00
2Jane Doe2022-09-10 00:00:002022-09-15 00:00:00
3Harry Potter2022-08-28 00:00:002022-09-16 00:00:00
4Hermione Granger2022-08-20 00:00:002022-09-15 00:00:00
Example Input:
user_id1user_id2status
12Accepted
13Pending
34Accepted
24Rejected

Answer:


This SQL query does a join on the and table on the common . It then filters for recent users who have signed up in the last 30 days. From this result, it further filters for users with at least one friend request accepted. We are using function to calculate date 30 days ago from the current date and clause to filter grouped data.

Snap SQL Interview Tips

The key to acing a Snap SQL interview is to practice, practice, and then practice some more! Beyond just solving the above Snap SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Netflix, Airbnb, and Amazon.

DataLemur SQL Interview Questions

Each DataLemur SQL question has hints to guide you, fully explained answers along with a discussion board to see how others solved it and crucially, there's an online SQL coding environment so you can easily right in the browser your SQL query answer and have it executed.

To prep for the Snap SQL interview you can also be useful to solve SQL questions from other consumer-tech/ ad-tech companies like:

However, if your SQL skills are weak, forget about diving straight into solving questions – refresh your SQL knowledge with this interactive SQL tutorial.

DataLemur SQL Tutorial for Data Science

This tutorial covers things like handling date/timestamp data and INTERCEPT/EXCEPT – both of these come up often in Snap SQL assessments.

Snap Data Science Interview Tips

What Do Snap Data Science Interviews Cover?

For the Snap Data Science Interview, in addition to SQL query questions, the other types of questions which are covered:

How To Prepare for Snap Data Science Interviews?

To prepare for Snap Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from FAANG (FB, Apple, Amazon, Netflix, Google)
  • a crash course covering Product Analytics, SQL & ML
  • over 900+ reviews on Amazon & 4.5-star rating

Acing Data Science Interview

© 2024 DataLemur, Inc

Career Resources

Free 9-Day Data Interview Crash CourseFree SQL Tutorial for Data AnalyticsSQL Interview Cheat Sheet PDFUltimate SQL Interview GuideAce the Data Job Hunt Video CourseAce the Data Science InterviewBest Books for Data Analysts