10 TikTok SQL Interview Questions

Updated on

January 22, 2024

At TikTok (owned by parent company ByteDance), SQL is used all the damn time for analyzing user engagement data to make the app more addictive, and for studying the efficacy of its video content recommendation algorithm. That is why TikTok almost always evaluates jobseekers on SQL query questions during interviews for Data Science and Data Engineering positions.

So, if you're studying for a SQL Interview, we've collected 10 TikTok SQL interview questions to practice, which are similar to recently asked questions at TikTok – can you solve them?

10 TikTok SQL Interview Questions

SQL Question 1: TikTok Sign-Up Activation Rate

Assume new TikTok users sign up with their emails. They confirm their signup by replying to the text confirmation to activate their accounts. Users may receive multiple text messages for account confirmation until they have confirmed their new account.

A senior analyst wants to know the activation rate of specified users in the table. Write a SQL query to find the activation rate, and round your answer to 2 decimal places.

Definitions:

  • The first table contains user signup details.
  • The second table contains the users' activation information.

Assumptions:

  • The analyst is interested in the activation rate of specific users in the signup table, which may not include all users that could potentially be found in the activation table.
  • For example, user 123 in the signup table may not be in the activation table and vice versa.

To explore the given dataset, and write code online to solve the problem and have your solution graded, click here:

TikTok SQL Interview Question: Sign-up Activation Rate

Table:
| Column Name | Type |
|---|---|
| email_id | integer |
| user_id | integer |
| signup_date | datetime |

Example Input:
| email_id | user_id | signup_date |
|---|---|---|
| 125 | 7771 | 06/14/2022 00:00:00 |
| 236 | 6950 | 07/01/2022 00:00:00 |
| 433 | 1052 | 07/09/2022 00:00:00 |

Table:
| Column Name | Type |
|---|---|
| text_id | integer |
| email_id | integer |
| signup_action | varchar |

Example Input:
| text_id | email_id | signup_action |
|---|---|---|
| 6878 | 125 | Confirmed |
| 6920 | 236 | Not Confirmed |
| 6994 | 236 | Confirmed |

'Confirmed' in signup_action means the user has activated their account and successfully completed the signup process.

Example Output:
confirm_rate
0.67

Example Output Explanation: 67% of users have successfully completed their signup and activated their accounts. The remaining 33% have not yet replied to the text to confirm their signup.

Answer:
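One possible query is sketched below, wrapped in Python's built-in sqlite3 module so it can be run end to end. The table names `emails` and `texts` are assumptions (only the columns are given above), and the sample dates are stored as ISO text for SQLite's benefit. The idea is to count the distinct emails that have at least one 'Confirmed' text and divide by all emails.

```python
import sqlite3

# Assumed table names: `emails` (signups) and `texts` (confirmation messages).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (email_id INTEGER, user_id INTEGER, signup_date TEXT);
CREATE TABLE texts (text_id INTEGER, email_id INTEGER, signup_action TEXT);
INSERT INTO emails VALUES
  (125, 7771, '2022-06-14'), (236, 6950, '2022-07-01'), (433, 1052, '2022-07-09');
INSERT INTO texts VALUES
  (6878, 125, 'Confirmed'), (6920, 236, 'Not Confirmed'), (6994, 236, 'Confirmed');
""")

# LEFT JOIN keeps every signup; only 'Confirmed' texts are matched,
# so unconfirmed emails contribute NULL and are skipped by COUNT(DISTINCT ...).
activation_query = """
SELECT ROUND(1.0 * COUNT(DISTINCT t.email_id) / COUNT(DISTINCT e.email_id), 2)
         AS confirm_rate
FROM emails AS e
LEFT JOIN texts AS t
  ON t.email_id = e.email_id
 AND t.signup_action = 'Confirmed';
"""
confirm_rate = conn.execute(activation_query).fetchone()[0]
print(confirm_rate)  # 0.67
```

Multiplying by 1.0 forces floating-point division; without it, integer division would return 0.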


SQL Question 2: Identify TikTok's Most Active Users

As a Data Analyst for TikTok, you've been asked to identify the users who are the most active on the platform. "Activity" in this context is defined by the number of videos a user uploads.

A "power user" is someone who has uploaded more than 1000 videos. Write a SQL query to list all of the power users, sorted by the number of videos they have posted in descending order.

The company has two tables, one holding users and one holding the videos they uploaded:

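Example Input:
| user_id | username | signup_date |
|---|---|---|
| 1 | user1 | 01/01/2020 |
| 2 | user2 | 02/02/2020 |
| 3 | user3 | 05/05/2020 |
| 4 | user4 | 12/12/2020 |

Example Input:
| video_id | user_id | upload_date |
|---|---|---|
| 1001 | 1 | 01/02/2020 |
| 1002 | 1 | 01/03/2020 |
| 1003 | 2 | 02/03/2020 |
| 1004 | 3 | 03/03/2020 |
| 1005 | 4 | 04/04/2020 |
| 1006 | 4 | 05/04/2020 |
| 1007 | 4 | 05/04/2020 |
| 1008 | 4 | 05/04/2020 |
| 1009 | 3 | 06/04/2020 |
| 1010 | 2 | 07/07/2020 |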

Answer:


This SQL query first performs an inner join on the two tables using user_id as the joining column. It then groups by user_id to get the total count of video_id (which represents the number of videos posted by each user). The HAVING clause filters out any users who have uploaded 1000 or fewer videos. Finally, it orders the remaining users by video count in decreasing order, giving the 'power users' at TikTok who have uploaded the most videos.
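The query described above might look like the following sketch, run through Python's sqlite3 module. The table names `users` and `videos` are assumptions, and the threshold is passed as a parameter only so the tiny sample data can produce a non-empty result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, username TEXT, signup_date TEXT);
CREATE TABLE videos (video_id INTEGER, user_id INTEGER, upload_date TEXT);
INSERT INTO users VALUES
  (1, 'user1', '2020-01-01'), (2, 'user2', '2020-02-02'),
  (3, 'user3', '2020-05-05'), (4, 'user4', '2020-12-12');
INSERT INTO videos VALUES
  (1001, 1, '2020-01-02'), (1002, 1, '2020-01-03'), (1003, 2, '2020-02-03'),
  (1004, 3, '2020-03-03'), (1005, 4, '2020-04-04'), (1006, 4, '2020-05-04'),
  (1007, 4, '2020-05-04'), (1008, 4, '2020-05-04'), (1009, 3, '2020-06-04'),
  (1010, 2, '2020-07-07');
""")

# Join, count videos per user, keep only users above the threshold.
power_user_query = """
SELECT u.user_id, u.username, COUNT(v.video_id) AS video_count
FROM users AS u
JOIN videos AS v ON v.user_id = u.user_id
GROUP BY u.user_id, u.username
HAVING COUNT(v.video_id) > ?
ORDER BY video_count DESC;
"""
power_users = conn.execute(power_user_query, (1000,)).fetchall()
relaxed = conn.execute(power_user_query, (2,)).fetchall()
print(power_users)  # []  (nobody in the sample has more than 1000 videos)
print(relaxed)      # [(4, 'user4', 4)]
```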

SQL Question 3: Can you give an example of a one-to-one relationship between two entities, vs. a one-to-many relationship?

In database schema design, a one-to-one relationship is when each entity is associated with only one instance of the other. For instance, a US citizen's relationship with their social-security number (SSN) is one-to-one because each citizen can only have one SSN, and each SSN belongs to one person.

A one-to-many relationship, on the other hand, is when one entity can be associated with multiple instances of the other entity. An example of this is the relationship between a person and their email addresses - one person can have multiple email addresses, but each email address only belongs to one person.
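As a rough illustration, the two relationships can be expressed with constraints. The schema below is hypothetical (the table and column names are invented for this sketch): a UNIQUE foreign key enforces one-to-one, while a plain foreign key allows one-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT);
-- One-to-one: UNIQUE on person_id allows at most one SSN row per person.
CREATE TABLE ssns (
  ssn TEXT PRIMARY KEY,
  person_id INTEGER NOT NULL UNIQUE REFERENCES people(person_id)
);
-- One-to-many: person_id is not unique, so one person can own many addresses.
CREATE TABLE email_addresses (
  email TEXT PRIMARY KEY,
  person_id INTEGER NOT NULL REFERENCES people(person_id)
);
INSERT INTO people VALUES (1, 'Ada');
INSERT INTO ssns VALUES ('000-00-0001', 1);
INSERT INTO email_addresses VALUES ('ada@example.com', 1), ('ada@work.example', 1);
""")

# A second SSN for the same person violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO ssns VALUES ('000-00-0002', 1)")
    second_ssn_allowed = True
except sqlite3.IntegrityError:
    second_ssn_allowed = False

email_count = conn.execute(
    "SELECT COUNT(*) FROM email_addresses WHERE person_id = 1"
).fetchone()[0]
print(second_ssn_allowed, email_count)  # False 2
```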

SQL Question 4: Second Day Confirmation

New users on TikTok sign up using their email addresses, and upon sign-up, each user receives a text message confirmation to activate their account. Write a SQL query to output the user IDs of those who did not confirm their sign-up on the first day, but confirmed on the second day.

To practice this problem interactively, and explore the input data yourself, visit the 2nd Day Confirmation Problem on DataLemur:

TikTok SQL Interview Question: 2nd Day Confirmation

Table:
| Column Name | Type |
|---|---|
| email_id | integer |
| user_id | integer |
| signup_date | datetime |

Example Input:
| email_id | user_id | signup_date |
|---|---|---|
| 125 | 7771 | 06/14/2022 00:00:00 |
| 433 | 1052 | 07/09/2022 00:00:00 |

Table:
| Column Name | Type |
|---|---|
| text_id | integer |
| email_id | integer |
| signup_action | string ('Confirmed', 'Not confirmed') |
| action_date | datetime |

Hint: action_date refers to the date when users activated their accounts and confirmed their sign-up through text messages.

Example Input:
| text_id | email_id | signup_action | action_date |
|---|---|---|---|
| 6878 | 125 | Confirmed | 06/14/2022 00:00:00 |
| 6997 | 433 | Not Confirmed | 07/09/2022 00:00:00 |
| 7000 | 433 | Confirmed | 07/10/2022 00:00:00 |

Example Output:
user_id
1052

Only User 1052 confirmed their sign-up on the second day.

Answer:
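One way to approach this is to join signups to confirmation texts and keep the rows where the confirmation happened exactly one day after signup. The sketch below runs in Python's sqlite3 module; the table names `emails` and `texts` are assumptions, and the dates are stored as ISO text so SQLite's date() function can shift them. In PostgreSQL the comparison would more likely be written as `t.action_date = e.signup_date + INTERVAL '1 day'`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (email_id INTEGER, user_id INTEGER, signup_date TEXT);
CREATE TABLE texts (
  text_id INTEGER, email_id INTEGER, signup_action TEXT, action_date TEXT
);
INSERT INTO emails VALUES
  (125, 7771, '2022-06-14 00:00:00'), (433, 1052, '2022-07-09 00:00:00');
INSERT INTO texts VALUES
  (6878, 125, 'Confirmed', '2022-06-14 00:00:00'),
  (6997, 433, 'Not Confirmed', '2022-07-09 00:00:00'),
  (7000, 433, 'Confirmed', '2022-07-10 00:00:00');
""")

# Keep confirmations whose date is the signup date plus one day.
second_day_query = """
SELECT DISTINCT e.user_id
FROM emails AS e
JOIN texts AS t ON t.email_id = e.email_id
WHERE t.signup_action = 'Confirmed'
  AND date(t.action_date) = date(e.signup_date, '+1 day');
"""
second_day_users = conn.execute(second_day_query).fetchall()
print(second_day_users)  # [(1052,)]
```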


For a step-by-step solution to this problem, go here.

SQL Question 5: What's a database view?

A database view is a virtual table that is created based on the results of a SELECT statement, and provides you a customized, read-only version of your data that you can query just like a regular table.

You'd want to use a view for a few reasons:

  • views let you create simpler versions of your data based on who's running the query (such as hiding unimportant columns/rows from business analysts if they're just random noisy artifacts of your data infrastructure pipelines)
  • views can help you comply with information security requirements by hiding sensitive data from certain users (important for regulated industries like government and healthcare!)
  • materialized views can improve performance for complicated queries by pre-computing the results and storing them (which is often faster than re-executing the original query)... just be careful, since a materialized view doesn't update when the underlying data changes until it is refreshed! An ordinary view stores no results and simply re-runs its query each time.
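A minimal sketch of an ordinary (non-materialized) view, using a hypothetical `app_users` table: the view hides the email column, and because it stores no data of its own, it reflects rows inserted after it was created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_users (user_id INTEGER, username TEXT, email TEXT);
INSERT INTO app_users VALUES (1, 'user1', 'user1@example.com');
-- The view exposes only the non-sensitive columns.
CREATE VIEW app_users_public AS
  SELECT user_id, username FROM app_users;
""")

# Column names visible through the view.
cursor = conn.execute("SELECT * FROM app_users_public")
columns = [col[0] for col in cursor.description]

# A row added to the base table shows up in the view on the next query.
conn.execute("INSERT INTO app_users VALUES (2, 'user2', 'user2@example.com')")
rows = conn.execute("SELECT * FROM app_users_public ORDER BY user_id").fetchall()
print(columns)  # ['user_id', 'username']
print(rows)     # [(1, 'user1'), (2, 'user2')]
```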

SQL Question 6: Calculate Average Video Duration on TikTok

On TikTok, each user can post several videos. For business decisions, it's often necessary to know the average duration of these videos to better understand user engagement. For instance, if the average video duration is short, it could indicate that users on the platform prefer shorter, more concise content. Alternatively, longer average video lengths could suggest that users enjoy or are more engaged with longer-form content. Calculate the average video duration for each TikTok user using the provided database tables.

Provided below is a snapshot of your 'users' table and 'videos' table:

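Example Input:
| user_id | username | signup_date |
|---|---|---|
| 101 | user1 | 06/01/2020 |
| 102 | user2 | 06/03/2020 |
| 103 | user3 | 06/05/2020 |

Example Input:
| video_id | user_id | upload_date | video_length_seconds |
|---|---|---|---|
| 201 | 101 | 06/08/2022 | 60 |
| 202 | 101 | 06/10/2022 | 120 |
| 203 | 102 | 06/18/2022 | 90 |
| 204 | 103 | 07/26/2022 | 100 |
| 205 | 103 | 07/05/2022 | 120 |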

Answer:


This script first joins the 'videos' table and 'users' table on 'user_id', then calculates the average video length for each user. The AVG() function is used to calculate the average video length. GROUP BY is used to group the results by each unique user.
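The join, AVG() and GROUP BY described above could be written roughly as follows, using the 'users' and 'videos' tables and the sample rows from this question (dates rewritten as ISO text for SQLite).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, username TEXT, signup_date TEXT);
CREATE TABLE videos (
  video_id INTEGER, user_id INTEGER, upload_date TEXT, video_length_seconds INTEGER
);
INSERT INTO users VALUES
  (101, 'user1', '2020-06-01'), (102, 'user2', '2020-06-03'),
  (103, 'user3', '2020-06-05');
INSERT INTO videos VALUES
  (201, 101, '2022-06-08', 60), (202, 101, '2022-06-10', 120),
  (203, 102, '2022-06-18', 90), (204, 103, '2022-07-26', 100),
  (205, 103, '2022-07-05', 120);
""")

# One output row per user, with the mean length of that user's videos.
avg_query = """
SELECT u.user_id, u.username, AVG(v.video_length_seconds) AS avg_video_length
FROM videos AS v
JOIN users AS u ON u.user_id = v.user_id
GROUP BY u.user_id, u.username
ORDER BY u.user_id;
"""
averages = conn.execute(avg_query).fetchall()
print(averages)  # [(101, 'user1', 90.0), (102, 'user2', 90.0), (103, 'user3', 110.0)]
```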

To practice a similar question, try this interactive Twitter Tweets' Rolling Averages Question, which also calculates averages based on user content, or this TikTok Signup Activation Rate Question, which also derives metrics from TikTok data.

SQL Question 7: In SQL, are NULL values the same as a zero or blank space?

NULLs are NOT the same as zero or blank spaces in SQL. NULLs are used to represent a missing value or the absence of a value, whereas zero and blank space are legitimate values.

It's important to handle NULLs carefully, because they can mess up your analysis very easily. For example, if you compare a NULL value using the = operator, the result will always be NULL (because, just like Drake, nothing can be compared to NULL). That's why many data analyses in SQL start by handling NULLs, for example by replacing them with a default value using the COALESCE() function.
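A short demonstration of these rules, using Python's sqlite3 module and a hypothetical `video_views` table. Python's None stands for SQL NULL in the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing with = yields NULL; IS NULL is the test that works.
comparisons = conn.execute(
    "SELECT NULL = NULL, NULL = 0, NULL IS NULL, COALESCE(NULL, 0)"
).fetchone()
print(comparisons)  # (None, None, 1, 0)

# Hypothetical table with one missing value and one genuine zero.
conn.executescript("""
CREATE TABLE video_views (video_id INTEGER, view_count INTEGER);
INSERT INTO video_views VALUES (1, 10), (2, NULL), (3, 0);
""")
equals_null = conn.execute(
    "SELECT COUNT(*) FROM video_views WHERE view_count = NULL"
).fetchone()[0]
is_null = conn.execute(
    "SELECT COUNT(*) FROM video_views WHERE view_count IS NULL"
).fetchone()[0]
total_views = conn.execute(
    "SELECT SUM(COALESCE(view_count, 0)) FROM video_views"
).fetchone()[0]
print(equals_null, is_null, total_views)  # 0 1 10
```

The `= NULL` filter matches no rows at all, which is the kind of silent mistake that makes NULL handling worth checking early.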

SQL Question 8: Analyzing User Behavior and Content Interactions on TikTok

You have been given access to the TikTok database and you are tasked with the following:

TikTok has two primary tables -- Users and Videos. Each row in the Users table represents a unique user on the platform, while each row in the Videos table represents a unique video that has been uploaded on the platform. A video can be uploaded by a user, and the same user can 'like' or 'share' other videos, including their own.

Write a SQL query that shows the top 5 Users who have uploaded the videos that have received the most 'likes'. The output should display the User ID, the total number of videos they have uploaded, and the total number of 'likes' their videos have collectively received.

The two tables are structured as follows:

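Example Input:
| user_id | username | country | join_date |
|---|---|---|---|
| 1 | 'user1' | 'USA' | '2021-01-01' |
| 2 | 'user2' | 'Canada' | '2021-02-01' |
| 3 | 'user3' | 'UK' | '2021-01-31' |
| 4 | 'user4' | 'USA' | '2021-01-30' |
| 5 | 'user5' | 'Canada' | '2021-01-15' |

Example Input:
| video_id | upload_date | user_id | video_likes |
|---|---|---|---|
| 101 | '2021-01-01' | 1 | 500 |
| 102 | '2021-02-01' | 2 | 1000 |
| 103 | '2021-02-01' | 1 | 1500 |
| 104 | '2021-03-01' | 3 | 2000 |
| 105 | '2021-03-01' | 4 | 250 |
| 106 | '2021-04-01' | 5 | 5000 |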

Answer:


This SQL query first joins the 'Users' and 'Videos' tables on the user_id field so that information about which user uploaded each video (from the 'Users' table) can be combined with information about each video (from the 'Videos' table).

The GROUP BY clause is then used to group the combined table by user_id, so that the subsequent aggregation functions (COUNT and SUM) can be applied for each user. The COUNT function counts the total number of videos uploaded by the user, while the SUM function sums the total number of likes received by all of their videos.

Finally, the ORDER BY clause sorts the table based on the total number of likes, in descending order, and the LIMIT clause displays only the top 5 users.
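Put together, the steps above might look like this sketch, run through Python's sqlite3 module on the sample rows. The secondary sort on user_id is an addition for this example, so that users tied on likes come back in a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, username TEXT, country TEXT, join_date TEXT);
CREATE TABLE videos (
  video_id INTEGER, upload_date TEXT, user_id INTEGER, video_likes INTEGER
);
INSERT INTO users VALUES
  (1, 'user1', 'USA', '2021-01-01'), (2, 'user2', 'Canada', '2021-02-01'),
  (3, 'user3', 'UK', '2021-01-31'), (4, 'user4', 'USA', '2021-01-30'),
  (5, 'user5', 'Canada', '2021-01-15');
INSERT INTO videos VALUES
  (101, '2021-01-01', 1, 500), (102, '2021-02-01', 2, 1000),
  (103, '2021-02-01', 1, 1500), (104, '2021-03-01', 3, 2000),
  (105, '2021-03-01', 4, 250), (106, '2021-04-01', 5, 5000);
""")

# Join, aggregate per user, sort by total likes, keep the top 5.
top_users_query = """
SELECT u.user_id,
       COUNT(v.video_id)  AS total_videos,
       SUM(v.video_likes) AS total_likes
FROM users AS u
JOIN videos AS v ON v.user_id = u.user_id
GROUP BY u.user_id
ORDER BY total_likes DESC, u.user_id
LIMIT 5;
"""
top_users = conn.execute(top_users_query).fetchall()
print(top_users)
# [(5, 1, 5000), (1, 2, 2000), (3, 1, 2000), (2, 1, 1000), (4, 1, 250)]
```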

Since joins come up frequently during SQL interviews, try an interactive Spotify JOIN SQL question.

SQL Question 9: Video Interaction Analysis

Using the data from the video and user interactions on TikTok, calculate the average, maximum, and minimum durations of videos watched by users, rounded to the nearest whole number. Additionally, calculate the square root of the total number of likes given by users and present it as totalLikesSQRT. Assume we only have data for a single day.

Here are the data tables:

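Example Input:
| video_id | duration_secs |
|---|---|
| 001 | 60 |
| 002 | 45 |
| 003 | 75 |
| 004 | 120 |
| 005 | 30 |

Example Input:
| user_id | video_id | watched_duration_secs |
|---|---|---|
| 123 | 001 | 60 |
| 265 | 002 | 30 |
| 362 | 003 | 55 |
| 192 | 004 | 120 |
| 981 | 005 | 25 |

Example Input:
| user_id | video_id | liked |
|---|---|---|
| 123 | 001 | TRUE |
| 265 | 002 | FALSE |
| 362 | 003 | TRUE |
| 192 | 004 | TRUE |
| 981 | 005 | TRUE |

Example Output:
| avg_watched_duration | max_watched_duration | min_watched_duration | totalLikesSQRT |
|---|---|---|---|
| 58 | 120 | 25 | 2 |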

Answer:


The above SQL code first calculates the average, maximum, and minimum watched durations, rounding the average to the nearest whole number. It joins the watch data with the user_likes table on both user_id and video_id to get the liked videos for each user. The total number of likes is then summed, and its square root is returned as totalLikesSQRT. Note that a CASE statement is used inside the SUM to count 'TRUE' values in the 'liked' column.
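A sketch of the approach described above, run in Python's sqlite3 module. The name `user_watch` for the watch table is an assumption (`user_likes` comes from the explanation), and `liked` is stored as the text 'TRUE'/'FALSE'. Because not every SQLite build ships a SQRT function, the example registers one from Python's math module; most other databases provide SQRT natively.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register SQRT so the query works even if SQLite lacks built-in math functions.
conn.create_function("SQRT", 1, math.sqrt)
conn.executescript("""
CREATE TABLE user_watch (user_id INTEGER, video_id TEXT, watched_duration_secs INTEGER);
CREATE TABLE user_likes (user_id INTEGER, video_id TEXT, liked TEXT);
INSERT INTO user_watch VALUES
  (123, '001', 60), (265, '002', 30), (362, '003', 55),
  (192, '004', 120), (981, '005', 25);
INSERT INTO user_likes VALUES
  (123, '001', 'TRUE'), (265, '002', 'FALSE'), (362, '003', 'TRUE'),
  (192, '004', 'TRUE'), (981, '005', 'TRUE');
""")

# Aggregate watch durations and count likes with CASE inside SUM.
stats_query = """
SELECT ROUND(AVG(w.watched_duration_secs)) AS avg_watched_duration,
       MAX(w.watched_duration_secs)        AS max_watched_duration,
       MIN(w.watched_duration_secs)        AS min_watched_duration,
       SQRT(SUM(CASE WHEN l.liked = 'TRUE' THEN 1 ELSE 0 END)) AS totalLikesSQRT
FROM user_watch AS w
JOIN user_likes AS l
  ON l.user_id = w.user_id AND l.video_id = w.video_id;
"""
stats = conn.execute(stats_query).fetchone()
print(stats)  # (58.0, 120, 25, 2.0)
```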

In case this problem was difficult, strengthen your SQL foundations with this free SQL tutorial which has 30+ lessons including one on SQL math functions.

SQL Question 10: Videos Engagement Analysis

You are working as a Data Analyst for TikTok. Some videos go viral suddenly after a period of time. Your task is to find, for each User_Id, the video (Video_Id) with the maximum number of likes (Likes) per day (Date). Note that some users might have multiple videos in a day, and the result needs to show only the first uploaded video in case of a tie on likes count.

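Example Input:
| User_Id | Video_Id | Date | Likes |
|---|---|---|---|
| 101 | VV567 | 2022-10-01 | 150 |
| 101 | VV234 | 2022-10-01 | 80 |
| 101 | VV890 | 2022-10-01 | 150 |
| 102 | VV101 | 2022-10-01 | 300 |
| 102 | VV111 | 2022-10-01 | 200 |
| 101 | VV123 | 2022-10-02 | 100 |
| 101 | VV456 | 2022-10-02 | 120 |
| 102 | VV789 | 2022-10-02 | 500 |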

Answer:


This SQL query first assigns a unique row number within each group of User_Id and Date based on descending number of likes and ascending order of Video Id (to account for the same Like counts within a day for a user). This is achieved by using the Window Function 'ROW_NUMBER()'.

Then, we select only those rows where the row number is 1, i.e. we only pick the first video in case of tie on likes count. Hence, we get a row for each User_Id, Date representing the video with highest Likes count uploaded first for the day.
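The ROW_NUMBER() approach described above could be sketched as follows, again via Python's sqlite3 module (window functions need SQLite 3.25 or newer). The table name `video_likes` is hypothetical, and ascending Video_Id is used as a stand-in for upload order, as the explanation does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video_likes (User_Id INTEGER, Video_Id TEXT, Date TEXT, Likes INTEGER);
INSERT INTO video_likes VALUES
  (101, 'VV567', '2022-10-01', 150), (101, 'VV234', '2022-10-01', 80),
  (101, 'VV890', '2022-10-01', 150), (102, 'VV101', '2022-10-01', 300),
  (102, 'VV111', '2022-10-01', 200), (101, 'VV123', '2022-10-02', 100),
  (101, 'VV456', '2022-10-02', 120), (102, 'VV789', '2022-10-02', 500);
""")

# Rank each user's videos per day; ties on Likes fall back to Video_Id.
best_video_query = """
SELECT User_Id, Date, Video_Id, Likes
FROM (
  SELECT User_Id, Video_Id, Date, Likes,
         ROW_NUMBER() OVER (
           PARTITION BY User_Id, Date
           ORDER BY Likes DESC, Video_Id ASC
         ) AS rn
  FROM video_likes
) AS ranked
WHERE rn = 1
ORDER BY User_Id, Date;
"""
best_videos = conn.execute(best_video_query).fetchall()
for row in best_videos:
    print(row)
# (101, '2022-10-01', 'VV567', 150)
# (101, '2022-10-02', 'VV456', 120)
# (102, '2022-10-01', 'VV101', 300)
# (102, '2022-10-02', 'VV789', 500)
```

User 101 has two videos tied at 150 likes on 2022-10-01; the tie-breaker picks VV567.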

To solve a similar window function question on DataLemur's free interactive coding environment, try this Google SQL Interview Question:

Google SQL Interview Question

Preparing For The TikTok SQL Interview

Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the TikTok SQL interview is to solve as many practice SQL interview questions as you can! In addition to solving the earlier TikTok SQL interview questions, you should also solve the 200+ tricky SQL questions on DataLemur which come from companies like Google, Facebook, Microsoft and Amazon.

Each DataLemur SQL question has multiple hints and a fully explained answer, along with a discussion board to see how others solved it. Most importantly, there is an online SQL code editor so you can write your SQL query right in the browser and have it checked.

To prep for the TikTok SQL interview, it is also helpful to solve SQL problems from other companies.

However, if your SQL foundations are weak, don't worry about diving straight into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.

This tutorial covers things like creating summary stats with GROUP BY and manipulating date/time data – both of these come up routinely during SQL job interviews at TikTok.

TikTok Data Science Interview Tips

What Do TikTok Data Science Interviews Cover?

For the TikTok Data Science Interview, in addition to SQL query questions, the other types of questions to prepare for are:

How To Prepare for TikTok Data Science Interviews?

The best way to prepare for TikTok Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from FAANG & startups
  • A Refresher covering Stats, ML, & Data Case Studies
  • Great Reviews (900+ reviews, 4.5-star rating)

Ace the Data Science Interview Book on Amazon

You should also read the blog "5 TikTok Data Science Interview Questions & Interview Prep Guide".
