11 Twitter SQL Interview Questions (Updated 2025)

(Ex-Facebook & Best-Selling Data Science Author)

Updated on

January 16, 2025

Twitter employees use SQL to analyze tweet patterns, identifying trending topics and user engagement metrics to enhance the user experience. They also extract relevant data on user demographics and interests to help business clients optimize their ad campaigns, which is why Twitter includes SQL questions during interviews for Data Science, Data Engineering, and Data Analytics jobs.

So, to help you study, here's 11 Twitter SQL interview questions – can you solve them?

Twitter SQL Interview Questions

11 Twitter SQL Interview Questions

SQL Question 1: Histogram of Tweets

This is the same question as problem #6 in the SQL Chapter of Ace the Data Science Interview!

Assume you're given a table Twitter tweet data, write a query to obtain a histogram of tweets posted per user in 2022. Output the tweet count per user as the bucket and the number of Twitter users who fall into that bucket.

In other words, group the users by the number of tweets they posted in 2022 and count the number of users in each group.

Table:

Column Name	Type
tweet_id	integer
user_id	integer
msg	string
tweet_date	timestamp

Example Input:

tweet_id	user_id	msg	tweet_date
214252	111	Am considering taking Tesla private at $420. Funding secured.	12/30/2021 00:00:00
739252	111	Despite the constant negative press covfefe	01/01/2022 00:00:00
846402	111	Following @NickSinghTech on Twitter changed my life!	02/14/2022 00:00:00
241425	254	If the salary is so competitive why won’t you tell me what it is?	03/01/2022 00:00:00
231574	148	I no longer have a manager. I can't be managed	03/23/2022 00:00:00

Example Output:

tweet_bucket	users_num
1	2
2	1

Answer:

To see an additional answer go to DataLemur's FREE interactive coding environment!

SQL Question 2: User Session Activity

This is the same question as problem #24 in the SQL Chapter of Ace the Data Science Interview!

Assume you are given the table containing Twitter user session activities.

Write a query that ranks users according to their total session durations (in minutes) in descending order for each session type between the start date (2022-01-01) and the end date (2022-02-01).

Output the user ID, session type, and the ranking of the total session duration.

Table:

Column Name	Type
session_id	integer
user_id	integer
session_type	string ("like", "reply", "retweet")
duration	integer (in minutes)
start_date	timestamp

Example Input:

session_id	user_id	session_type	duration	start_date
6368	111	like	3	12/25/2021 12:00:00
1742	111	retweet	6	01/02/2022 12:00:00
8464	222	reply	8	01/16/2022 12:00:00
7153	111	retweet	5	01/28/2022 12:00:00
3252	333	reply	15	01/10/2022 12:00:00

Example Output:

user_id	session_type	ranking
333	reply	1
222	reply	2
111	retweet	1

Answer:

To see how to solve this question using a subquery go check out this Twitter SQL Interview Question on Data Lemur!

SQL Question 3: In SQL, are NULLs treated the same as zero's and blank spaces?

A NULL value represents a missing or unknown value. It is different from a zero or a blank space, which are actual values. It is important to handle NULL values properly in SQL because they can cause unexpected results if not treated correctly.

Twitter SQL Interview Questions

SQL Question 4: Analyze User Tweet Activity

You are given the following tables:

table that contains information of the users:

Example Input:

user_id	username
1	@jack
2	@elonmusk
3	@billgates

table that contains information of the tweets:

Example Input:

tweet_id	user_id	content	timestamp
101	1	"Hello Twitter!"	2022-08-01 10:00:00
102	2	"Launch successful!"	2022-08-01 11:00:00
103	1	"Acquiring new company..."	2022-08-02 09:30:00
104	3	"Reading a great book."	2022-08-02 15:00:00
105	2	"Mars, here we come!"	2022-08-03 13:00:00

Write a SQL query to find the total number of tweets and the rank of total tweets for each user as of a specific timestamp (for example, '2022-08-02 12:00:00'). Order results by the total number of tweets in descending order, and if two users have the same number of tweets, order them by username in ascending order.

Example Output:

username	num_tweets	rank
@jack	2	1
@elonmusk	1	2
@billgates	1	3

Answer:

In this query, we first join the table and table on and then filter for tweets with timestamp less than or equal to the given date and time. The result is then grouped by . Within the select clause, we count the number of tweets per user and calculate the rank of each user based on the number of tweets they made. If the number of tweets is the same, the users are ordered by their username in ascending order. Finally, we order the overall result by the number of tweets in descending order, and by username in ascending order if the number of tweets is the same.

Pro Tip: Window functions are a frequent SQL interview topic, so practice every window function problem on DataLemur

SQL Question 5: Why are foreign key's important in databases?

A foreign key is a field in a database table that serves as a reference to the primary key of another table, allowing for the creation of a relationship between the two tables.

For a concrete example, let's inspect employee data from Twitter's HR database:

:

employee_id	first_name	last_name	manager_id
1	Aubrey	Graham	3
2	Marshal	Mathers	3
3	Dwayne	Carter	4
4	Shawn	Carter

In this table, is the primary key, and is used to uniquely identify each row.

could be a foreign key. It references the of the manager of each employee in the table, establishing a relationship between the employees and their managers. This foreign key allows you to easily query the table to find out who a specific employee's manager is, or to find out which employees report to a specific manager.

It is possible for a table to have multiple foreign keys that reference primary keys in different tables. For instance, the table could have additional foreign keys for the of the department where an employee works and the of the employee's location.

SQL Question 6: Calculate the Average Number of Retweets

For a social media platform like Twitter, one critical metric might be an understanding of how often tweets are retweeted. As part of your data analytics team, we want you to determine the average number of retweets for each user in our "Twitter" database. To simplify this, you can assume you're working with a month's data.

Sample data:

Example Input:

tweet_id	user_id	tweet_date	retweets
101	1	06/01/2022 10:00:00	5
102	1	06/05/2022 14:24:00	3
103	2	06/07/2022 18:00:00	12
104	3	06/12/2022 12:30:00	4
105	2	06/27/2022 19:45:00	8

Example Output:

user_id	avg_retweets
1	4.00
2	10.00
3	4.00

Answer:

This query calculates the average number of retweets for each user by grouping on the column and then applying the AVG function to the retweets column. Grouping by ensures that the average calculation is performed separately for each Twitter user. The result is a table that shows the average number of retweets for each user.

To practice a very similar question try this interactive Twitter Histogram of Tweets Question which is similar for requiring analysis of user tweets or this Twitter Tweets' Rolling Averages Question which is similar for involving tweet data aggregation per user.

SQL Question 7: What is the difference between a correlated subquery and non-correlated subquery?

A non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query. On the other hand, a correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query.

Here is an example of a non-correlated sub-query:

The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.

Here is an example of a correlated sub-query:

This query selects the and total sales of all Twitter customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().

Non-correlated sub-queries are typically faster, as they only have to be executed once, wheareas correlated sub-queries are slower since they have to be re-evaluated for each row of the outer query.

SQL Question 8: Evaluate the Click-Through Rates for Different Advertisement Campaigns on Twitter

Twitter uses advertisement campaigns to generate revenue and engage users. Assume that you are a data analyst at Twitter, and your task is to calculate the click-through rates for different advertisement campaigns. Specifically you need to find the , and for each campaign where is calculated as the total divided by the total multiplied by 100.

Here are the sample tables:

Example Input:

campaign_id	name
1	Campaign A
2	Campaign B
3	Campaign C

Example Input:

ad_id	campaign_id	clicks	number_of_impressions
101	1	450	2300
102	2	120	1900
103	3	200	2400
104	1	230	1800
105	3	190	2600

Answer:

This query first joins and on . It then groups the result set by and , and for each group, it calculates the click-through-rate by summing the and dividing it by the sum of , and multiplies the result by 100 to get the rate in percentage. This result gives us the click-through rates for each campaign.

To solve a similar problem about calculating rates, solve this TikTok SQL Interview Question within DataLemur's interactive coding environment:

SQL Question 9: Find The Most Active User On Twitter

As a data analyst at Twitter, we would like you to find out which user has the most tweets in every month. For this exercise, assume that a table took the following form:

Example Input:

tweet_id	user_id	tweet_date	content
10981	123	06/08/2022 00:00:00	Hello World
26502	322	06/10/2022 00:00:00	Good Morning
72393	123	06/18/2022 00:00:00	This is a tweet
99532	987	07/26/2022 00:00:00	Loving the weather
76417	322	07/05/2022 00:00:00	Happy Independence Day

Answer:

Here is a PostgreSQL query to solve the problem:

This will output a table where each row represents a month and displays who tweeted the most during that month along with the count of tweets by that user. If there is a tie in the maximum count of tweets by multiple users, all of those users will be returned by this query.

SQL Question 10: What are the different types of database indexes?

A database index is a data structure that improves the speed of data retrieval operations on a database table.

There are few different types of indexes that can be used in a database:

Primary index: a unique identifier is used to access the row directly.
Unique index: used to enforce the uniqueness of the indexed columns in a table.
Composite index: created on multiple columns of a table, is used to speed up the search process for multiple columns
Clustered index: determines the physical order of the data in a table

For a concrete example, say you had a table of Twitter customer payments with the following columns:

Here's what a clustered index on the column would look like:

A clustered index on the column would determine the physical order of the records in the table based on the . This means that the records with the earliest values would be stored together physically in the table, followed by records with later payment_date values. This speeds up queries that filter or sort the data based on the , as the records are already physically sorted in the table. For example, if you want to retrieve all the payments made in the month of June, the database can use the clustered index to quickly locate and retrieve the desired records without having to search through the entire table.

SQL Question 11: Filtering Twitter Usernames

Given a table named that contains data of all Twitter users, find all usernames that start with the letter 'a'.

Example Input:

user_id	username	signup_date	followers
1	apple	2015-06-26	30420
2	ball	2016-07-12	50000
3	antic	2019-09-02	21000
4	arc	2018-03-14	35000
5	zebra	2019-11-30	20000

Example Output:

username
apple
antic
arc

Answer:

This retrieves all usernames from the table where the username starts with 'a'. The SQL keyword is used to filter records, and the '%' after 'a' is a wildcard that matches any sequence of characters. So 'a%' will match any string that starts with 'a'.

Twitter SQL Interview Tips

The key to acing a Twitter SQL interview is to practice, practice, and then practice some more! In addition to solving the earlier Twitter SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Google, Microsoft and Silicon Valley startups.

Each SQL question has hints to guide you, step-by-step solutions and best of all, there's an online SQL code editor so you can easily right in the browser your SQL query answer and have it graded.

To prep for the Twitter SQL interview it is also a great idea to practice interview questions from other tech companies like:

However, if your SQL foundations are weak, forget about diving straight into solving questions – refresh your SQL knowledge with this free SQL tutorial.

This tutorial covers things like CASE/WHEN/ELSE statements and functions like SUM()/COUNT()/AVG() – both of these show up routinely in Twitter SQL interviews.

Explore exciting career opportunities at Twitter and be part of shaping the future of social media!

Twitter Data Science Interview Tips

What Do Twitter Data Science Interviews Cover?

Beyond writing SQL queries, the other topics covered in the Twitter Data Science Interview include:

AB Testing Questions
Coding Questions in Python or R
Product-Sense and Business Case Questions
Machine Learning Questions
Behavioral Interview Questions

Twitter Data Scientist

How To Prepare for Twitter Data Science Interviews?

To prepare for the Twitter Data Science interview have a strong understanding of the company's culture and values – this will be important for acing the behavioral interview. For the technical Data Science interviews, get ready by reading Ace the Data Science Interview. The book's got:

201 Interview Questions from Facebook, Google, & Amazon
A Crash Course on Python, SQL & ML
Amazing Reviews (1000+ reviews, 4.5-star rating)

11 Twitter SQL Interview Questions (Updated 2025)

11 Twitter SQL Interview Questions

SQL Question 1: Histogram of Tweets

Table:

Example Input:

Example Output:

Answer:

SQL Question 2: User Session Activity

Table:

Example Input:

Example Output:

Answer:

SQL Question 3: In SQL, are NULLs treated the same as zero's and blank spaces?

SQL Question 4: Analyze User Tweet Activity

Example Input:

Example Input:

Example Output:

Answer:

SQL Question 5: Why are foreign key's important in databases?

:

SQL Question 6: Calculate the Average Number of Retweets

Example Input:

Example Output:

Answer:

SQL Question 7: What is the difference between a correlated subquery and non-correlated subquery?

SQL Question 8: Evaluate the Click-Through Rates for Different Advertisement Campaigns on Twitter

Example Input:

Example Input:

Answer:

SQL Question 9: Find The Most Active User On Twitter

Example Input:

Answer:

SQL Question 10: What are the different types of database indexes?

SQL Question 11: Filtering Twitter Usernames

Example Input:

Example Output:

Answer:

Twitter SQL Interview Tips

Twitter Data Science Interview Tips

What Do Twitter Data Science Interviews Cover?

How To Prepare for Twitter Data Science Interviews?

Career Resources

Support

Interview Questions