11 Twitter SQL Interview Questions (Updated 2024)

Updated on

August 11, 2024

Twitter employees use SQL to analyze tweet patterns, identifying trending topics and user engagement metrics to enhance the user experience. They also extract relevant data on user demographics and interests to help business clients optimize their ad campaigns, which is why Twitter includes SQL questions during interviews for Data Science, Data Engineering, and Data Analytics jobs.

So, to help you study, here's 11 Twitter SQL interview questions – can you solve them?

Twitter SQL Interview Questions

11 Twitter SQL Interview Questions

SQL Question 1: Histogram of Tweets

This is the same question as problem #6 in the SQL Chapter of Ace the Data Science Interview!

Assume you're given a table Twitter tweet data, write a query to obtain a histogram of tweets posted per user in 2022. Output the tweet count per user as the bucket and the number of Twitter users who fall into that bucket.

In other words, group the users by the number of tweets they posted in 2022 and count the number of users in each group.

Table:

Column NameType
tweet_idinteger
user_idinteger
msgstring
tweet_datetimestamp

Example Input:

tweet_iduser_idmsgtweet_date
214252111Am considering taking Tesla private at $420. Funding secured.12/30/2021 00:00:00
739252111Despite the constant negative press covfefe01/01/2022 00:00:00
846402111Following @NickSinghTech on Twitter changed my life!02/14/2022 00:00:00
241425254If the salary is so competitive why won’t you tell me what it is?03/01/2022 00:00:00
231574148I no longer have a manager. I can't be managed03/23/2022 00:00:00

Example Output:

tweet_bucketusers_num
12
21

Answer:


To see an additional answer go to DataLemur's FREE interactive coding environment!

Twitter SQL Interview Question

SQL Question 2: User Session Activity

This is the same question as problem #24 in the SQL Chapter of Ace the Data Science Interview!

Assume you are given the table containing Twitter user session activities.

Write a query that ranks users according to their total session durations (in minutes) in descending order for each session type between the start date (2022-01-01) and the end date (2022-02-01).

Output the user ID, session type, and the ranking of the total session duration.

Table:

Column NameType
session_idinteger
user_idinteger
session_typestring ("like", "reply", "retweet")
durationinteger (in minutes)
start_datetimestamp

Example Input:

session_iduser_idsession_typedurationstart_date
6368111like312/25/2021 12:00:00
1742111retweet601/02/2022 12:00:00
8464222reply801/16/2022 12:00:00
7153111retweet501/28/2022 12:00:00
3252333reply1501/10/2022 12:00:00

Example Output:

user_idsession_typeranking
333reply1
222reply2
111retweet1

Answer:


To see how to solve this question using a subquery go check out this Twitter SQL Interview Question on Data Lemur!

Twitter SQL Interview Question

SQL Question 3: In SQL, are NULLs treated the same as zero's and blank spaces?

A NULL value represents a missing or unknown value. It is different from a zero or a blank space, which are actual values. It is important to handle NULL values properly in SQL because they can cause unexpected results if not treated correctly.

Twitter SQL Interview Questions

SQL Question 4: Analyze User Tweet Activity

You are given the following tables:

table that contains information of the users:

Example Input:
user_idusername
1@jack
2@elonmusk
3@billgates

table that contains information of the tweets:

Example Input:
tweet_iduser_idcontenttimestamp
1011"Hello Twitter!"2022-08-01 10:00:00
1022"Launch successful!"2022-08-01 11:00:00
1031"Acquiring new company..."2022-08-02 09:30:00
1043"Reading a great book."2022-08-02 15:00:00
1052"Mars, here we come!"2022-08-03 13:00:00

Write a SQL query to find the total number of tweets and the rank of total tweets for each user as of a specific timestamp (for example, '2022-08-02 12:00:00'). Order results by the total number of tweets in descending order, and if two users have the same number of tweets, order them by username in ascending order.

Example Output:
usernamenum_tweetsrank
@jack21
@elonmusk12
@billgates13

Answer:


In this query, we first join the table and table on and then filter for tweets with timestamp less than or equal to the given date and time. The result is then grouped by . Within the select clause, we count the number of tweets per user and calculate the rank of each user based on the number of tweets they made. If the number of tweets is the same, the users are ordered by their username in ascending order. Finally, we order the overall result by the number of tweets in descending order, and by username in ascending order if the number of tweets is the same.

Pro Tip: Window functions are a frequent SQL interview topic, so practice every window function problem on DataLemur

DataLemur SQL Questions

SQL Question 5: Why are foreign key's important in databases?

A foreign key is a field in a database table that serves as a reference to the primary key of another table, allowing for the creation of a relationship between the two tables.

For a concrete example, let's inspect employee data from Twitter's HR database:

:

employee_idfirst_namelast_namemanager_id
1AubreyGraham3
2MarshalMathers3
3DwayneCarter4
4ShawnCarter

In this table, is the primary key, and is used to uniquely identify each row.

could be a foreign key. It references the of the manager of each employee in the table, establishing a relationship between the employees and their managers. This foreign key allows you to easily query the table to find out who a specific employee's manager is, or to find out which employees report to a specific manager.

It is possible for a table to have multiple foreign keys that reference primary keys in different tables. For instance, the table could have additional foreign keys for the of the department where an employee works and the of the employee's location.

SQL Question 6: Calculate the Average Number of Retweets

For a social media platform like Twitter, one critical metric might be an understanding of how often tweets are retweeted. As part of your data analytics team, we want you to determine the average number of retweets for each user in our "Twitter" database. To simplify this, you can assume you're working with a month's data.

Sample data:

Example Input:
tweet_iduser_idtweet_dateretweets
101106/01/2022 10:00:005
102106/05/2022 14:24:003
103206/07/2022 18:00:0012
104306/12/2022 12:30:004
105206/27/2022 19:45:008
Example Output:
user_idavg_retweets
14.00
210.00
34.00

Answer:


This query calculates the average number of retweets for each user by grouping on the column and then applying the AVG function to the retweets column. Grouping by ensures that the average calculation is performed separately for each Twitter user. The result is a table that shows the average number of retweets for each user.

To practice a very similar question try this interactive Twitter Histogram of Tweets Question which is similar for requiring analysis of user tweets or this Twitter Tweets' Rolling Averages Question which is similar for involving tweet data aggregation per user.

SQL Question 7: What is the difference between a correlated subquery and non-correlated subquery?

A non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query. On the other hand, a correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query.

Here is an example of a non-correlated sub-query:


The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.

Here is an example of a correlated sub-query:


This query selects the and total sales of all Twitter customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().

Non-correlated sub-queries are typically faster, as they only have to be executed once, wheareas correlated sub-queries are slower since they have to be re-evaluated for each row of the outer query.

SQL Question 8: Evaluate the Click-Through Rates for Different Advertisement Campaigns on Twitter

Twitter uses advertisement campaigns to generate revenue and engage users. Assume that you are a data analyst at Twitter, and your task is to calculate the click-through rates for different advertisement campaigns. Specifically you need to find the , and for each campaign where is calculated as the total divided by the total multiplied by 100.

Here are the sample tables:

Example Input:
campaign_idname
1Campaign A
2Campaign B
3Campaign C
Example Input:
ad_idcampaign_idclicksnumber_of_impressions
10114502300
10221201900
10332002400
10412301800
10531902600

Answer:


This query first joins and on . It then groups the result set by and , and for each group, it calculates the click-through-rate by summing the and dividing it by the sum of , and multiplies the result by 100 to get the rate in percentage. This result gives us the click-through rates for each campaign.

To solve a similar problem about calculating rates, solve this TikTok SQL Interview Question within DataLemur's interactive coding environment:

Signup Activation Rate SQL Question

SQL Question 9: Find The Most Active User On Twitter

As a data analyst at Twitter, we would like you to find out which user has the most tweets in every month. For this exercise, assume that a table took the following form:

Example Input:
tweet_iduser_idtweet_datecontent
1098112306/08/2022 00:00:00Hello World
2650232206/10/2022 00:00:00Good Morning
7239312306/18/2022 00:00:00This is a tweet
9953298707/26/2022 00:00:00Loving the weather
7641732207/05/2022 00:00:00Happy Independence Day

Answer:

Here is a PostgreSQL query to solve the problem:


This will output a table where each row represents a month and displays who tweeted the most during that month along with the count of tweets by that user. If there is a tie in the maximum count of tweets by multiple users, all of those users will be returned by this query.

SQL Question 10: What are the different types of database indexes?

A database index is a data structure that improves the speed of data retrieval operations on a database table.

There are few different types of indexes that can be used in a database:

  • Primary index: a unique identifier is used to access the row directly.
  • Unique index: used to enforce the uniqueness of the indexed columns in a table.
  • Composite index: created on multiple columns of a table, is used to speed up the search process for multiple columns
  • Clustered index: determines the physical order of the data in a table

For a concrete example, say you had a table of Twitter customer payments with the following columns:

Here's what a clustered index on the column would look like:


A clustered index on the column would determine the physical order of the records in the table based on the . This means that the records with the earliest values would be stored together physically in the table, followed by records with later payment_date values. This speeds up queries that filter or sort the data based on the , as the records are already physically sorted in the table. For example, if you want to retrieve all the payments made in the month of June, the database can use the clustered index to quickly locate and retrieve the desired records without having to search through the entire table.

SQL Question 11: Filtering Twitter Usernames

Given a table named that contains data of all Twitter users, find all usernames that start with the letter 'a'.

Example Input:
user_idusernamesignup_datefollowers
1apple2015-06-2630420
2ball2016-07-1250000
3antic2019-09-0221000
4arc2018-03-1435000
5zebra2019-11-3020000
Example Output:
username
apple
antic
arc

Answer:


This retrieves all usernames from the table where the username starts with 'a'. The SQL keyword is used to filter records, and the '%' after 'a' is a wildcard that matches any sequence of characters. So 'a%' will match any string that starts with 'a'.

Twitter SQL Interview Tips

The key to acing a Twitter SQL interview is to practice, practice, and then practice some more! In addition to solving the earlier Twitter SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Google, Microsoft and Silicon Valley startups.

DataLemur Questions

Each SQL question has hints to guide you, step-by-step solutions and best of all, there's an online SQL code editor so you can easily right in the browser your SQL query answer and have it graded.

To prep for the Twitter SQL interview it is also a great idea to practice interview questions from other tech companies like:

However, if your SQL foundations are weak, forget about diving straight into solving questions – refresh your SQL knowledge with this free SQL tutorial.

DataLemur SQL tutorial

This tutorial covers things like CASE/WHEN/ELSE statements and functions like SUM()/COUNT()/AVG() – both of these show up routinely in Twitter SQL interviews.

Explore exciting career opportunities at Twitter and be part of shaping the future of social media!

Twitter Data Science Interview Tips

What Do Twitter Data Science Interviews Cover?

Beyond writing SQL queries, the other topics covered in the Twitter Data Science Interview include:

Twitter Data Scientist

How To Prepare for Twitter Data Science Interviews?

To prepare for the Twitter Data Science interview have a strong understanding of the company's culture and values – this will be important for acing the behavioral interview. For the technical Data Science interviews, get ready by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from Facebook, Google, & Amazon
  • A Crash Course on Python, SQL & ML
  • Amazing Reviews (1000+ reviews, 4.5-star rating)

Ace the Data Science Interview by Nick Singh Kevin Huo

© 2025 DataLemur, Inc

Career Resources

Free 9-Day Data Interview Crash CourseFree SQL Tutorial for Data AnalyticsSQL Interview Cheat Sheet PDFUltimate SQL Interview GuideAce the Data Job Hunt Video CourseAce the Data Science InterviewBest Books for Data AnalystsSQL Squid Game