Back to questions

Histogram of Tweets Twitter SQL Interview Question

Histogram of Tweets

Twitter SQL Interview Question

This is the same question as problem #6 in the SQL Chapter of Ace the Data Science Interview!

Assume you're given a table Twitter tweet data, write a query to obtain a histogram of tweets posted per user in 2022. Output the tweet count per user as the bucket and the number of Twitter users who fall into that bucket.

In other words, group the users by the number of tweets they posted in 2022 and count the number of users in each group.

Table:

Column NameType
tweet_idinteger
user_idinteger
msgstring
tweet_datetimestamp

Example Input:

tweet_iduser_idmsgtweet_date
214252111Am considering taking Tesla private at $420. Funding secured.12/30/2021 00:00:00
739252111Despite the constant negative press covfefe01/01/2022 00:00:00
846402111Following @NickSinghTech on Twitter changed my life!02/14/2022 00:00:00
241425254If the salary is so competitive why won’t you tell me what it is?03/01/2022 00:00:00
231574148I no longer have a manager. I can't be managed03/23/2022 00:00:00

Example Output:

tweet_bucketusers_num
12
21

Explanation:

Based on the example output, there are two users who posted only one tweet in 2022, and one user who posted two tweets in 2022. The query groups the users by the number of tweets they posted and displays the number of users in each group.

The dataset you are querying against may have different input & output - this is just an example!

Input

(PostgreSQL)

Output