At JOYY, SQL is used all the damn time for analyzing user engagement with the platform and optimizing live-stream video delivery through targeted data queries and manipulation. Because of this, JOYY almost always evaluates jobseekers on SQL problems in interviews for Data Science and Data Engineering positions.
To help you ace the JOYY SQL interview, this blog covers 8 JOYY SQL interview questions – able to solve them?
JOYY Inc is a social media platform company. They are interested in understanding the daily active users' engagement with their live streaming function. They define engagement as the total duration that a user spends watching live streams in a day.
They request you to write a SQL query to calculate the daily average engagement duration per user for the last seven days.
In the table, the column represents the time that a user starts watching a live stream, and the column represents the time a user stops watching a live stream.
engagement_id | user_id | start_time | end_time |
---|---|---|---|
1 | 123 | 06/12/2022 18:30:00 | 06/12/2022 18:45:00 |
2 | 123 | 06/12/2022 19:10:00 | 06/12/2022 20:00:00 |
3 | 456 | 06/12/2022 14:50:00 | 06/12/2022 15:20:00 |
4 | 123 | 06/13/2022 08:00:00 | 06/13/2022 08:30:00 |
5 | 456 | 06/13/2022 16:00:00 | 06/13/2022 16:40:00 |
6 | 789 | 06/14/2022 20:00:00 | 06/14/2022 21:50:00 |
date | avg_duration |
---|---|
06/12/2022 | 65 |
06/13/2022 | 35 |
06/14/2022 | 110 |
The query calculates the duration of each user engagement by subtracting the start_time from the end_time. To do this, the query uses to get the duration in seconds, then divide it by 60 to convert the duration to minutes. The query averages this duration per day across all user engagements that happened within the last seven days. It does this using .
For more window function practice, solve this Uber SQL Interview Question within DataLemur's interactive coding environment:
As a data analyst at JOYY, you've been asked to analyze the user activities from a large user base data. Write a SQL query that filters the user activity database such that only records of users located in 'USA' and have 'active' status are included. You should further filter these records to those users who have spent more than 100 hours on the platform and have registered in the year 2020 or later.
user_id | country | status | hours_spent | registration_date |
---|---|---|---|---|
101 | USA | active | 150 | 2020-11-07 |
102 | Canada | active | 200 | 2020-02-03 |
103 | USA | inactive | 50 | 2021-01-25 |
104 | USA | active | 110 | 2021-05-02 |
105 | Germany | active | 120 | 2020-03-05 |
106 | USA | active | 90 | 2021-09-01 |
This query first filters the data based on the 'country' and 'status' column. It then filters the remaining data by comparing the 'hours_spent' column to 100 and the year part of the 'registration_date' to 2020.
The EXTRACT function is used in PostgreSQL to get the year part of the 'registration_date'. It will return records with registration_date in the year 2020 or later.
A is a field in a table that references the of another table. It creates a link between the two tables and ensures that the data in the field is valid.
Say for example you had sales analytics data from JOYY's CRM (customer-relationship management) tool.
The FOREIGN KEY constraint ensures that the data in the field of the "opportunities" table is valid, and prevents the insertion of rows in the table that do not have corresponding entries in the table. It also helps to enforce the relationship between the two tables and can be used to ensure that data is not deleted from the accounts table if there are still references to it in the opportunities` table.
JOYY, a global video-based social media platform, has recently run a global ad campaign with a goal to increase user engagement on their platform. As a Data Analyst, your task is to analyze the click-through-rate (CTR) of these ads for a specified period. JOYY's ad data is split into two tables, and .
The table contains a record for each time an ad was displayed to a user. It has the following columns:
The table contains a record for each time a displayed ad was clicked by the user. It has the following columns:
Example Input:
display_id | user_id | ad_id | display_date |
---|---|---|---|
9812 | 748 | 6001 | 06/01/2022 00:00:00 |
3765 | 365 | 4422 | 06/01/2022 00:00:00 |
6129 | 245 | 6001 | 06/02/2022 00:00:00 |
9257 | 907 | 4422 | 06/02/2022 00:00:00 |
4217 | 689 | 6001 | 06/02/2022 00:00:00 |
Example Input:
click_id | display_id | click_date |
---|---|---|
1123 | 3765 | 06/01/2022 00:00:00 |
2342 | 9257 | 06/02/2022 00:00:00 |
3452 | 4217 | 06/03/2022 00:00:00 |
This query first joins the and tables on the column, then groups the results by the column. The result is a table that shows the number of times each ad was displayed, the number of clicks each ad received, and the calculated click-through-rate for each ad.
To solve a similar problem about calculating rates, solve this TikTok SQL Interview Question within DataLemur's interactive coding environment:
Both and are used to combine the results of two or more SELECT statements into a single result set.
However, only includes one instance of a duplicate, whereas includes duplicates.
JOYY is a global social media platform that provides live streaming services. Bearing this in mind, write an SQL query that fetches the customer records whose names end with 'SONG'. As a Data Analyst at JOYY, you are required to dig into customer data and provide useful insights.
customer_id | first_name | last_name | sign_up_date | country | |
---|---|---|---|---|---|
1 | John | Song | johnsong@joyy.com | 01/05/2022 | USA |
2 | Dave | Lee | davelee@joyy.com | 03/10/2022 | Canada |
3 | Tim | Song | timsong@joyy.com | 08/07/2022 | UK |
4 | Emma | Wood | emmawood@joyy.com | 05/02/2022 | Australia |
5 | Peter | Johnson | peterjohnson@joyy.com | 16/09/2022 | USA |
customer_id | first_name | last_name |
---|---|---|
1 | John | Song |
3 | Tim | Song |
5 | Peter | Johnson |
You can use the SQL keyword for this problem to find records that match the string ending with 'SONG'. The keyword allows you to use wildcards as part of your search string. In SQL, the percent sign can be used as a wildcard representing zero, one, or multiple characters. The SQL query should look like this:
In this SQL statement, matches any string ending with 'SONG'. Thus, it fetches all customers whose last names end with 'SONG'.
{#Question-7}
is used to combine the output of multiple statements into one big result!
For a concrete example, say you were a Data Analyst supporting the Sales Analytics team at JOYY, and data on potential sales leads lived in both Salesforce CRM and Hubspot. To write a query to analyze leads created after 2023 started, across both CRMs, you could use in the following way:
"
JOYY, being a global video-based social media platform, focuses heavily on user engagement. As a Data Analyst at JOYY, you are tasked with finding the number of unique active users each month. Active users are considered as those who have posted any content (videos, comments etc.) during the month.
We will be using two tables: , and .
user_id | signup_date | country |
---|---|---|
110 | 2019-01-15 | Canada |
215 | 2019-06-27 | USA |
365 | 2019-09-10 | Australia |
527 | 2020-03-04 | Canada |
642 | 2021-01-20 | USA |
post_id | user_id | post_date |
---|---|---|
4151 | 110 | 2022-06-03 |
9823 | 215 | 2022-06-15 |
6296 | 365 | 2022-07-11 |
8153 | 527 | 2022-07-23 |
7192 | 642 | 2022-07-31 |
You need to write the query in PostgreSQL.
In the answer, we are using PostgreSQL's function to extract year and month from column of table. Then, we are simply grouping by year and month and counting distinct - this will give the number of unique active users in each month. The clause is used to keep the result set ordered by Year and Month.
The best way to prepare for a JOYY SQL interview is to practice, practice, practice. In addition to solving the earlier JOYY SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Facebook, Google, and VC-backed startups.
Each problem on DataLemur has hints to guide you, full answers and crucially, there is an online SQL coding environment so you can easily right in the browser your SQL query and have it graded.
To prep for the JOYY SQL interview you can also be helpful to solve SQL problems from other tech companies like:
In case your SQL query skills are weak, don't worry about jumping right into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.
This tutorial covers SQL concepts such as filtering on multiple conditions using AND/OR/NOT and rank window functions – both of these show up often in JOYY SQL interviews.
In addition to SQL query questions, the other types of questions covered in the JOYY Data Science Interview are:
To prepare for JOYY Data Science interviews read the book Ace the Data Science Interview because it's got: