logo

11 iQIYI SQL Interview Questions (Updated 2024)

Updated on

March 2, 2024

At iQIYI, SQL is used frequently for querying and manipulating vast video streaming data, and generating insights for personalized content recommendations. So, it shouldn't surprise you that iQIYI often tests SQL questions in interviews for Data Science and Data Engineering positions.

As such, to help you ace the iQIYI SQL interview, we'll cover 11 iQIYI SQL interview questions – can you solve them?

11 iQIYI SQL Interview Questions

SQL Question 1: Identify the Top 10 Users with the Most Video Views

iQIYI is a popular streaming platform in China and hosts a wide variety of TV shows and Movies. As a valued member of the iQIYI data team, you've been tasked with identifying our 'Whale' users. These are the users who watch an extreme number of videos, making them highly valuable to the business.

Your task is to write a SQL query that identifies the top 10 users based on the number of videos watched in the last 30 days.

For this task, let's consider we have a table where each row corresponds to a single video view by a user.

Example Input:

activity_iduser_idview_datevideo_id
128110106/25/2022 00:00:0050098
596120106/26/2022 00:00:0050254
352710106/30/2022 00:00:0050098
873130207/01/2022 00:00:0050012
472110107/02/2022 00:00:0050254
597120107/02/2022 00:00:0050075
983130207/05/2022 00:00:0050098
292120107/08/2022 00:00:0050254
792210107/10/2022 00:00:0050075
618130207/12/2022 00:00:0050098

Answer:


This SQL query groups the table by , and counts the number of video views for each user. The query also filters the data to only consider views from the past 30 days. The result is then sorted in descending order to list users with the highest video views first and limited to top 10 users.

To practice a similar VIP customer analysis question on DataLemur's free online SQL coding environment, try this Microsoft Azure Cloud SQL Interview Question: Microsoft SQL Interview Question: Super Cloud Customer

SQL Question 2: Analyze Viewer Behavior Using Window Functions

iQIYI, often referred to as the Netflix of China, is a platform that provides an extensive library of TV shows and movies. As a data analyst at iQIYI, you are given a task to analyze viewer behavior.

You are given access to the and tables:

Example Input:
user_idshow_idtimestamp
10112022-07-25 19:30:50
10222022-07-26 20:40:10
10132022-07-26 21:45:15
10312022-07-26 22:00:00
10122022-07-27 23:10:25
10232022-07-28 00:20:35
Example Input:
show_idtitle
1The Big Bang Theory
2Game of Thrones
3Friends

Write a SQL query to find out which show has the highest number of active users on a given day. Define an 'active user' as someone who has viewed at least once on that day. Provide the date, show_id and show title.

Answer:


The first part of the query calculates the number of active users for each show and each day. The second part finds out which show has the most active users for each day using the RANK window function. Each row in the result represents the show with the most active users for a specific day.

To practice a related window function SQL problem on DataLemur's free online SQL coding environment, solve this Amazon SQL Interview Question: Amazon SQL Interview Question

SQL Question 3: Can you describe the meaning of a constraint in SQL in layman's terms?

Constraints are just rules your DBMS has to follow when updating/inserting/deleting data.

Say you had a table of iQIYI products and a table of iQIYI customers. Here's some example SQL constraints you'd use:

NOT NULL: This constraint could be used to ensure that certain columns in the product and customer tables, such as the product name and customer email address, cannot contain NULL values.

UNIQUE: This constraint could be used to ensure that the product IDs and customer IDs are unique. This would prevent duplicate entries in the respective tables.

PRIMARY KEY: This constraint could be used to combine the NOT NULL and UNIQUE constraints to create a primary key for each table. The product ID or customer ID could serve as the primary key.

FOREIGN KEY: This constraint could be used to establish relationships between the iQIYI product and customer tables. For example, you could use a foreign key to link the customer ID in the customer table to the customer ID in the product table to track which products each customer has purchased.

CHECK: This constraint could be used to ensure that certain data meets specific conditions. For example, you could use a CHECK constraint to ensure that iQIYI product prices are always positive numbers.

DEFAULT: This constraint could be used to specify default values for certain columns. For example, you could use a DEFAULT constraint to set the customer registration date to the current date if no value is provided when a new customer is added to the database.

iQIYI SQL Interview Questions

SQL Question 4: Online Video Viewing Patterns

iQIYI, a leading online video platform in China, is highly interested in understanding its users' video viewing patterns. The company wants to analyze the most-watched genres, peak viewing times, and the average duration viewers watch videos.

You have access to two tables: and

table records every viewing event by users:

viewing_iduser_idvideo_idstart_timeviewing_duration
218312345987606/21/2022 18:00:0020
245611452523406/22/2022 20:00:0030
326512873987606/23/2022 21:00:0045
228515698176206/24/2022 22:00:0015
125767859523406/26/2022 18:00:0035

table holds video information:

video_idtitlegenreduration
9876InceptionSci-Fi120
5234FriendsComedy30
1762Frozen IIAnimation90

In this exercise:

  1. Identify the most watched genre.
  2. Identify the day of the week when users watch the most videos.
  3. Calculate the average viewing duration per user.

Note: All time references are in the 'YYYY/MM/DD HH24:MI:SS' string format.

Answer:

This problem can be solved with the following PostgreSQL queries.


Each of these queries targets a specific aspect of user viewing patterns - most watched genre, peak viewing times, and average viewing duration. By using appropriate joins, aggregations, and order conditions, we can extract meaningful insights from the data for iQIYI.

SQL Question 5: What's the difference between and ?

The clause is used to filter rows from the result set of a , , or statement. It allows you to specify a condition that must be met for a row to be included in the result set.

The clause is used to filter groups created by the clause. It is similar to the clause, but it is used to specify conditions on the groups created by the clause, rather than on the individual rows of the table.

Say you were working on a social media analytics project for iQIYI.

Here is an example of a SQL query that you might write which uses both the WHERE and HAVING clauses:


This query retrieves the total impressions and average conversions for each platform in the table, the date of the campaign is in January 2023. The rows are grouped by platform and the clause filters the groups to include only those with more than 5000 impressions and an average conversion rate above 0.2.

SQL Question 6: Filtering Active Users on iQIYI

iQIYI is a popular streaming platform in China, often thought of as the Netflix of China. Let's assume you're working with their database of user activity, which contains one record per view of a show by a user.

Your task is to write a SQL query to obtain a list of users who have watched at least 3 different shows over the past month, and whose last watched show was a specified show "Endless Love". The columns in the database are , and .

Example Input:
user_idshow_idview_date
18002022-11-01
18102022-11-05
18502022-11-07
28002022-11-04
28502022-11-06
38002022-11-08
38102022-11-10
38502022-11-15
38002022-11-18
Example Output:
user_id
1
3

Answer:

Here's one way to write that query using PostgreSQL syntax:


In this query, the subquery creates a table where each row is numbered within each in reverse chronological order. The desired output is then selected from this subquery by filtering for the rows where , , and meet the specified conditions.

In detail, the conditions are that for each user, the last watched show () must be "Endless Love" (), and the user must have watched at least 3 different shows over the past month (). You can replace 800 with the show_id of "Endless Love" if it's different.

SQL Question 7: What do primary keys do?

The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.

For example, say you had stored some Facebook ad campaign data that iQIYI ran:


The CampaignID column is used to uniquely identify each row in the table, and the constraint ensures that there are no duplicate CampaignID values. This helps to maintain the accuracy of the data by preventing duplicate rows. The primary key is also an important part of the table because it enables you to easily identify and reference specific campaigns in your Facebook Ad data. You can use it to join to other tables in the database, such as a table with data on the results of the campaigns.

SQL Question 8: Click-Through and Conversion Rates for iQIYI

As a part of the marketing team at iQIYI, a leading online entertainment service company, you're tasked to monitor and improve the effectiveness of the company's online ads and product listings. From the 'clicks' and 'cart' tables provided, calculate the click-through rate (CTR) for each ad, and the conversion rate of each product listing. In addition, include in your results only those ads and listings that have received at least 100 clicks/views.

Example Input:
click_iduser_idtimestampad_id
100112306/08/2022 00:00:0050001
100226506/09/2022 00:05:0050001
100336206/08/2022 00:00:0069852
100419207/26/2022 00:08:0069852
100598107/05/2022 00:10:0069852
Example Input:
entry_iduser_idtimestampproduct_id
900112306/08/2022 00:01:0050001
900226506/10/2022 00:06:0050001
900336206/18/2022 00:01:0069852
900419207/26/2022 00:30:0069852
900598107/05/2022 00:35:0069852

Answer:


This query starts by creating two subqueries - 'ad_clicks' and 'add_to_cart' - to count the total number of clicks per ad and total number of items added to the cart per product id respectively. Then, it joins these subqueries on the ad_id (from 'ad_clicks') and product_id (from 'add_to_cart'). Finally, it calculates the click-through rate (ctr) and the conversion rate, including in the output only those ads and listings that have received at least 100 clicks/views.

To practice a related SQL problem on DataLemur's free online SQL code editor, attempt this SQL interview question asked by Facebook: Facebook App CTR SQL Interview question

SQL Question 9: Analyzing Customer Viewing and Subscription Data

You are given two tables, and . The table contains logs for each time iQIYI's customers interact with different shows and the contains the customer's subscription status including their subscription tier and subscription date.

You are tasked to write a SQL query to get the list of customers who have a premium subscription and have viewed the show 'Go Go Squid!' data. Include in your query the customer's id, the show id, the date viewed, the date of subscription to the premium tier, and the subscription tier.

Example Input:
customer_idshow_iddate_viewed
10012002019-07-22
10023002019-07-23
10034002019-07-24
10012002019-07-25
10023002019-07-26
Example Input:
customer_idsubscription_tierdate_subscribed
1001Premium2019-07-20
1002Standard2019-07-21
1003Premium2019-07-22
1004Standard2019-07-23
1005Premium2019-07-24

Answer:


The SQL query does a join on the and tables based on the customer_id. A filter ( clause) is then applied to extract only premium customers who viewed 'Go Go Squid!'. The statement outlines the columns to be returned.

Because join questions come up frequently during SQL interviews, try this interactive Snapchat SQL Interview question using JOINS: Snapchat JOIN SQL interview question

SQL Question 10: What's a correlated sub-query? How does it differ from a non-correlated sub-query?

A correlated sub-query is one that is linked to the outer query and cannot be executed on its own. It uses the outer query to filter or transform data by referencing a column from the outer query, while the outer query uses the results of the inner query. On the other hand, a non-correlated sub-query is independent of the outer query and can be executed on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query.

Correlated sub-queries are slower to execute, as they have to be re-evaluated for each row of the outer query, while non-correlated sub-queries are faster, as they only have to be executed once.

SQL Question 11: Calculate User AVG Engagement Score and User Behavior Category

iQIYI is a massive video platform with thousands of users engaging with content on a daily basis. Suppose you have the following database table, 'user_engagement', which contains activity data which includes "user_id", "session_id", "time_in_seconds" (time spent on each session), and "actions" (total actions taken in each session, such as likes, shares, comments, etc).

User engagement score is calculated as the sum of time_in_seconds and the square root of the actions for each session, all divided by the total number of sessions. Additionally, a user is categorized based on their avg_score, using the floor function for rounding: If the avg_score >=9, they are categorized as 'High'; avg_score between 5 to 8, categorized as 'Medium' else 'Low'.

Your task is to write a query that calculates the average engagement score for each user, round off to 2 decimal points and categorizes them based on the given rules.

Example Input:
user_idsession_idtime_in_secondsactions
1001S0112025
1001S0245030
1001S0324015
2002S0436018
2002S0530024
2002S0618016
3003S071508
Example Output:
user_idavg_scoreuser_category
100110.17High
20028.33Medium
30035.00Medium

Answer:


This query first calculates the score for each session for each user in the subquery, then it calculates the average score for each user and classifies them into categories based on the average score. It uses the PostgreSQL math functions such as SQRT(), FLOOR(), and the arithmetic operations '/', '+', and AVG(), alongside GROUP BY to group results by user_id.

To practice a very similar question try this interactive Snapchat Sending vs. Opening Snaps Question which is similar for analyzing time spent on platform or this Twitter Histogram of Tweets Question which is similar for understanding user activity.

How To Prepare for the iQIYI SQL Interview

The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Besides solving the above iQIYI SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Microsoft, Google, Amazon, and tech startups. DataLemur Questions

Each interview question has multiple hints, full answers and crucially, there is an interactive SQL code editor so you can right in the browser run your SQL query answer and have it executed.

To prep for the iQIYI SQL interview you can also be a great idea to practice interview questions from other tech companies like:

In case your SQL coding skills are weak, don't worry about jumping right into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Analytics.

Interactive SQL tutorial

This tutorial covers things like transforming strings with CONCAT()/LOWER()/TRIM() and aggregate window functions – both of which come up frequently during iQIYI SQL interviews.

iQIYI Data Science Interview Tips

What Do iQIYI Data Science Interviews Cover?

In addition to SQL interview questions, the other types of questions to prepare for the iQIYI Data Science Interview are:

  • Probability & Stats Questions
  • Python or R Programming Questions
  • Analytics and Product-Metrics Questions
  • ML Modelling Questions
  • Behavioral & Resume-Based Questions

iQIYI Data Scientist

How To Prepare for iQIYI Data Science Interviews?

The best way to prepare for iQIYI Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from companies like Microsoft, Google & Amazon
  • A Refresher on Python, SQL & ML
  • Amazing Reviews (900+ 5-star reviews on Amazon)

Ace the Data Science Interview by Nick Singh Kevin Huo