9 NetEase SQL Interview Questions - How Many Can You Solve?

(Ex-Facebook & Best-Selling Data Science Author)

Updated on

March 24, 2025

At NetEase, analysts write SQL queries to analyze user behavior data from NetEases's 250+ games. They also transitioned from MySQL to TiDB to store player and company data. Because of this, NetEase often tests SQL coding questions in interviews for Data Science and Data Engineering positions.

In case you're stressed about an upcoming SQL Assessment, we've collected 9 NetEase SQL interview questions to practice, which are similar to commonly asked questions at NetEase – able to answer them all?

9 NetEase SQL Interview Questions

SQL Question 1: Compute Average User Rating for Each Game by Month

NetEase, being a major provider of online PC and mobile games, collects user reviews for its various games. For review analysis, one typical requirement might be to compute the average user ratings for each game by month.

Here is your task: You are given a table where each row is a review record of a particular game () submitted by a user (). The table has the following schema:

Example Input

review_id	user_id	submit_date	game_id	stars
6171	123	06/01/2022 00:00:00	GameA	4
7802	456	06/02/2022 00:00:00	GameB	5
5293	789	07/01/2022 00:00:00	GameA	3
6352	123	07/02/2022 00:00:00	GameB	4
4517	456	07/03/2022 00:00:00	GameA	2

Write a SQL query that computes the average rating () for each game () by month (). Assume that is a timestamp, and we only care about the year and month (not the day). The output should be a table with columns for the month, game, and average rating, and should be sorted by month and then game.

Answer:

The above SQL query groups the reviews by both the month of and . Within each group, it calculates the average to represent the average rating of that game in that month.

Example Output

mth	game	avg_stars
06/01/2022 00:00:00	GameA	4.00
06/01/2022 00:00:00	GameB	5.00
07/01/2022 00:00:00	GameA	2.50
07/01/2022 00:00:00	GameB	4.00

For more window function practice, solve this Uber SQL Interview Question within DataLemur's online SQL code editor:

SQL Question 2: Analyzing Game Performance Based on User Reviews

NetEase, a leading provider of online services including games, would want to understand the performance of its games based on user reviews. Could you design a database model and SQL query to find out the average user review ratings of each game for each month?

Assume that we have following tables:

Example Input:

game_id	game_name	launch_date
101	Fantasy World	01/01/2021 00:00:00
102	Warrior's Quest	06/01/2022 00:00:00
103	Modern Strategist	06/01/2022 00:00:00

Example Input:

review_id	user_id	submit_date	game_id	stars
201	123	06/08/2022 00:00:00	101	4
202	265	06/10/2022 00:00:00	102	4
203	362	06/18/2022 00:00:00	101	3
204	192	07/26/2022 00:00:00	102	3
205	981	07/05/2022 00:00:00	103	5

We need to provide the average review ratings of each game for each month.

Answer:

The SQL block above first joins the table with table based on . Then it groups the records based on the submission month of review and game name, and calculates the average number of stars for each group. The result is ordered by the month in a descending order.

SQL Question 3: How do and differ when it comes to ranking rows in a result set?

While both and are used to rank rows, the key difference is in how they deal with ties.

RANK(): When there's a tie, leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to the first of these rows, a rank of 3 to the 2nd row in the tie, and a rank of 4 to the the 3rd tie.

DENSE_RANK(): For ties, does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.

Suppose we had data on how many deals different salespeople at NetEase:

To rank these salespeople, we could execute the following query:

The result of this query would be:

name	deals_closed	rank	dense_rank
Akash	50	1	1
Brittany	50	2	1
Carlos	40	3	2
Dave	40	4	3
Eve	30	5	3
Farhad	10	6	4

As you can see, the function assigns a rank of 1 to the first row (Akash), a rank of 2 to the second row (Brittany), and so on. In contrast, the , gives both Akash and Brittany a rank of 1, since they are tied, then moves on to rank 2 for Carlos.

NetEase SQL Interview Questions

SQL Question 4: Filter Customer Records by Purchase Behavior and Account Status

NetEase, a major internet company based in China, manages a customer records database. They want to filter down this database to select specific customers based on their purchase behavior and account status.

They have two tables, and . The table contains details about the customer including their current account status. The table records the individual transactions made by customers.

Example Input:

customer_id	account_status	name	email
1001	active	John Doe	john.doe@example.com
1002	inactive	Jane Smith	jane.smith@example.com
1003	active	Mary Johnson	mary.johnson@example.com
1004	block	Charles Williams	charles.williams@example.com

Example Input:

transaction_id	customer_id	product_id	amount	purchased_date
1456	1001	301	200	01/01/2023
2387	1002	402	300	01/01/2023
3829	1001	301	200	01/02/2023
4129	1003	303	150	01/02/2023

Question: Write a SQL query to select the and of customers who have an 'active' status in the table and have spent more than $200 on product 301 in the table.

Answer:

This query first generates a subquery that calculates the total amount spent by each customer on product 301. It then joins this subquery with the table on the and filters the results based on the account status and the total amount spent. This gives us the customers with an active account status who have spent more than $200 on the specified product.

SQL Question 5: Can you describe the difference between a correlated and a non-correlated sub-query?

A correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query. On the other hand, a non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query.

An example correlated sub-query:

This correlated subquery retrieves the names and salaries of NetEase employees who make more than the average salary for their department. The subquery references the department column in the main query's FROM clause (e1.department) and uses it to filter the rows of the subquery's FROM clause (e2.department).

An example non-correlated sub-query:

This non-correlated subquery retrieves the names and salaries of NetEase employees who make more than the average salary for the Data Analytics department (which honestly should be very few people since Data Analysts are awesome).

Anways, the subquery is independent of the main query and can be considered a standalone query. Its output (the average salary for the Data Analytics department) is then used in the main query to filter the rows of the table.

SQL Question 6: NetEase Click-Through-Rate Analysis

Given a database with information on the ad impressions, click-throughs, and conversions of NetEase digital products, calculate and present the average click-through rates (CTR) and conversion rates per product.

We have two tables, and . The table has the , , and for each impression. In the table, we document each click's , , the it occurred, whether it to a sale (1 indicating yes, 0 indicating no), and the .

Example Input:

impression_id	time	product_id
1	2022/07/29 09:20:00	001
2	2022/07/30 11:42:00	002
3	2022/07/31 13:17:00	003
4	2022/08/01 15:15:00	001
5	2022/08/02 08:12:00	002

Example Input:

click_id	impression_id	time	converted	product_id
1	1	2022/07/29 09:22:00	1	001
2	2	2022/07/30 11:50:00	0	002
3	3	2022/07/31 13:20:00	1	003
4	4	2022/08/01 15:18:00	0	001
5	5	2022/08/02 08:15:00	1	002

Answer:

This query first computes the CTR by dividing the number of distinct clicks by the number of distinct impressions for each product. Then it calculates the conversion rates by dividing successful conversions by the number of distinct clicks. The final step is joining the CTR and conversion rate data together based on .

To practice a related SQL problem on DataLemur's free interactive coding environment, attempt this Meta SQL interview question:

SQL Question 7: What are the similarities and differences between a clustered index and non-clustered index?

A clustered index is an index where the order of the rows in the database corresponds to the order of the rows in the index. Because of this, a table can only have one clustered index, but it can have multiple non-clustered indexes.

The main difference between the two is that the database tries to keep the data in the same order as the corresponding keys in the clustered index. This can improve the performance of most query operations, as it provides a linear-access path to the data stored in the database.

SQL Question 8: Calculate Monthly Product Revenue and Rounded Average Sales Price

NetEase is an international technology company and runs a large e-commerce website. Each product they sell has both a sale price and a cost price, and they can sell multiple products in a day. NetEase would like you to write a SQL query to calculate the total revenue, the average sale price and the average profit margin for each product per month. The profit margin should be calculated as , rounded to four decimal places.

The sale_price and cost_price columns are stored in a two decimal number format. The dates are stored in the format "DD/MM/YYYY".

Example Input:

sales_id	product_id	sale_date	sale_price	cost_price
621	50001	06/08/2022	100.00	75.00
422	69852	06/10/2022	200.00	150.00
953	50001	06/18/2022	105.00	74.00
364	69852	07/26/2022	195.00	145.00
517	69852	07/05/2022	198.00	152.00

Example Output:

mth	product	revenue	avg_sale_price	avg_profit_margin
6	50001	205.00	102.50	0.2667
6	69852	200.00	200.00	0.2500
7	69852	393.00	196.50	0.2598

Answer:

This SQL query uses aggregate functions and to calculate total revenue and average sale price. The function is used to limit the decimal places for average sale price and profit margin. The function extracts the month from each sale_date. The result is grouped by both month and product_id.

To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating sales related metrics, or this Wayfair Y-on-Y Growth Rate Question which is similar for requiring use of window functions and time-based grouping.

SQL Question 9: User Engagement Levels Over Time

As part of an analysis on user engagement, NetEase, a leading game developing company, wants to calculate the monthly average time users spend playing each game. You have been provided with a 'gaming_sessions' table where each row represents a distinct gaming session, which includes the user_id, game_id, session_start and session_end dates/times.

To assist NetEase, can you write a SQL query which gives the game_id along with the monthly average time spent (in minutes) playing each game?

Example Input:

session_id	user_id	game_id	session_start	session_end
101	123	3001	01/08/2022 10:00:00	01/08/2022 11:30:00
102	380	5002	02/01/2022 20:00:00	02/01/2022 20:45:00
103	477	3001	03/02/2022 16:30:00	03/02/2022 18:00:00
104	290	3001	29/01/2022 09:00:00	29/01/2022 11:00:00
105	123	5002	30/03/2022 14:00:00	30/03/2022 15:10:00

Answer:

This query calculates the time difference between session_end and session_start in minutes by using AGE() function and EXTRACT(EPOCH FROM..). Then it groups by game_id and month number (extracted from session_start) to get the average session length.

NetEase SQL Interview Tips

The best way to prepare for a NetEase SQL interview is to practice, practice, practice. In addition to solving the above NetEase SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Google, Uber, and Microsoft.

Each problem on DataLemur has hints to guide you, step-by-step solutions and crucially, there's an online SQL code editor so you can right online code up your SQL query answer and have it checked.

To prep for the NetEase SQL interview you can also be a great idea to practice SQL questions from other tech companies like:

However, if your SQL skills are weak, don't worry about diving straight into solving questions – refresh your SQL knowledge with this free SQL for Data Analytics course.

This tutorial covers SQL concepts such as using LIKE and handling timestamps – both of these pop up often in NetEase SQL assessments.

NetEase Data Science Interview Tips

What Do NetEase Data Science Interviews Cover?

For the NetEase Data Science Interview, in addition to SQL query questions, the other types of questions to prepare for are:

Statistics and Probability Questions
Python or R Coding Questions
Data Case Study Questions
Machine Learning Questions
Behavioral Based Interview Questions

NetEase Data Scientist

How To Prepare for NetEase Data Science Interviews?

To prepare for NetEase Data Science interviews read the book Ace the Data Science Interview because it's got:

201 interview questions sourced from tech companies like Google & Microsoft
a crash course on SQL, Product-Sense & ML
over 900+ 5-star reviews on Amazon

9 NetEase SQL Interview Questions - How Many Can You Solve?

9 NetEase SQL Interview Questions

SQL Question 1: Compute Average User Rating for Each Game by Month

Example Input

Answer:

Example Output

SQL Question 2: Analyzing Game Performance Based on User Reviews

Example Input:

Example Input:

Answer:

SQL Question 3: How do and differ when it comes to ranking rows in a result set?

SQL Question 4: Filter Customer Records by Purchase Behavior and Account Status

Example Input:

Example Input:

Answer:

SQL Question 5: Can you describe the difference between a correlated and a non-correlated sub-query?

SQL Question 6: NetEase Click-Through-Rate Analysis

Example Input:

Example Input:

Answer:

SQL Question 7: What are the similarities and differences between a clustered index and non-clustered index?

SQL Question 8: Calculate Monthly Product Revenue and Rounded Average Sales Price

Example Input:

Example Output:

Answer:

SQL Question 9: User Engagement Levels Over Time

Example Input:

Answer:

NetEase SQL Interview Tips

NetEase Data Science Interview Tips

What Do NetEase Data Science Interviews Cover?

How To Prepare for NetEase Data Science Interviews?

Career Resources

Support

Interview Questions