At Chindata, SQL is crucial for extracting and manipulating customer data for insights and designing databases to house server-related information. That's why Chindata frequently asks SQL query questions during interviews for Data Analyst, Data Science, and BI jobs.
As such, to help you ace the Chindata SQL interview, here are 11 Chindata SQL interview questions – how many can you solve?
Identify the top 5 power users in "Chindata" company based on the total number of transactions made within the last 30 days. Power Users are defined as users who have made at least 10 transactions within the last 30 days.
transaction_id | user_id | transaction_date | amount |
---|---|---|---|
1024 | 567 | 10/01/2022 00:00:00 | 50.00 |
2048 | 123 | 10/02/2022 00:00:00 | 75.00 |
4096 | 456 | 10/05/2022 00:00:00 | 100.00 |
8192 | 123 | 10/10/2022 00:00:00 | 150.00 |
16384 | 567 | 10/20/2022 00:00:00 | 50.00 |
user_id | total_transactions | total_amount |
---|---|---|
123 | 2 | 225.00 |
567 | 2 | 100.00 |
456 | 1 | 100.00 |
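One possible answer, sketched in PostgreSQL syntax (the table name `transactions` is assumed from the sample columns above):

```sql
SELECT
  user_id,
  COUNT(*) AS total_transactions,
  SUM(amount) AS total_amount
FROM transactions
-- only look at the last 30 days
WHERE transaction_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
-- power users = 10+ transactions in the window
HAVING COUNT(*) >= 10
ORDER BY total_transactions DESC, total_amount DESC
LIMIT 5;
```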
This query identifies and returns the top 5 power users within the last 30 days. It does this by grouping transactions by user_id, counting the number of transactions, and summing the total transaction amount per user, considering only transactions made within the past 30 days. The query then filters to keep only users with ten or more transactions in that window. The result is ordered first by the total number of transactions in descending order, then by the total transaction amount in descending order (to break ties when two users have an equal number of transactions).
To work on another SQL customer analytics question where you can solve it right in the browser and have your SQL code instantly graded, try this Walmart SQL Interview Question:
You're given a table representing the usage data of Chindata's servers. Each row of the table represents a single day of usage for a given server; the table's columns are shown in the sample below.
With this data, write a SQL query to find the average CPU usage and RAM usage of each server in the previous seven days for each given day.
server_id | usage_date | cpu_usage | ram_usage |
---|---|---|---|
S101 | 2022-05-01 | 45 | 63 |
S102 | 2022-05-01 | 46 | 60 |
S101 | 2022-05-02 | 35 | 50 |
S102 | 2022-05-02 | 55 | 75 |
S101 | 2022-05-03 | 40 | 65 |
S102 | 2022-05-03 | 50 | 70 |
S101 | 2022-05-04 | 55 | 75 |
S102 | 2022-05-04 | 45 | 68 |
server_id | usage_date | avg_cpu_usage | avg_ram_usage |
---|---|---|---|
S101 | 2022-05-02 | 40 | 56.5 |
S102 | 2022-05-02 | 50.5 | 67.5 |
S101 | 2022-05-03 | 40 | 59.3 |
S102 | 2022-05-03 | 50.3 | 68.3 |
S101 | 2022-05-04 | 43.7 | 63.7 |
S102 | 2022-05-04 | 49 | 68.3 |
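A possible solution using a window function (PostgreSQL syntax; the table name `server_usage` is assumed from the sample columns):

```sql
SELECT
  server_id,
  usage_date,
  -- rolling average over the current day and the 6 days before it
  AVG(cpu_usage) OVER (
    PARTITION BY server_id
    ORDER BY usage_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS avg_cpu_usage,
  AVG(ram_usage) OVER (
    PARTITION BY server_id
    ORDER BY usage_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS avg_ram_usage
FROM server_usage;
```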
This query calculates a rolling 7-day average of CPU and RAM usage. A window function is required because the running average depends on the current row and the 6 preceding rows, which is expressed with the ROWS BETWEEN 6 PRECEDING AND CURRENT ROW clause inside the AVG() OVER() function. The PARTITION BY server_id clause restarts the average for each server, so one server's usage doesn't spill over into another's.
p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur.
Think of SQL constraints like the rules of a game. Just like a game needs rules to keep things fair and fun, a database needs constraints to keep things organized and accurate.
There are several types of SQL constraints like:
- NOT NULL: This constraint is like a bouncer at a nightclub - it won't let anything NULL through the door.
- UNIQUE: This constraint is like a VIP list - only special, one-of-a-kind values get in.
- PRIMARY KEY: This constraint is like an elected official - it's made up of NOT NULL and UNIQUE values and helps identify each row in the table.
- FOREIGN KEY: This constraint is like a diplomatic ambassador - it helps establish relationships between tables.
- CHECK: This constraint is like a referee - it makes sure everything follows the rules.
- DEFAULT: This constraint is like a backup plan - it provides a default value if no other value is specified.
So, whether you're playing a game or organizing a database, constraints are an important part of the process!
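Here's how those constraints might look in practice, on a hypothetical table (the table and column names are made up for illustration):

```sql
CREATE TABLE servers (
  server_id INT PRIMARY KEY,                     -- NOT NULL + UNIQUE, identifies each row
  hostname  VARCHAR(50) NOT NULL UNIQUE,         -- must exist, no duplicates allowed
  dc_id     INT REFERENCES data_centers(dc_id),  -- FOREIGN KEY linking to another table
  cpu_cores INT CHECK (cpu_cores > 0),           -- values must follow the rule
  status    VARCHAR(10) DEFAULT 'active'         -- backup value if none is specified
);
```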
Chindata is a company that provides cloud storage services to its clients. As their cloud storage utilization grows, they need to understand usage patterns to offer better, more efficient services. The company wants a SQL query that determines the average usage and the maximum usage each month for every client, so they can plan future upgrades and assign resources efficiently.
They maintain two main tables: one holding information about their clients, and one recording each client's storage usage.
client_id | name | client_since |
---|---|---|
1 | Client A | 01/01/2022 |
2 | Client B | 03/25/2022 |
3 | Client C | 05/15/2022 |
usage_id | client_id | usage_date | usage_gb |
---|---|---|---|
1 | 1 | 06/01/2022 | 200 |
2 | 1 | 06/15/2022 | 215 |
3 | 1 | 07/01/2022 | 220 |
4 | 2 | 06/25/2022 | 150 |
5 | 2 | 07/01/2022 | 155 |
6 | 3 | 05/15/2022 | 100 |
7 | 3 | 06/01/2022 | 120 |
8 | 3 | 07/01/2022 | 130 |
9 | 3 | 07/15/2022 | 140 |
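A possible answer in PostgreSQL (the table names `clients` and `usage` are assumed from the sample columns):

```sql
SELECT
  c.name,
  DATE_TRUNC('month', u.usage_date) AS month,
  AVG(u.usage_gb) AS avg_usage_gb,
  MAX(u.usage_gb) AS max_usage_gb
FROM clients c
JOIN usage u
  ON c.client_id = u.client_id
-- one row per client per month
GROUP BY c.name, DATE_TRUNC('month', u.usage_date)
ORDER BY c.name, month;
```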
This query joins the two tables on the client_id column and extracts the month from usage_date. It then groups the data by client and month and computes the average and maximum usage for each group. The DATE_TRUNC function truncates the date to the month, while the AVG and MAX functions calculate the average and maximum usage respectively.
Normalization involves breaking up your tables into smaller, more specialized ones and using primary and foreign keys to define relationships between them. Not only does this make your database more flexible and scalable, it also makes it easier to maintain. Plus, normalization helps to keep your data accurate by reducing the chance of inconsistencies and errors.
The only downside now is that your queries will involve more joins, which are slow AF and often a DB performance bottleneck.
As the data analyst of Chindata, your task is to track the performance of company servers. Given a table of server logs, where a row is recorded every minute, write a SQL query to find the average CPU usage by server, sorted from the highest average usage to the lowest.
log_id | server_id | log_time | cpu_usage |
---|---|---|---|
101 | A1 | 01/10/2022 00:01:00 | 5% |
102 | A1 | 01/10/2022 00:02:00 | 10% |
103 | A2 | 01/10/2022 00:02:00 | 15% |
104 | B1 | 01/10/2022 00:03:00 | 20% |
105 | A1 | 01/10/2022 00:03:00 | 10% |
106 | B1 | 01/10/2022 00:04:00 | 15% |
Note: CPU usage is a percentage string.
server_id | avg_cpu_usage |
---|---|
B1 | 17.5% |
A2 | 15.0% |
A1 | 8.33% |
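One way to handle the percentage strings in PostgreSQL (the table name `server_logs` is assumed from the sample columns; the exact TO_CHAR format string is a sketch):

```sql
SELECT
  server_id,
  -- strip the '%', cast to a number, average, then format back to a percentage string
  TO_CHAR(
    AVG(CAST(REPLACE(cpu_usage, '%', '') AS DECIMAL)),
    'FM999990.99'
  ) || '%' AS avg_cpu_usage
FROM server_logs
GROUP BY server_id
ORDER BY AVG(CAST(REPLACE(cpu_usage, '%', '') AS DECIMAL)) DESC;
```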
The AVG function is used here to find the average CPU usage per server. The REPLACE and CAST functions are used to transform the cpu_usage field into a numeric field because it is originally a string field containing a percentage sign (%). The TO_CHAR function is used to convert the resulting average back into a percentage string format in the result set. The server_id results are then ordered in descending order according to their average CPU usage.
To practice a very similar question try this interactive Amazon Server Utilization Time Question which is similar for calculating server time usage or this Google Odd and Even Measurements Question which is similar for working with different types of server data.
A foreign key is a column or set of columns in a table that references the primary key of another table. It is used to link the two tables together, and to ensure that the data in the foreign key column is valid.
The FOREIGN KEY constraint helps ensure the integrity of the data in the database by preventing the insertion of rows in the foreign key table that do not have corresponding entries in the primary key table. It also enforces the relationship between the two tables, and can be used to ensure that data is not deleted from the primary key table while there are still references to it in the foreign key table.
For example, if you have a table of Chindata customers and an orders table, the customer_id column in the orders table could be a foreign key that references the id column (which is the primary key) in the Chindata customers table.
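That customers/orders relationship could be declared like this (column types are illustrative):

```sql
CREATE TABLE customers (
  id   INT PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE orders (
  order_id    INT PRIMARY KEY,
  -- foreign key: every order must point at an existing customer
  customer_id INT REFERENCES customers(id)
);
```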
Recently, Chindata has launched several digital ads and they want to analyze their click-through rates (CTR). The CTR is calculated as the number of users who clicked on the ad divided by the number of total impressions (the number of times the ad was shown). Assume you have access to two tables.
The first one, named impressions, contains a row for every time an ad was shown to a user. The second one, named clicks, contains a row for every time a user clicked on an ad.
impression_id | user_id | time | ad_id |
---|---|---|---|
6751 | 823 | 06/08/2021 00:00:00 | 403 |
8752 | 432 | 06/10/2021 00:00:00 | 403 |
4543 | 678 | 06/18/2021 00:00:00 | 612 |
5872 | 995 | 07/26/2021 00:00:00 | 612 |
6437 | 900 | 07/05/2021 00:00:00 | 522 |
click_id | user_id | time | ad_id |
---|---|---|---|
4932 | 823 | 06/08/2021 00:00:01 | 403 |
2658 | 543 | 06/10/2021 00:00:01 | 612 |
7532 | 995 | 07/26/2021 00:00:01 | 612 |
4317 | 797 | 07/05/2021 00:00:01 | 403 |
3214 | 721 | 07/10/2021 00:00:01 | 522 |
Using the above two tables, write a SQL query that provides the CTR per advertisement.
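A possible answer (PostgreSQL syntax), using the impressions and clicks tables:

```sql
SELECT
  i.ad_id,
  -- unique clicking users divided by unique impressions
  COUNT(DISTINCT c.user_id)::DECIMAL
    / COUNT(DISTINCT i.impression_id) AS ctr
FROM impressions i
LEFT JOIN clicks c
  ON  i.ad_id = c.ad_id
  AND i.user_id = c.user_id
GROUP BY i.ad_id;
```

The LEFT JOIN keeps every impression even when no matching click exists, so ads that were shown but never clicked still get a CTR (of zero).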
This query uses a LEFT JOIN to combine the impressions and clicks tables on ad_id and user_id. Since we're interested in the CTR, we count the number of unique users who clicked on each ad and divide it by the total number of unique impressions. The result is the CTR for each ad.
To practice a related SQL interview question on DataLemur's free online SQL coding environment, solve this Meta SQL interview question:
Assume Chindata is a data center providing hosting services. At the end of each day, the system automatically records the amount of data transferred by each hosted server. Your goal is to write a SQL query that calculates the total amount of data transferred for each server on a monthly basis.
The table has the following columns:
usage_id | server_id | date | data_gb |
---|---|---|---|
101 | 1 | 06/01/2022 00:00:00 | 5 |
102 | 1 | 06/02/2022 00:00:00 | 7 |
103 | 2 | 06/02/2022 00:00:00 | 2 |
104 | 1 | 07/01/2022 00:00:00 | 10 |
105 | 2 | 07/01/2022 00:00:00 | 15 |
month | server_id | total_data_gb |
---|---|---|
6 | 1 | 12 |
6 | 2 | 2 |
7 | 1 | 10 |
7 | 2 | 15 |
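A possible answer in PostgreSQL (the table name `server_usage` is assumed from the sample columns):

```sql
SELECT
  DATE_PART('month', date) AS month,
  server_id,
  SUM(data_gb) AS total_data_gb
FROM server_usage
GROUP BY DATE_PART('month', date), server_id
ORDER BY month, server_id;
```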
The query above first extracts the month part from the date using the DATE_PART function. It then groups the rows by month and server_id and calculates the sum of data_gb for each group. The output is ordered by month and server_id for readability.
There are several normal forms that define the rules for normalizing a database:
A database is in first normal form (1NF) if it meets the following criteria:

- Each column contains only atomic (indivisible) values
- There are no repeating groups or arrays of values in a single column
- Each row is unique, typically enforced with a primary key
A database is in second normal form (2NF) if it meets the following criteria:

- It is already in 1NF
- Every non-key column depends on the entire primary key, not just part of it (no partial dependencies)
Said another way, to achieve 2NF, besides following all the rules from 1NF all the columns in a given table should be dependent only on that table's primary key.
A database is in third normal form (3NF) if it meets the following criteria:

- It is already in 2NF
- Non-key columns depend only on the primary key, not on other non-key columns (no transitive dependencies)
A transitive dependency means that a piece of data in one column is derived from another column. For example, it wouldn't make sense to keep both a "user's age" column and a "user's birthdate" column (because age can be derived from birthdate).
While there are also 4th and 5th normal forms, they're too pedantic and hence out-of-scope for the Chindata SQL interview.
Chindata has several data centers distributed across various regions. Each data center consists of several server racks. Each rack contains several servers, and the power consumption of each varies. Chindata wants to calculate the average power consumption per rack for each region.
Below are the data structures for the data center, rack, and server tables.
dc_id | region |
---|---|
101 | East |
102 | West |
103 | North |
104 | South |
rack_id | dc_id |
---|---|
201 | 101 |
202 | 101 |
203 | 102 |
204 | 103 |
205 | 104 |
server_id | rack_id | power_consumption |
---|---|---|
301 | 201 | 150 |
302 | 201 | 200 |
303 | 202 | 250 |
304 | 203 | 350 |
305 | 204 | 120 |
306 | 205 | 360 |
region | avg_power_per_rack |
---|---|
East | 200 |
West | 350 |
North | 120 |
South | 360 |
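A possible answer in PostgreSQL (the table names `data_centers`, `racks`, and `servers` are assumed from the sample columns):

```sql
SELECT
  d.region,
  -- average server power consumption per region, rounded to a whole number
  ROUND(AVG(s.power_consumption)) AS avg_power_per_rack
FROM data_centers d
JOIN racks r
  ON d.dc_id = r.dc_id
JOIN servers s
  ON r.rack_id = s.rack_id
GROUP BY d.region;
```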
This query joins the data center, rack, and server tables on their respective foreign keys. It then groups the result by region and computes the average power consumption across each region's servers using PostgreSQL's AVG and ROUND functions. The ROUND function rounds the result to the nearest whole number.
To practice a very similar question try this interactive Amazon Server Utilization Time Question which is similar for calculating server related metrics or this Microsoft Supercloud Customer Question which is similar for dealing with server racks.
The best way to prepare for a Chindata SQL interview is to practice, practice, practice. Beyond just solving the above Chindata SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Netflix, Google, and Amazon.
Each SQL question has hints to guide you, step-by-step solutions, and best of all, there's an interactive coding environment so you can run your query right in the browser and have it executed instantly.
To prep for the Chindata SQL interview it is also wise to practice interview questions from other tech companies like:
But if your SQL skills are weak, don't worry about going right into solving questions – go learn SQL with this DataLemur SQL Tutorial.
This tutorial covers SQL concepts such as window functions and CASE/WHEN statements – both of these come up routinely in SQL job interviews at Chindata.
In addition to SQL query questions, the other question categories to prepare for the Chindata Data Science Interview are:
The best way to prepare for Chindata Data Science interviews is by reading Ace the Data Science Interview. The book's got: