At Baidu, SQL is used day-to-day to analyze user behavior patterns to fine-tune search algorithms, and to manage the large datasets that power Baidu's advertising unit. Unsurprisingly, this is why Baidu LOVES to ask SQL query questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.
So, to help you practice for the Baidu SQL interview, this blog covers 9 Baidu SQL interview questions – can you solve them?
Baidu, being a web services company, greatly values its users. Power users, or those who frequently engage in the company's various activities, are of particular importance. For this assessment, assume that a "power user" is defined as a user who performs a significant number of searches monthly.
You are provided with the following table:
Please write a SQL query to find the top 5 users who performed the highest number of searches in the past month, returning both their `user_id` and the number of searches they performed.
search_id | user_id | search_date |
---|---|---|
1001 | 2345 | 2023-12-10 |
1002 | 9876 | 2023-12-15 |
1003 | 2345 | 2023-12-16 |
1004 | 1111 | 2022-06-01 |
1005 | 9876 | 2023-12-17 |
1006 | 2345 | 2023-12-18 |
1007 | 1111 | 2023-12-19 |
1008 | 1111 | 2023-12-20 |
(Note: this is a simplified table; other columns that are irrelevant to this question have been omitted.)
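A sketch of such a query in PostgreSQL (the table name `searches` is an assumption; adjust it to match the actual schema):

```sql
SELECT
  user_id,
  COUNT(DISTINCT search_id) AS number_of_searches
FROM searches
WHERE search_date >= CURRENT_DATE - INTERVAL '1 month'
GROUP BY user_id
ORDER BY number_of_searches DESC
LIMIT 5;
```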
This query calculates the number of searches each user performed in the past month by counting the unique `search_id`s for each `user_id`. The result is ordered by that count in descending order and limited to 5 rows, so the output is the top 5 users along with the number of searches each performed in the last month.
To solve a related customer analytics question on DataLemur's free online SQL coding environment, try this Microsoft Azure Cloud SQL Interview Question:
As a data analyst at Baidu, you have been tasked with performing analysis on users' search history data. You have been asked to find out the most popular search terms per user for every month.
The database has a table with the following structure:
search_id | user_id | search_date | search_term |
---|---|---|---|
101 | 123 | 04/02/2022 | machine learning |
102 | 234 | 04/03/2022 | AI |
103 | 123 | 04/04/2022 | machine learning |
104 | 234 | 04/05/2022 | big data |
105 | 456 | 04/06/2022 | blockchain |
106 | 123 | 04/07/2022 | machine learning |
107 | 123 | 05/02/2022 | data science |
108 | 234 | 05/03/2022 | AI |
109 | 123 | 05/04/2022 | data science |
In the table, `search_id` is a unique identifier of the search, `user_id` identifies the user who performed it, `search_date` is the date the search was performed, and `search_term` is the term that was searched.
Write a SQL query to find out the most popular search terms per user for each month. If there are multiple search terms with the same frequency, choose the most recent one.
month | user_id | popular_search_term |
---|---|---|
4 | 123 | machine learning |
4 | 234 | AI |
5 | 123 | data science |
The following PostgreSQL query can solve this problem:
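One possible version (a sketch, again assuming the table is named `searches`):

```sql
WITH term_counts AS (
  -- frequency and most recent date of each term, per user per month
  SELECT
    EXTRACT(MONTH FROM search_date) AS month,
    user_id,
    search_term,
    COUNT(*) AS frequency,
    MAX(search_date) AS last_searched
  FROM searches
  GROUP BY 1, 2, 3
)
SELECT
  month,
  user_id,
  search_term AS popular_search_term
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY month, user_id
      ORDER BY frequency DESC, last_searched DESC  -- recency breaks ties
    ) AS rn
  FROM term_counts
) ranked
WHERE rn = 1;
```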
This query first calculates the frequency of each search term per user per month in a CTE, along with the date each term was last searched. It then selects the most popular (most frequent) search term per user per month. When multiple search terms tie on frequency, the most recent one is selected because the last search date is the tie-breaker in the window function's `ORDER BY` clause.
For more window function practice, try this Uber SQL problem on DataLemur's online SQL coding environment:
To explain the difference between a primary key and foreign key, let's start with some data from Baidu's marketing analytics database which stores data from Google Ads campaigns:
The `ads` table (the name is illustrative):

ad_id | campaign_id | keyword | click_count |
---|---|---|---|
1 | 100 | Baidu pricing | 10 |
2 | 100 | Baidu reviews | 15 |
3 | 101 | Baidu alternatives | 7 |
4 | 101 | buy Baidu | 12 |
In this table, `ad_id` could be a primary key. It is unique for each row in the table and cannot contain null values.
`campaign_id` could be a foreign key. It references the `campaign_id` of the Google Ads campaign that each ad belongs to, establishing a relationship between the ads and their campaigns. This foreign key allows you to easily query the table to find out which ads belong to a specific campaign, or which campaign a specific ad belongs to.
It is also possible for a table to have multiple foreign keys that reference primary keys in different tables. For example, the ads table could have an additional `ad_group_id` foreign key for the ad group that each ad belongs to, and an `account_id` for the Google Ads account that the campaigns belong to.
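As a minimal DDL sketch (all table and column names here are illustrative):

```sql
CREATE TABLE campaigns (
  campaign_id   INT PRIMARY KEY,  -- primary key: unique and non-null
  campaign_name TEXT
);

CREATE TABLE ads (
  ad_id       INT PRIMARY KEY,
  -- foreign key linking each ad to its campaign
  campaign_id INT REFERENCES campaigns (campaign_id),
  keyword     TEXT,
  click_count INT
);
```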
As a data analyst at Baidu, one of your tasks is to understand the search behavior of different groups of users. The system has gathered user information, including their registration dates and regions. Additionally, the system logs all search queries made by these users along with a timestamp. The aim is to find the number of unique users in every region who have made at least one search query each month.
Baidu's user information can be stored in a `users` table and the search activity in a `searches` table:
user_id | registration_date | region |
---|---|---|
4521 | 2019-11-27 | East Asia |
3857 | 2020-06-12 | South Asia |
2953 | 2019-12-15 | Europe |
4189 | 2021-05-22 | North America |
1234 | 2020-02-18 | South America |
user_id | search_date | query |
---|---|---|
4521 | 2022-06-10 | Python |
3857 | 2022-07-01 | Big Data |
3857 | 2022-07-05 | Machine Learning |
2953 | 2022-06-30 | Artificial Intelligence |
1234 | 2022-07-11 | Data Science |
4189 | 2022-08-05 | Java |
4521 | 2022-08-12 | JavaScript |
You can use the following SQL query in PostgreSQL to solve the problem:
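A sketch (assuming the tables are named `users` and `searches` as above):

```sql
SELECT
  DATE_TRUNC('month', s.search_date)::DATE AS month,
  u.region,
  COUNT(DISTINCT s.user_id) AS active_users
FROM searches s
JOIN users u
  ON s.user_id = u.user_id
GROUP BY 1, u.region
ORDER BY 1, u.region;
```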
This query joins the `searches` and `users` tables on `user_id`, then groups by month and region. Within each group, it counts the distinct `user_id`s to determine the monthly active users per region. The `DATE_TRUNC()` function truncates each search date to its month, and the cast to `DATE` converts the resulting timestamp (with time and possibly timezone) to a simple date.
`INTERSECT` combines the results from two or more SELECT statements, but returns only the rows that appear in both result sets.

For example, let's use `INTERSECT` to find all of Baidu's Facebook video ads with more than 10k views that are also being run on YouTube:
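A sketch (the table names `facebook_video_ads` and `youtube_video_ads` and their columns are assumptions for illustration):

```sql
SELECT ad_id
FROM facebook_video_ads
WHERE views > 10000

INTERSECT

SELECT ad_id
FROM youtube_video_ads;
```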
Baidu is a Chinese multinational technology company specializing in Internet-related services and products, and artificial intelligence. One valuable metric for them might be the average number of clicked ads per day. The following tables represent simplified versions of Baidu's `users`, `ads`, and `clicks` tables:
user_id | signup_date | region |
---|---|---|
123 | 01/08/2022 | South China |
265 | 11/12/2021 | North China |
362 | 25/07/2022 | East China |
192 | 10/08/2021 | West China |
981 | 15/01/2022 | Central China |
ad_id | company_id | post_date |
---|---|---|
1001 | 3000 | 05/08/2022 |
2100 | 4500 | 15/11/2021 |
1200 | 5500 | 06/10/2022 |
8900 | 6500 | 18/08/2022 |
6000 | 7500 | 19/07/2022 |
click_id | click_date | user_id | ad_id |
---|---|---|---|
5253 | 26/08/2022 | 123 | 2100 |
7202 | 28/08/2022 | 192 | 8900 |
3946 | 30/08/2022 | 362 | 6000 |
3245 | 28/08/2022 | 265 | 1001 |
9898 | 29/08/2022 | 981 | 2100 |
Using these tables, write a SQL query to find the average number of ads clicked per day for all users.
day | avg_clicks |
---|---|
26/08/2022 | 1.00 |
28/08/2022 | 2.00 |
29/08/2022 | 1.00 |
30/08/2022 | 1.00 |
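A sketch that reproduces the expected output (the table name `clicks` follows the column headers above; note that in this sample data each day's figure is simply that day's click total):

```sql
SELECT
  click_date AS day,
  ROUND(COUNT(click_id)::NUMERIC, 2) AS avg_clicks
FROM clicks
GROUP BY click_date
ORDER BY click_date;
```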
This query groups the clicks by `click_date`, counting the number of clicks each day, and orders the resulting output by click date. With this sample data, each day's average equals that day's total click count.
To practice a very similar question try this interactive Facebook App Click-through Rate (CTR) Question which is similar for calculating click rates or this Facebook Active User Retention Question which is similar for handling user data.
Database views are virtual tables based on the result of a SQL statement. You query them just like regular tables, but views let you present simplified versions of tables or hide sensitive data from certain users.

In PostgreSQL, you can create a view with the `CREATE VIEW` command. Here's an example:
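A sketch using a hypothetical `users` table, exposing only its non-sensitive columns:

```sql
CREATE VIEW public_users AS
SELECT
  user_id,
  region,
  signup_date
FROM users;
```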
You are a data analyst at Baidu, a popular search engine company in China. The development team needs to check if some search categories are slowing down. Your task is to find the average duration of search queries for each search category over the last month.
Please consider the following "searches" table for this analysis:
search_id | user_id | search_date | search_category | search_duration |
---|---|---|---|---|
101 | 456 | 08/10/2022 10:15:00 | Tech | 0.15 |
102 | 789 | 08/12/2022 11:25:00 | Fashion | 0.18 |
103 | 456 | 08/20/2022 13:33:00 | Travel | 0.2 |
104 | 321 | 08/21/2022 16:00:00 | Tech | 0.12 |
105 | 123 | 08/23/2022 17:45:00 | Fashion | 0.16 |
Your task is to produce a report of the average search duration by category for the last month. The output should include the month (in YYYY-MM format), the search category, and the average search duration rounded to two decimal places.
mth | category | avg_duration |
---|---|---|
2022-08 | Tech | 0.14 |
2022-08 | Fashion | 0.17 |
2022-08 | Travel | 0.20 |
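A PostgreSQL sketch (using the `searches` table named above):

```sql
SELECT
  TO_CHAR(search_date, 'YYYY-MM') AS mth,
  search_category AS category,
  ROUND(AVG(search_duration)::NUMERIC, 2) AS avg_duration
FROM searches
-- previous full calendar month relative to today
WHERE search_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
  AND search_date <  DATE_TRUNC('month', CURRENT_DATE)
GROUP BY 1, 2;
```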
This SQL query computes the average search duration for each search category, rounded to two decimal places. It covers the previous calendar month relative to the current date: for instance, if the current date is September 15, it considers the period from August 1 through August 31. The results are grouped by month and search category.
As a data analyst at Baidu, you've been given a dataset of user website visit logs. The user visits are marked with a timestamp. Your task is to generate a report that shows the number of daily unique visitors for the past week.
The `visit_logs` table has the following structure:
log_id | user_id | visit_time |
---|---|---|
1 | 123 | 2022-07-29 08:30:00 |
2 | 123 | 2022-07-29 10:20:00 |
3 | 265 | 2022-07-28 15:42:00 |
4 | 362 | 2022-07-28 22:12:00 |
5 | 192 | 2022-07-27 16:45:00 |
6 | 192 | 2022-07-27 18:55:00 |
7 | 981 | 2022-07-26 13:22:00 |
8 | 981 | 2022-07-31 08:44:00 |
9 | 525 | 2022-07-25 05:40:00 |
10 | 525 | 2022-07-25 06:10:00 |
11 | 264 | 2022-07-24 14:42:00 |
12 | 123 | 2022-07-24 15:22:00 |
We would like to produce the following result:
visit_date | unique_visitors |
---|---|
2022-07-24 | 2 |
2022-07-25 | 1 |
2022-07-26 | 1 |
2022-07-27 | 1 |
2022-07-28 | 2 |
2022-07-29 | 1 |
2022-07-30 | 0 |
2022-07-31 | 1 |
A PostgreSQL query that can be used to answer this question is:
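One approach (a sketch; the table name `visit_logs` is an assumption). Generating a full calendar of the past week and LEFT JOINing the logs onto it ensures that days with no visits still appear with a count of 0:

```sql
SELECT
  d::DATE AS visit_date,
  COUNT(DISTINCT v.user_id) AS unique_visitors
FROM GENERATE_SERIES(
       CURRENT_DATE - INTERVAL '7 days',
       CURRENT_DATE,
       INTERVAL '1 day'
     ) AS d
LEFT JOIN visit_logs v
  ON v.visit_time::DATE = d::DATE
GROUP BY d
ORDER BY visit_date;
```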
This query groups the logs by the date of visit (ignoring the time part) for the past week and counts the distinct user IDs on each date, giving the number of unique visitors per day. Note that days with no visits, such as 2022-07-30 in the expected output, should still appear with a count of 0, which requires joining the logs onto a full calendar of dates. The result is ordered by visit date in ascending order.
The best way to prepare for a Baidu SQL interview is to practice, practice, practice. In addition to solving the above Baidu SQL interview questions, you should also solve the 200+ tricky SQL questions on DataLemur, which come from companies like FAANG and tech startups.
Each DataLemur SQL question has multiple hints, step-by-step solutions, and most importantly, an interactive coding environment so you can code up your SQL query right in the browser and have it graded.
To prep for the Baidu SQL interview it is also wise to solve SQL problems from other tech companies like:
If your SQL coding skills are weak, don't jump straight into solving questions – first learn SQL with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers topics including math functions like ROUND()/CEIL() and logical operators like AND/OR/NOT – both of which pop up often in SQL interviews at Baidu.
Besides SQL interview questions, the other question categories covered in the Baidu Data Science Interview are:
To prepare for Baidu Data Science interviews read the book Ace the Data Science Interview because it's got: