At DiDi (aka the Chinese version of Uber), SQL is used frequently to analyze traffic data for route optimization and customer data for personalized marketing campaigns. Because of this, DiDi asks SQL problems in interviews for Data Analyst, Data Science, and BI jobs.
As such, to help you prepare for the DiDi SQL interview, this blog covers 11 DiDi SQL interview questions – can you answer them all?
The ride-hailing company DiDi has a database which tracks rides taken by its users. Your task is to identify the "power users". For this exercise, a "power user" is defined as a user who has completed over 20 rides in a month. Write a SQL query to identify these users and calculate the total kilometres and average cost of their rides for every month in 2022.
Your database has two tables: users and rides. The users table has a one-to-many relationship with the rides table (each user can have multiple rides, but each ride belongs to only one user).
Sample data for the users table:

Sample data for the rides table:
This problem can be solved by using a combination of aggregation, filtering, and joining operations on the users and rides tables. Here's one possible way to write the SQL query in PostgreSQL.
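This sketch assumes the rides table carries user_id, ride_date, distance_km, and cost columns (the column names are assumptions, since the sample data isn't reproduced above):

```sql
SELECT
  user_id,
  DATE_TRUNC('month', ride_date) AS ride_month,
  COUNT(*) AS rides_taken,
  SUM(distance_km) AS total_km,               -- total kilometres per user-month
  AVG(cost) AS avg_cost                       -- average ride cost per user-month
FROM rides
WHERE EXTRACT(YEAR FROM ride_date) = 2022     -- only rides from 2022
GROUP BY user_id, DATE_TRUNC('month', ride_date)
HAVING COUNT(*) > 20                          -- "power users": over 20 rides in a month
ORDER BY user_id, ride_month;
```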
This query first filters the rides table to include only rides that took place in 2022. It then groups the results by user and month; for each user-month pair, it calculates the number of rides taken, the total kilometres traveled, and the average ride cost. Finally, it keeps only the groups where more than 20 rides were taken, and orders the results by user and month.
To practice a similar customer analytics SQL question where you can solve it interactively and have your SQL query automatically checked, try this Walmart SQL Interview Question:
Consider a ride-sharing company like DiDi where you have a record of completed trips data. Your task is to write a SQL query that calculates the running seven days average of ride durations for each rider_id.
The company stores data on trip durations in the table, which has the following columns:
| trip_id | rider_id | trip_date | duration_minutes |
|---|---|---|---|
| 1 | 001 | 2022-07-01 | 15 |
| 2 | 002 | 2022-07-01 | 20 |
| 3 | 001 | 2022-07-02 | 30 |
| 4 | 002 | 2022-07-02 | 25 |
| 5 | 001 | 2022-07-03 | 20 |
| 6 | 002 | 2022-07-03 | 30 |
| 7 | 001 | 2022-07-04 | 25 |
The output columns should include rider_id, trip_date, and 7_days_avg_duration.
| rider_id | trip_date | 7_days_avg_duration |
|---|---|---|
| 001 | 2022-07-01 | 15.00 |
| 002 | 2022-07-01 | 20.00 |
| 001 | 2022-07-02 | 22.50 |
| 002 | 2022-07-02 | 22.50 |
| 001 | 2022-07-03 | 21.67 |
| 002 | 2022-07-03 | 25.00 |
| 001 | 2022-07-04 | 22.50 |
You should use the AVG() window function, with the window frame defined to cover the trailing seven days (including the current row) using ROWS BETWEEN 6 PRECEDING AND CURRENT ROW. This gives the running seven-day average.
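A sketch in PostgreSQL, assuming the trip data lives in a table named trips (the table name isn't given, so it's an assumption here):

```sql
SELECT
  rider_id,
  trip_date,
  ROUND(AVG(duration_minutes) OVER (
    PARTITION BY rider_id                       -- one running average per rider
    ORDER BY trip_date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW    -- current trip plus up to 6 prior trips
  ), 2) AS "7_days_avg_duration"
FROM trips
ORDER BY trip_date, rider_id;
```

Note that a ROWS frame counts rows rather than calendar days, so it equals a trailing seven days only if each rider takes at most one trip per day, as in the sample data.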
This query first partitions the data by rider_id and then sorts it by trip_date within each partition. It evaluates the average for each row using a window frame that includes the current row and the six preceding rows. For the first six rows of each rider_id, the average is computed over only the rows available so far.
To practice a related window function SQL problem on DataLemur's free interactive SQL code editor, try this Google SQL Interview Question:
One way to find duplicate records in a table is to GROUP BY the columns that should be unique, and then use HAVING to see which groups have more than one occurrence:
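For example, against a hypothetical rides table (the table and columns here are just for illustration):

```sql
SELECT user_id, ride_date, COUNT(*) AS occurrences
FROM rides
GROUP BY user_id, ride_date     -- the columns that should be unique together
HAVING COUNT(*) > 1;            -- groups appearing more than once are duplicates
```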
Another way is to use the ROW_NUMBER() window function:
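A sketch against the same hypothetical rides table:

```sql
SELECT *
FROM (
  SELECT
    ride_id,
    user_id,
    ride_date,
    ROW_NUMBER() OVER (
      PARTITION BY user_id, ride_date   -- the duplicate-defining columns
      ORDER BY ride_id
    ) AS row_num
  FROM rides
) numbered
WHERE row_num > 1;   -- every row after the first in its group is a duplicate
```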
DiDi, a ride-hailing service, has launched a new promotional campaign. They want to measure the campaign's performance by analyzing the Click-through and Conversion rates. A user first views a promotional ad (Click-through) and then books a ride (Conversion). Here are the sample datasets.
Click-through data:

| ad_id | user_id | click_date |
|---|---|---|
| 101 | 123 | 06/08/2022 |
| 102 | 265 | 06/10/2022 |
| 101 | 362 | 06/18/2022 |
| 103 | 192 | 07/26/2022 |
| 102 | 981 | 07/05/2022 |
Booking (conversion) data:

| booking_id | user_id | ad_id | booking_date |
|---|---|---|---|
| 8001 | 123 | 101 | 06/09/2022 |
| 8050 | 265 | 102 | 06/11/2022 |
| 8502 | 362 | 101 | 06/19/2022 |
| 8259 | 192 | null | 07/26/2022 |
| 8901 | 981 | 102 | 07/06/2022 |
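A sketch of the query described below, assuming the tables are named ad_clicks and ride_bookings (hypothetical names inferred from the sample data):

```sql
SELECT
  a.ad_id,
  ROUND(
    COUNT(DISTINCT b.user_id)::DECIMAL          -- distinct users who booked via this ad
      / NULLIF(COUNT(DISTINCT a.user_id), 0),   -- distinct users who clicked it; NULLIF avoids /0
    2
  ) AS conversion_rate
FROM ad_clicks a
LEFT JOIN ride_bookings b
  ON a.user_id = b.user_id
 AND a.ad_id = b.ad_id
GROUP BY a.ad_id;
```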
This query performs a left join of the clicks table (aliased a) and the bookings table (aliased b) on both the user_id and ad_id columns. Then, for each ad_id, it calculates the conversion rate as the ratio of distinct users who booked a ride to distinct users who clicked the ad. If no users clicked the ad (the denominator is 0), NULLIF prevents division by zero and the ratio comes back as NULL.
To practice a similar problem about calculating rates, try this TikTok SQL question within DataLemur's online SQL code editor:
If you want to return records with no duplicates, you can use the DISTINCT keyword in your SELECT statement.
For example, if you had a table of DiDi employees and wanted to see all the unique job titles currently held at the company, you could write the following query:
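Assuming the table is called didi_employees (a hypothetical name):

```sql
SELECT DISTINCT job_title
FROM didi_employees;
```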
If didi_employees had the following data:
| f_name | job_title |
|---|---|
| Akash | Data Analyst |
| Brittany | Data Scientist |
| Carlos | Data Engineer |
| Diego | Data Engineer |
| Eva | Data Analyst |
Then the output from the query would be:
| job_title |
|---|
| Data Analyst |
| Data Scientist |
| Data Engineer |
Imagine you work at DiDi, the company that provides ride-hailing services. You are asked to analyze the ride durations for each car type to understand the product usage and customer experience. Given the rides data, how would you find out what is the average ride duration for each car type?
Here's an example of how the rides table is structured:
| ride_id | user_id | ride_date | car_type | duration_minutes |
|---|---|---|---|---|
| 2001 | 123 | 06/08/2022 | Luxury | 30 |
| 2002 | 265 | 06/10/2022 | Standard | 45 |
| 2003 | 362 | 06/18/2022 | Luxury | 60 |
| 2004 | 192 | 07/26/2022 | Standard | 20 |
| 2005 | 981 | 07/05/2022 | Luxury | 45 |
Write a SQL query that will return the average ride duration for each car type.
Expected output:

| car_type | avg_duration_minutes |
|---|---|
| Luxury | 45.00 |
| Standard | 32.50 |
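One way to write it in PostgreSQL (assuming the rides table above):

```sql
SELECT
  car_type,
  ROUND(AVG(duration_minutes), 2) AS avg_duration_minutes
FROM rides
GROUP BY car_type;
```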
The SQL query uses the GROUP BY statement coupled with the AVG() function to find the average duration of rides for each car type. It pulls data from the duration_minutes column, groups the rows by the values in the car_type column, and then calculates the average duration for each group.
Normalization involves breaking up your tables into smaller, more specialized ones and using primary and foreign keys to define relationships between them. Not only does this make your database more flexible and scalable, it also makes it easier to maintain. Plus, normalization helps to keep your data accurate by reducing the chance of inconsistencies and errors.
The only downside is that your queries will now involve more joins, which are slow AF and often a DB performance bottleneck.
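A toy sketch of what normalization looks like in practice (all names here are made up for illustration):

```sql
-- Instead of one wide table that repeats user details on every ride,
-- split it into two tables linked by a foreign key.
CREATE TABLE users (
  user_id INTEGER PRIMARY KEY,
  name    VARCHAR(100) NOT NULL
);

CREATE TABLE rides (
  ride_id   INTEGER PRIMARY KEY,
  user_id   INTEGER REFERENCES users (user_id),  -- foreign key back to users
  ride_date DATE
);
```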
DiDi is a major player in the ride-hailing market. They have a huge customer database consisting of customer names, email addresses and their ride details. There is a requirement to find the details of customers whose email addresses belong to a particular domain, say 'gmail.com'.
We need to filter the records of DiDi's customers present in their "customers" database that have 'gmail.com' in their email addresses.
| customer_id | name | email | ride_count | registration_date |
|---|---|---|---|---|
| 101 | John Doe | john.doe@gmail.com | 15 | 01/10/2019 |
| 102 | Jane Smith | jane.smith@yahoo.com | 22 | 15/04/2017 |
| 103 | Alicia Brown | aliciabrown@gmail.com | 30 | 20/11/2018 |
| 104 | Michael Black | michaelblack@hotmail.com | 45 | 01/01/2020 |
| 105 | Samuel Green | samuel.green@yahoo.com | 35 | 05/06/2018 |
| 106 | Anna White | annawhite@gmail.com | 18 | 01/12/2019 |
Expected output:

| customer_id | name | email | ride_count | registration_date |
|---|---|---|---|---|
| 101 | John Doe | john.doe@gmail.com | 15 | 01/10/2019 |
| 103 | Alicia Brown | aliciabrown@gmail.com | 30 | 20/11/2018 |
| 106 | Anna White | annawhite@gmail.com | 18 | 01/12/2019 |
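The filter itself is a single LIKE predicate; a sketch, assuming the customers table has an email column:

```sql
SELECT *
FROM customers
WHERE email LIKE '%gmail.com%';   -- % matches any sequence of characters
```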
In this PostgreSQL query, we use the LIKE keyword to filter rows from the customers table. The % wildcards before and after 'gmail.com' match any sequence of characters, so the query finds every value in the email column that contains 'gmail.com'. The output is a table listing the details of all customers whose email addresses contain 'gmail.com'.
As a data analyst at DiDi, you are tasked to analyze the customer ride data. You need to find out which payment method is most popular among customers for each ride type. Write a SQL query to solve this problem.
You are provided with two tables:
rides table:
| ride_id | customer_id | ride_type | start_time |
|---|---|---|---|
| 1 | 2 | Economy | 2021-05-09 08:00:00 |
| 2 | 3 | Luxury | 2021-05-09 09:00:00 |
| 3 | 2 | Economy | 2021-05-10 10:00:00 |
| 4 | 1 | Luxury | 2021-05-11 11:00:00 |
payments table:
| payment_id | ride_id | payment_type | payment_amount |
|---|---|---|---|
| 1 | 1 | Credit Card | 30.00 |
| 2 | 2 |  | 150.00 |
| 3 | 4 | AliPay | 200.00 |
| 4 | 3 |  | 35.00 |
In PostgreSQL, you could use a subquery to first count how many times each payment type was used per ride type. The RANK() window function can then be used to select the most popular payment method for each ride type.
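A sketch, using the rides and payments tables above:

```sql
SELECT ride_type, payment_type, payment_count
FROM (
  SELECT
    r.ride_type,
    p.payment_type,
    COUNT(*) AS payment_count,
    RANK() OVER (
      PARTITION BY r.ride_type
      ORDER BY COUNT(*) DESC       -- most-used payment method ranks first
    ) AS payment_rank
  FROM rides r
  JOIN payments p ON r.ride_id = p.ride_id
  GROUP BY r.ride_type, p.payment_type
) ranked
WHERE payment_rank = 1;            -- RANK() keeps ties, so co-leaders all appear
```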
This query will return the most popular payment method for each type of ride based on the number of times each payment method was used for each ride type. If more than one payment method shares the top spot for a certain ride type, all will be included in the output.
Since joins come up frequently during SQL interviews, try this interactive Snapchat JOIN SQL interview question:
SQL constraints are used to specify rules for the data in a table. They can be applied to single or multiple fields in a table when the table is created, or after the table has been created using the ALTER TABLE command.
For example, say you had a database that stores ad campaign data from DiDi's Google Analytics account.
Here's what some constraints could look like:
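(The tables and columns below are hypothetical, for illustration only.)

```sql
CREATE TABLE employees (
  id INTEGER PRIMARY KEY
);

CREATE TABLE ad_campaigns (
  campaign_id   INTEGER PRIMARY KEY,                -- uniquely identifies each campaign
  campaign_name VARCHAR(100) NOT NULL,              -- must always have a value
  budget        DECIMAL(10, 2) CHECK (budget > 0),  -- must be a positive amount
  channel       VARCHAR(50) DEFAULT 'Google',       -- falls back to 'Google' if omitted
  manager_id    INTEGER REFERENCES employees (id)   -- foreign key to an existing employee
);
```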
Based on the trip data from the DiDi app, calculate the average trip distance from each starting location, round the result to 2 decimal places, and find the absolute difference between the maximum and minimum average trip distance. Also, find the square root of the total number of trips made from each starting location. Assume that the starting location is stored as a string.
| trip_id | start_location | end_location | trip_distance (km) |
|---|---|---|---|
| 1 | 'New York' | 'Los Angeles' | 500 |
| 2 | 'New York' | 'San Francisco' | 480 |
| 3 | 'San Francisco' | 'Los Angeles' | 385 |
| 4 | 'San Francisco' | 'New York' | 520 |
| 5 | 'Los Angeles' | 'New York' | 515 |
Expected output:

| start_location | average_distance | distance_difference | sqrt_of_total_trips |
|---|---|---|---|
| 'New York' | 490.00 | 35.00 | 14.14 |
| 'San Francisco' | 452.50 | 67.50 | 14.14 |
| 'Los Angeles' | 515.00 | 0.00 | 1.00 |
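A sketch following the description below, assuming a trips table with a trip_distance column (the names are assumptions; note that, as written, the description yields a single global distance_difference):

```sql
WITH location_stats AS (
  SELECT
    start_location,
    ROUND(AVG(trip_distance), 2) AS average_distance,
    COUNT(*) AS total_trips
  FROM trips
  GROUP BY start_location
),
distance_range AS (
  SELECT
    ABS(MAX(average_distance) - MIN(average_distance)) AS distance_difference
  FROM location_stats
)
SELECT
  ls.start_location,
  ls.average_distance,
  dr.distance_difference,
  ROUND(SQRT(ls.total_trips)::NUMERIC, 2) AS sqrt_of_total_trips
FROM location_stats ls
CROSS JOIN distance_range dr;
```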
In the provided solution, a common table expression (CTE) calculates the average trip distance and the total number of trips, grouped by start_location. Another CTE calculates the absolute difference between the maximum and minimum average distances across starting locations.
Finally, the necessary information is selected from these CTEs and the square root of the total number of trips is also calculated.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for requiring calculations with aggregate functions or this Alibaba Compressed Mean Question which is similar for providing statistics and rounding values to a specific decimal place.
The key to acing a DiDi SQL interview is to practice, practice, and then practice some more! Beyond just solving the earlier DiDi SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Microsoft, Google, and Facebook.
Each interview question has multiple hints and full answers, and most importantly, there's an interactive coding environment so you can code up your SQL query right in the browser and have it graded.
To prep for the DiDi SQL interview it is also helpful to practice SQL problems from other tech companies like:
However, if your SQL coding skills are weak, forget about jumping right into solving questions – improve your SQL foundations with this SQL interview tutorial.
This tutorial covers topics including window functions and removing NULLs – both of these pop up often during DiDi SQL assessments.
In addition to SQL interview questions, the other question categories covered in the DiDi Data Science Interview are:
The best way to prepare for DiDi Data Science interviews is by reading Ace the Data Science Interview. The book's got: