At MakeMyTrip, SQL is used all the damn time for analyzing customer booking patterns for improving product recommendations, and managing large-scale, dynamic travel databases for efficient data retrieval. That's the reason behind why MakeMyTrip almost always asks SQL coding questions during interviews for Data Science, Data Engineering and Data Analytics jobs.
Thus, to help you prepare for the MakeMyTrip SQL interview, we've curated 8 MakeMyTrip SQL interview questions – can you answer each one?
Given the and tables below, write an SQL query to find out the top 3 most popular destinations for each month (based on the number of reviews), and their average ratings. A destination is considered popular if it receives many reviews.
This query first creates a CTE (Common Table Expression) called which groups our reviews by month and destination, calculating the count of reviews () and the average rating (). Then, we create another CTE called where we rank our destinations within each month based on the number of reviews. Finally, we select from our CTE where rank is less than or equal to 3, joining in the table to get the destination names.
For more window function practice, try this Uber SQL problem within DataLemur's online SQL code editor:
As a data engineer at MakeMyTrip, your task is to design a database to help management understand the booking pattern of their customers. The company needs a way to know the most preferred room type and the most frequently visited cities by the customers.
Design a database for 'trips' and 'hotel_rooms'. 'trips' tables will have columns such as 'trip_id', 'user_id', 'city_id', 'hotel_id', 'room_id', 'check_in_date' and 'check_out_date'. 'hotel_rooms' tables will have columns like 'room_id', 'hotel_id', 'room_type' and 'price_per_night'.
Each row in the 'trips' table represents a trip made by a user, and each row in the 'hotel_rooms' table represents a type of room in a hotel.
trip_id | user_id | city_id | hotel_id | room_id | check_in_date | check_out_date |
---|---|---|---|---|---|---|
101 | 1056 | 3 | 152 | 246 | 01/02/2023 | 01/10/2023 |
102 | 2031 | 1 | 109 | 235 | 01/05/2023 | 01/10/2023 |
103 | 3210 | 3 | 152 | 354 | 01/06/2023 | 01/20/2023 |
104 | 1056 | 2 | 207 | 246 | 01/11/2023 | 01/15/2023 |
room_id | hotel_id | room_type | price_per_night |
---|---|---|---|
234 | 152 | 'premium' | 160 |
235 | 109 | 'economy' | 100 |
246 | 207 | 'deluxe' | 200 |
354 | 152 | 'premium' | 160 |
Write a SQL query to find out what was the most preferred room type among customers who visited city with id=3 in January 2023.
This question can be answered using PostgreSQL, Here is an example of how to do it:
This query joins the and tables on the field and then selects those rows with the specified city id and having check in and check out dates within January 2023. It then groups the results by and counts the number of each type. It orders by the count in descending order and finally limits the output to one to give the most preferred room type.
A self-join is a type of join in which a table is joined to itself. To perform a self-join, you need to specify the table name twice in the FROM clause, and give each instance of the table a different alias. You can then join the two instances of the table using a JOIN clause, and use a WHERE clause to specify the relationship between the rows.
Self-joins are the go-to technique for any data analysis that involves pairs of the same thing, like identifying pairs of products that are frequently purchased together like in this Walmart SQL interview question.
For another example, say you were doing an HR analytics project and needed to analyze how much all MakeMyTrip employees in the same department interact with each other. Here's a self-join query you could use to retrieve all pairs of MakeMyTrip employees who work in the same department:
This query returns all pairs of MakeMyTrip employees who work in the same department, and excludes pairs where the employee's id is the same (since this would represent the same MakeMyTrip employee being paired with themselves).
MakeMyTrip wants to target customers who booked for a trip but cancelled it later within a specified range of time. They have two tables: and .
Your task is to write a SQL query to find all the customers who booked a trip for the month of August 2022, but cancelled it within a week. The result should show the customer_id, their booking_date, their cancellation_date and the number of days between the booking and cancellation.
The table has the following structure:
booking_id | customer_id | booking_date |
---|---|---|
101 | 120 | 8/1/2022 |
102 | 121 | 8/5/2022 |
103 | 122 | 8/20/2022 |
104 | 120 | 8/25/2022 |
The table has the following structure:
cancellation_id | booking_id | cancellation_date |
---|---|---|
501 | 101 | 8/2/2022 |
502 | 102 | 8/13/2022 |
503 | 103 | 8/24/2022 |
504 | 104 | 8/26/2022 |
This query joins the and tables on the and filters for bookings which have been made in the month of August 2022. It then checks if the cancellation happened within a week by using the function to get the difference between the cancellation date and the booking date. The extracted number is then compared to 7 to filter for cancellations that happen within 7 days.
First things first, figure out why the query is slow! You can use and commands in PostgreSQL to identify any performance bottlenecks. You might discover that your query is inefficient, or that there are many database writes at the same time you are doing a read, or maybe too many people are concurrently running queries on the same database server.
For Data Analyst and Data Science positions, knowing the ins-and-outs of SQL performance tuning is out-of-scope for the SQL interview round. However, knowing that joins are expensive, and indexes can speed up queries, is generally enough of an answer for MakeMyTrip SQL interviews.
At MakeMyTrip, we work with a variety of airlines and we consistently maintain a range of ticket prices. The task is to find the average ticket price for each airline over the past year.
flight_id | airline | ticket_price | flight_date |
---|---|---|---|
1001 | Air India | 10000 | 01/08/2021 |
1002 | Jet Airways | 12000 | 05/10/2021 |
1003 | Air India | 15000 | 06/09/2021 |
1004 | SpiceJet | 8000 | 07/26/2021 |
1005 | Indigo | 6000 | 07/05/2021 |
airline | avg_ticket_price |
---|---|
Air India | 12500 |
Jet Airways | 12000 |
SpiceJet | 8000 |
Indigo | 6000 |
The following PostgreSQL query will solve the problem:
This answer calculates the average ticket price for each airline. The WHERE clause is used to restrict the flights to those that occurred between January 1, 2021 and December 31, 2021. The GROUP BY statement is used to split the rows into groups by airline. For each group, the AVG() function is used to calculate the average ticket price.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating statistics for each category or this Amazon Average Review Ratings Question which is similar for grouping by a particular field and calculating the average.
A non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query. On the other hand, a correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query.
Here is an example of a non-correlated sub-query:
The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.
Here is an example of a correlated sub-query:
This query selects the and total sales of all MakeMyTrip customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().
Non-correlated sub-queries are typically faster, as they only have to be executed once, wheareas correlated sub-queries are slower since they have to be re-evaluated for each row of the outer query.
Given the below 'bookings' and 'hotels' tables, write a SQL query that returns the average amount paid (rounded to the nearest integer) per hotel in each city.
booking_id | user_id | hotel_id | checkin_date | checkout_date | amount_paid |
---|---|---|---|---|---|
3972 | 543 | 1001 | 06/08/2022 00:00:00 | 06/10/2022 00:00:00 | 10000 |
2351 | 348 | 1002 | 06/10/2022 00:00:00 | 06/12/2022 00:00:00 | 8000 |
8921 | 645 | 2001 | 06/18/2022 00:00:00 | 06/20/2022 00:00:00 | 12000 |
3589 | 234 | 1001 | 07/26/2022 00:00:00 | 07/28/2022 00:00:00 | 9800 |
6471 | 789 | 2002 | 07/05/2022 00:00:00 | 07/07/2022 00:00:00 | 7500 |
hotel_id | city |
---|---|
1001 | Delhi |
1002 | Delhi |
2001 | Mumbai |
2002 | Mumbai |
This query links the 'bookings' and 'hotels' tables through an inner join on matching hotel_ids. It then groups the resultset by city and hotel_id, and calculates the average amount paid per hotel by each customer. The final result list is ordered by city and the average amount paid in descending order to provide a detailed comparison of hotel costs across different cities.
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Besides solving the above MakeMyTrip SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like FAANG tech companies and tech startups.
Each DataLemur SQL question has multiple hints, full answers and best of all, there is an interactive SQL code editor so you can instantly run your query and have it executed.
To prep for the MakeMyTrip SQL interview you can also be wise to practice interview questions from other tech companies like:
In case your SQL skills are weak, don't worry about going right into solving questions – go learn SQL with this DataLemur SQL Tutorial.
This tutorial covers things like UNION and different types of joins – both of these pop up routinely in MakeMyTrip interviews.
In addition to SQL interview questions, the other types of questions covered in the MakeMyTrip Data Science Interview are:
To prepare for MakeMyTrip Data Science interviews read the book Ace the Data Science Interview because it's got: