At Booking.com, MySQL is used often for analyzing customer booking trends and optimizing pricing strategies based on regional and seasonal data trends. Because of this, Booking asks SQL problems during interviews for Data Science, Analytics, and & Data Engineering jobs.
So, to help you prepare for a Booking.com SQL Assessment, here’s 10 Booking SQL interview questions to practice – can you solve them?
As a data analyst at Booking, one of your role is to identify VIP users. These are individuals who make several hotel bookings, particularly for high-end (5-star) properties monthly. The data is stored in two primary tables and as shown below.
booking_id | user_id | booking_date | hotel_id |
---|---|---|---|
9001 | 567 | 2018-06-15 00:00:00 | 70001 |
9502 | 890 | 2018-06-25 00:00:00 | 80052 |
7704 | 123 | 2018-07-18 00:00:00 | 80052 |
6925 | 567 | 2018-07-01 00:00:00 | 70001 |
4289 | 234 | 2018-07-05 00:00:00 | 70001 |
hotel_id | rank |
---|---|
70001 | 5 |
80052 | 4 |
79011 | 4 |
Given these two tables, write a query that returns each user along with the number of high-end hotel bookings they have made within each month. Assume that a high-end hotel has a rank of 5.
In the above query, we join the table with the table on the then filter out all bookings which are not for high-end hotels (). The result is grouped by month and user_id to give the total number of bookings per user each month. The result is then ordered in descending order of number_of_bookings to have the users with the highest number of bookings at the top.
To solve a related customer analytics question on DataLemur's free online SQL code editor, try this Microsoft Azure Cloud SQL Interview Question:
In the context of a booking company like Booking.com, we have received detailed user reviews about hotels over the years. The task is to analyze and calculate the average review score for each hotel per month. This will help to understand the performance trend and quality of each hotel over time.
You are provided with a table with hotel_id, user_id, submit_date and stars (score out of 5) columns. Write a SQL query to return the average monthly review score for each hotel.
review_id | user_id | submit_date | hotel_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 | 50001 | 4 |
7802 | 265 | 2022-06-10 | 69852 | 4 |
5293 | 362 | 2022-06-18 | 50001 | 3 |
6352 | 192 | 2022-07-26 | 69852 | 3 |
4517 | 981 | 2022-07-05 | 69852 | 2 |
mth | hotel_id | avg_stars |
---|---|---|
6 | 50001 | 3.50 |
6 | 69852 | 4.00 |
7 | 69852 | 2.50 |
This SQL query uses the GROUP BY clause to organize data into groups by month and hotel_id. Then, it uses the AVG() function to calculate the average rating for each group. EXTRACT(MONTH FROM submit_date) is used to get the month part from the date column ‘submit_date’ and 'hotel_id' identifies each individual hotel. Finally, it selects the calculated average rating 'avg_stars' for each hotel and for each month.
To solve a related window function SQL problem on DataLemur's free interactive coding environment, solve this Google SQL Interview Question:
Database views are created to provide customized, read-only versions of your data that you can query just like a regular table. So why even use one if they're so similar to a regular table?
Views are advantageous for several reasons:
Given the business needs of Booking.com, your task is to design a database schema to handle their inventory of hotels, rooms, customers, and bookings. Once you have a design in mind, generate some sample data for testing. Then, to prove the design's utility, please write a PostgreSQL query that returns a list of all hotels booked by a specific user during a specific time period.
hotel_id | hotel_name | location |
---|---|---|
1 | Hotel California | California, USA |
2 | The Ritz | London, UK |
3 | Rydges | Sydney, AU |
room_id | hotel_id | room_type | price_per_night |
---|---|---|---|
1 | 1 | Single | 70 |
2 | 1 | Double | 90 |
3 | 2 | Luxury Suite | 300 |
customer_id | name | |
---|---|---|
1 | John Doe | johndoe@email.com |
2 | Jane Smith | janesmith@email.com |
3 | James Brown | jamesbrown@email.com |
booking_id | room_id | customer_id | check_in_date | check_out_date |
---|---|---|---|---|
1 | 1 | 1 | 2022-07-01 | 2022-07-07 |
2 | 3 | 2 | 2022-08-15 | 2022-08-19 |
3 | 2 | 1 | 2022-09-01 | 2022-09-10 |
In this PostgreSQL query, we first join the , , and tables using their respective foreign keys. We then apply a clause to filter the records by the given and the date range. The keyword ensures we don't get duplicate records in our result. The placeholders , , and would be replaced by the actual values during implementation. undefined
A full outer join returns all rows from both tables, including any unmatched rows, whereas an inner join only returns rows that match the join condition between the two tables.
For an example of each one, say you had sales data exported from Booking's Salesforce CRM stored in a datawarehouse which had two tables: and .
: retrieves rows from both tables where there is a match in the shared key or keys.
This query will return rows from the sales and tables that have matching customer id values. Only rows with matching values will be included in the results.
: retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.
Here is an example of a SQL full outer join using the sales and tables:
As part of the Booking team, we want to identify customers who have made a booking in the past month and who have also given a review score of more than 3 stars, for any of their bookings. Write an SQL query to accomplish this, by using a three-table join strategy that includes the , , and tables.
customer_id | first_name | last_name |
---|---|---|
101 | John | Doe |
102 | Jane | Smith |
103 | Mary | White |
booking_id | customer_id | booking_date |
---|---|---|
1001 | 101 | 2022-09-08 |
1002 | 102 | 2022-09-15 |
1003 | 101 | 2022-08-25 |
review_id | booking_id | star_rating |
---|---|---|
201 | 1001 | 5 |
202 | 1002 | 2 |
203 | 1003 | 5 |
This solution joins the and tables with the table to find all bookings made within the past month () and then filters by review_score (), thus identifying customers who have made a booking in the past month and who have also given a review score of more than 3 stars. The DISTINCT keyword is used to ensure we do not have duplicate customer entries in our result set. undefined
NULLs are NOT the same as zero or blank spaces in SQL. NULLs are used to represent a missing value or the abscence of a value, whereas zero and blank space are legitimate values.
It's important to handle NULLs carefully, because they can mess up your analysis very easily. For example, if you compare a NULL value using the = operator, the result will always be NULL (because just like Drake, nothing be dared compared to NULL). That's why many data analysis in SQL start with removing NULLs using the function.
Booking.com will often look at click-through rates of their digital ad campaigns to understand the effectiveness of the marketing activities. Given the and tables which track the viewing and clicking activities respectively of users on the digital ads that Booking.com run on various platforms, write a query to calculate the CTR (number of clicks/number of views) for each digital advertisement.
view_id | user_id | view_date | ad_id |
---|---|---|---|
6001 | 121 | 06/08/2022 00:00:00 | 50101 |
6502 | 129 | 06/10/2022 00:00:00 | 69235 |
5623 | 135 | 06/18/2022 00:00:00 | 50101 |
5392 | 157 | 07/26/2022 00:00:00 | 69235 |
4577 | 165 | 07/05/2022 00:00:00 | 50101 |
click_id | user_id | click_date | ad_id |
---|---|---|---|
7689 | 121 | 06/08/2022 00:00:00 | 50101 |
5029 | 155 | 06/10/2022 00:00:00 | 69235 |
3645 | 135 | 06/18/2022 00:00:00 | 50101 |
4587 | 173 | 07/26/2022 00:00:00 | 50101 |
6054 | 165 | 07/05/2022 00:00:00 | 50101 |
This SQL query first joins to to get corresponding lines of viewing and clicking ads. By grouping the result set by , it then calculates the click-through-rate (CTR). The gives us the number of clicks per view, effectively giving us the click-through-rate for each ad. The is to convert the count to a float for a correct division result.
To practice a related SQL problem on DataLemur's free interactive SQL code editor, try this SQL interview question asked by Facebook:
As a data analyst at Booking.com, we have a customer records database with different data columns. One of it refers to the UserAgent details. This data helps us in understanding the customers from a technological view, analyzing which platform, device or browser the customers use to access our services.
Your task is to write a SQL query which filters out all customers that used an iPhone. Let's assume the data record in the "user_agent" column for iPhone users contains the string 'iphone'.
customer_id | user_name | user_email | user_agent |
---|---|---|---|
101 | John | john@mail.com | Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) |
102 | Jane | jane@mail.com | Mozilla/5.0 (Windows NT 10.0; Win64; x64) |
103 | Tom | tom@mail.com | Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_1 like Mac OS X) |
104 | Jerry | jerry@mail.com | Mozilla/5.0 (Windows NT 6.1; Win64; x64) |
105 | Sophia | sophia@mail.com | Mozilla/5.0 (iPhone; CPU iPhone OS 9_3 like Mac OS X) |
This query will select all records from the "customers" table where the "user_agent" contains the term "iphone". The '%' symbol is a wildcard in SQL that matches any sequence of characters. Also, the function is used to ensure the query is case insensitive. undefined
When designing a database schema, a one-to-one relationship between two entities is characterized by each entity being related to a single instance of the other. An example of this is the relationship between a car and a license plate - each car has one license plate, and each license plate belongs to one car.
On the other hand, a one-to-many relationship is when one entity can be associated with multiple instances of the other entity. For example, a person can have multiple email addresses, but each email address only relates back to one person.
The key to acing a Booking SQL interview is to practice, practice, and then practice some more! Besides solving the above Booking SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like FAANG and tech startups.
Each DataLemur SQL question has multiple hints, detailed solutions and crucially, there's an online SQL code editor so you can right online code up your SQL query and have it checked.
To prep for the Booking SQL interview you can also be wise to practice SQL problems from other tech companies like:
But if your SQL foundations are weak, don't worry about jumping right into solving questions – go learn SQL with this interactive SQL tutorial.
This tutorial covers SQL topics like joining a table to itself and sorting results with ORDER BY – both of which come up often during Booking SQL interviews.
For the Booking Data Science Interview, besides SQL questions, the other types of questions to practice:
To prepare for Booking Data Science interviews read the book Ace the Data Science Interview because it's got: