Data Analysts and Data Engineers at Yum Brands write SQL queries to analyze customer behavior patterns, uncovering insights about customer preferences and spending habits. They also use SQL to streamline supply-chain logistics by analyzing sales trends and inventory levels, ensuring that popular menu items are always available for customers. This is why Yum Brands asks SQL questions during interviews.
Thus, to help you study for the Yum Brands SQL interview, we've collected 10 Yum Brands SQL interview questions in this blog.
Yum Brands, the parent company of fast-food chains like KFC, Taco Bell, and Pizza Hut, keeps track of all customer transactions in a database. A "VIP customer" is defined as a customer whose total monthly spend is higher than that of 95% of all other customers.
Your task is to write a SQL query that identifies these VIP customers. Your output should include the user_id of each VIP customer, along with their total monthly spend.
Consider the following transaction log table, which records every purchase made by customers:
transaction_id | user_id | transaction_date | amount |
---|---|---|---|
1001 | 12 | 2022-01-04 | 50 |
1002 | 34 | 2022-01-10 | 65 |
1003 | 56 | 2022-01-15 | 30 |
1004 | 12 | 2022-02-01 | 75 |
1005 | 34 | 2022-02-03 | 100 |
1006 | 12 | 2022-02-14 | 45 |
1007 | 78 | 2022-02-19 | 160 |
Here is a PostgreSQL query to identify the VIP customers:
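The original solution query isn't reproduced here, but a sketch consistent with the explanation that follows might look like this (the table name transactions and the CTE names are assumptions):

```sql
-- Sketch only: table name "transactions" is assumed from the sample data.
WITH monthly_spend AS (
  -- Total amount spent by each user in each month
  SELECT
    user_id,
    DATE_TRUNC('month', transaction_date) AS month,
    SUM(amount) AS total_spend
  FROM transactions
  GROUP BY user_id, DATE_TRUNC('month', transaction_date)
),
monthly_threshold AS (
  -- 95th percentile of monthly spend across all users, per month
  SELECT
    month,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY total_spend) AS p95_spend
  FROM monthly_spend
  GROUP BY month
)
SELECT m.user_id, m.month, m.total_spend
FROM monthly_spend m
JOIN monthly_threshold t ON m.month = t.month
WHERE m.total_spend > t.p95_spend;
```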
This PostgreSQL query works by first building an intermediate result that sums the total amount spent by each user every month. It then calculates the 95th percentile of total monthly spend for each month. By joining these two results, we can filter for users who spend more than the 95th percentile in a given month, identifying them as VIP customers.
To solve a related super-user data analysis question on DataLemur's free online SQL coding environment, try this recently asked Microsoft SQL interview question:
Discover Yum Brands' latest updates and see how they are transforming the fast-food landscape with innovative ideas and initiatives! Understanding Yum Brands' approach to growth and customer engagement can offer valuable lessons for anyone interested in the food service industry.
Assume you had a table of Yum Brands employee salaries. Write a SQL query to find the employees who earn more than their direct manager.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager, William Davis, who earns $7,000.
You can solve this problem interactively on DataLemur:
First, we perform a self-join, treating one copy of the employee table as the managers' table and the other copy as the employees' table. Then we use a WHERE clause to filter the results, ensuring we only keep employees whose salaries are higher than their manager's salary.
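As a sketch of that approach (assuming the table is named employee, matching the columns in the sample data):

```sql
-- Self-join: alias "e" for employees, alias "m" for their managers.
SELECT e.employee_id, e.name AS employee_name
FROM employee e
JOIN employee m ON e.manager_id = m.employee_id
WHERE e.salary > m.salary;  -- keep only employees out-earning their manager
```

Against the sample data, only Olivia Smith ($8,000 vs. her manager's $7,000) survives the filter, matching the expected output.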
If the solution above is tough, you can find a step-by-step solution with hints here: Well Paid Employees.
The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.
For example, say you had stored some Facebook ad campaign data that Yum Brands ran:
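The original table definition isn't shown; a hypothetical sketch might look like this (the table and column names here are illustrative assumptions):

```sql
-- Hypothetical campaign table; "campaign_id" serves as the primary key.
CREATE TABLE ad_campaigns (
  campaign_id INTEGER PRIMARY KEY,    -- unique identifier for each row
  campaign_name VARCHAR(100) NOT NULL,
  start_date DATE,
  end_date DATE
);
```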
The primary key column uniquely identifies each row in the table, and the PRIMARY KEY constraint ensures that no two rows share the same value for it. This helps maintain the integrity of the data by preventing duplicate rows.
The primary key is also an important part of the table because it allows you to easily identify and reference specific campaigns in your Facebook Ad data. You can use it to join to other tables in the database, such as a table containing data on the results of the campaigns.
As a data analyst at Yum Brands, which owns a variety of fast food chains including KFC, Taco Bell, and Pizza Hut, you are tasked with analyzing the daily revenue of each outlet. Write a SQL query to calculate the average daily revenue for each outlet in the last 30 days, using window functions.
We have two tables: one describing the outlets and one recording daily sales.
The outlet table is structured as follows:
outlet_id | outlet_name | brand |
---|---|---|
1 | Taco Bell (Main St) | Taco Bell |
2 | Pizza Hut (3rd Ave) | Pizza Hut |
3 | KFC (Grand Blvd) | KFC |
4 | Taco Bell (5th Ave) | Taco Bell |
The sales table is structured as follows:
sale_id | outlet_id | sale_date | revenue |
---|---|---|---|
10 | 1 | 2022-09-25 | 400 |
12 | 2 | 2022-09-25 | 500 |
14 | 3 | 2022-09-26 | 600 |
16 | 4 | 2022-09-27 | 700 |
18 | 1 | 2022-09-28 | 800 |
20 | 2 | 2022-09-29 | 900 |
22 | 3 | 2022-09-30 | 1000 |
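The solution query isn't reproduced here; a sketch consistent with the explanation that follows could be (the table names outlets and sales are assumptions):

```sql
-- Sketch: table names "outlets" and "sales" are assumed.
SELECT
  o.outlet_name,
  s.sale_date,
  AVG(s.revenue) OVER (
    PARTITION BY s.outlet_id
    ORDER BY s.sale_date
    ROWS BETWEEN 29 PRECEDING AND CURRENT ROW  -- current day + 29 prior rows
  ) AS avg_daily_revenue
FROM sales s
JOIN outlets o ON s.outlet_id = o.outlet_id
WHERE s.sale_date >= CURRENT_DATE - INTERVAL '30 days';
```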
This query achieves the goal by joining the two tables on outlet_id. An AVG() window function calculates the average daily revenue for each outlet, partitioned by outlet and ordered by sale_date. The window frame clause specifies that the window includes the current row and the 29 rows preceding it (for a total of 30 days), while the WHERE clause excludes sales that are older than 30 days.
For more window function practice, solve this Uber SQL problem within DataLemur's interactive SQL code editor:
To locate records in one table that are absent from another, you can use a LEFT JOIN and then look for NULL values in the right-side table's columns.
For example, say you exported Yum Brands's CRM (Customer Relationship Management) database, and had a table of sales leads, and a second table of companies.
Here's an example of how a query can find all sales leads that are not associated with a company:
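A sketch of such a query (assuming tables named sales_leads and companies, joined on a company_id column):

```sql
-- Keep only leads with no matching company row.
SELECT l.*
FROM sales_leads l
LEFT JOIN companies c ON l.company_id = c.company_id
WHERE c.company_id IS NULL;
```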
This query returns all rows from the sales leads table, along with any matching rows from the companies table. If there is no matching row in the companies table, NULL values are returned for all of the right table's columns. The WHERE clause then keeps only the rows where the company column is NULL, leaving just the sales leads that are NOT associated with a company.
Yum Brands is a multinational company with various brands of restaurants (like Taco Bell and Pizza Hut) spread across different locations. They want to analyze the sales data per each restaurant.
Each restaurant sells a variety of items, and they want to know, for each restaurant, which item made the highest sales in the previous month. For simplicity, consider that data is stored for every month.
You need to design three tables: one for restaurants, one for menu items, and one for sales.
The restaurant table holds details about each restaurant, including a unique ID and its location. The item table holds details about each menu item, including a unique ID, the item's name, and its price. The sales table holds details about each sale, including a unique ID, which restaurant and item were involved, and the sale date.
Below is sample data for the above tables.
restaurant_id | location |
---|---|
1 | New York |
2 | Los Angeles |
3 | Chicago |
item_id | item_name | price |
---|---|---|
1 | Pizza | $10 |
2 | Taco | $5 |
3 | Burger | $8 |
sales_id | restaurant_id | item_id | sale_date |
---|---|---|---|
1 | 1 | 1 | 2022-08-01 |
2 | 1 | 2 | 2022-08-02 |
3 | 1 | 2 | 2022-08-02 |
4 | 2 | 1 | 2022-08-03 |
5 | 2 | 3 | 2022-08-04 |
6 | 2 | 3 | 2022-08-04 |
7 | 3 | 1 | 2022-08-05 |
8 | 3 | 2 | 2022-08-06 |
9 | 3 | 2 | 2022-08-06 |
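The solution query isn't shown; one PostgreSQL sketch matching the explanation that follows uses DISTINCT ON to keep only the top-selling item per restaurant (the table names restaurants, items, and sales are assumptions):

```sql
-- Sketch: table names "restaurants", "items", and "sales" are assumed.
SELECT DISTINCT ON (r.restaurant_id)
  r.restaurant_id,
  r.location,
  i.item_name,
  COUNT(*) AS units_sold
FROM sales s
JOIN restaurants r ON s.restaurant_id = r.restaurant_id
JOIN items i       ON s.item_id = i.item_id
WHERE DATE_TRUNC('month', s.sale_date)
      = DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
GROUP BY r.restaurant_id, r.location, i.item_id, i.item_name
ORDER BY r.restaurant_id, COUNT(*) DESC;  -- top seller first per restaurant
```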
This PostgreSQL query first joins the restaurant, item, and sales tables together. It then filters the records to the previous month's sales using a date-truncation function. After that, it groups the data by restaurant and item and counts the sales in each group. The groups are sorted in descending order by count, and the row with the highest count, which represents the item with the highest sales for each restaurant, is the only one kept.
Imagine you are organizing a party and have two database tables: one table of people you want to invite and another list of food items you want to serve.
A cross join would be like inviting every person on your list to the party and serving them every food item on the menu, regardless of whether they like the food or not. So, if you had 10 people on your invite list and 5 food items on the menu, you would generate all 50 different combinations of people and food (10 x 5 = 50).
On the other hand, a natural join would be like inviting only the people who like the food items on the menu (based on doing an inner/left/right/full outer JOIN on a common key shared by both tables).
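The party analogy can be sketched directly (the table and column names here are illustrative):

```sql
-- CROSS JOIN: every guest paired with every food item.
-- With 10 guests and 5 foods, this produces 10 x 5 = 50 rows.
SELECT g.guest_name, f.food_name
FROM guests g
CROSS JOIN foods f;
```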
Imagine you work as a Data Analyst at Yum Brands, and you're tasked with finding the average monthly sales for each store within a set period of time.
store_id | sale_date | sale_amount |
---|---|---|
001 | 2022-06-05 | 1000 |
001 | 2022-06-23 | 1200 |
001 | 2022-07-19 | 1500 |
002 | 2022-06-15 | 800 |
002 | 2022-07-05 | 900 |
002 | 2022-07-30 | 1000 |
store_id | month | avg_sale_amount |
---|---|---|
001 | 6 | 1100 |
001 | 7 | 1500 |
002 | 6 | 800 |
002 | 7 | 950 |
Here is the PostgreSQL query that you would use to solve this question:
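A sketch of the query (assuming the table is named sales, matching the sample columns):

```sql
-- Sketch: table name "sales" is assumed.
SELECT
  store_id,
  EXTRACT(MONTH FROM sale_date) AS month,
  AVG(sale_amount) AS avg_sale_amount
FROM sales
GROUP BY store_id, EXTRACT(MONTH FROM sale_date)
ORDER BY store_id, month;
```

Against the sample data, store 001 averages (1000 + 1200) / 2 = 1100 in June and 1500 in July, matching the expected output.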
This SQL query applies the AVG() function to sale_amount, grouped by store_id and month. The ORDER BY clause makes the output easier to interpret by sorting it by store_id and then by month.
To practice a very similar question try this interactive Wayfair Y-on-Y Growth Rate Question which is similar for calculating metrics over time intervals or this Amazon Average Review Ratings Question which is similar for calculating averages on grouped data.
Yum Brands, the parent company of fast-food restaurants like KFC, Pizza Hut, and Taco Bell, is interested in understanding the click-through and conversion rates of their online ordering system. Each record in their system captures whether a customer viewed an item, added it to their cart, and then proceeded to check out.
Assume the conversion funnel works such that users view an item, add it to their cart, and then checkout as a three-step process. For a given time period, calculate the click-through rate (CTR) for each product as the ratio of number of adds to number of views, and conversion rate as the ratio of number of checkouts to number of adds.
order_id | user_id | product_id | view_date | add_date | checkout_date |
---|---|---|---|---|---|
10001 | 987 | 10 | 06/08/2022 00:00:00 | 06/08/2022 00:10:00 | 06/08/2022 00:20:00 |
10002 | 563 | 20 | 06/08/2022 01:00:00 | 06/08/2022 01:10:00 | null |
10003 | 563 | 10 | 06/08/2022 02:00:00 | null | 06/08/2022 02:20:00 |
10004 | 643 | 30 | 06/10/2022 00:00:00 | 06/10/2022 00:10:00 | 06/10/2022 00:20:00 |
10005 | 643 | 30 | 06/10/2022 01:00:00 | null | 06/10/2022 01:20:00 |
product_id | CTR | Conversion Rate |
---|---|---|
10 | 1.00 | 0.50 |
20 | 1.00 | 0.00 |
30 | 0.50 | 1.00 |
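The solution query isn't reproduced here; a sketch (assuming the table is named orders, and counting a view, add, or checkout whenever the corresponding date column is non-NULL) might be:

```sql
-- Sketch: table name "orders" is assumed; COUNT(col) ignores NULLs.
SELECT
  product_id,
  CASE WHEN COUNT(view_date) = 0 THEN 0
       ELSE ROUND(COUNT(add_date)::DECIMAL / COUNT(view_date), 2)
  END AS ctr,
  CASE WHEN COUNT(add_date) = 0 THEN 0
       ELSE ROUND(COUNT(checkout_date)::DECIMAL / COUNT(add_date), 2)
  END AS conversion_rate
FROM orders
GROUP BY product_id;
```

Note that the exact numbers depend on how you treat a checkout with no recorded add (as in some of the sample rows); the sketch above counts every non-NULL checkout_date.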
The SQL query provided calculates the CTR as the number of adds per view and the conversion rate as the number of checkouts per add, for each product. The CASE ... WHEN ... THEN ... ELSE ... END construct handles cases where a product has no views or no adds, which would otherwise cause a division-by-zero error. The results are grouped by product_id to give the CTR and conversion rate for each individual product.
To practice another question about calculating rates, solve this TikTok SQL question on DataLemur's online SQL coding environment:
A correlated sub-query is a sub-query that depends on the outer query and cannot be executed separately. It uses the outer query to filter or transform data by referencing a column from it, and the outer query uses the results of the inner query. In contrast, a non-correlated sub-query is independent of the outer query and can be executed on its own. It does not reference any columns from the outer query and is used to retrieve additional data needed by the outer query.
Here is an example of a correlated sub-query:
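A sketch of a correlated sub-query (the table name sales and the columns customer_id, total_sales, and customer_group are assumptions):

```sql
SELECT customer_id, total_sales
FROM sales s1
WHERE total_sales > (
  -- Re-evaluated for each outer row: references s1.customer_group
  SELECT AVG(total_sales)
  FROM sales s2
  WHERE s2.customer_group = s1.customer_group
);
```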
This query selects each customer and their total sales from the sales table, keeping only customers whose total sales exceed the average for their own customer group. The sub-query is correlated with the outer query because it references a column (the customer group) from the outer query's current row.
Here is an example of a non-correlated sub-query:
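A sketch of a non-correlated sub-query, using the same assumed table:

```sql
SELECT customer_id, total_sales
FROM sales
WHERE total_sales > (
  -- Runs once; makes no reference to the outer query
  SELECT AVG(total_sales) FROM sales
);
```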
The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.
Performance-wise, correlated sub-queries are generally slower to execute, as they have to be re-evaluated for each row of the outer query, while non-correlated sub-queries are faster, as they only have to be executed once.
The key to acing a Yum Brands SQL interview is to practice, practice, and then practice some more! Besides solving the above Yum Brands SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Amazon, Microsoft, Meta, and smaller tech companies.
Each interview question has multiple hints and a detailed solution, and crucially, there's an interactive coding environment so you can run your query right in the browser and have it checked.
To prep for the Yum Brands SQL interview it is also a great idea to practice SQL problems from other hospitality and restaurant companies like:
If your SQL skills are weak, don't jump straight into solving questions – first go learn SQL with this free SQL for Data Analytics course.
This tutorial covers topics including math functions like CEIL()/FLOOR() and creating summary stats with GROUP BY – both of which come up often during Yum Brands SQL interviews.
In addition to SQL query questions, the other types of problems tested in the Yum Brands Data Science Interview include:
To prepare for the Yum Brands Data Science interview have a strong understanding of the company's culture and values – this will be key to acing the behavioral interview. For the technical Data Science interviews, get ready by reading Ace the Data Science Interview. The book's got: