logo

11 McKinsey SQL Interview Questions (Updated 2024)

Updated on

August 11, 2024

At McKinsey, SQL is used to extract and manipulate client data on sales performance and market trends for analytics projects, as well as to build data models that inform business strategy decisions by identifying areas for improvement and growth opportunities. This is why McKinsey includes SQL questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.

As such, to help you practice for the McKinsey SQL interview, we'll cover 11 McKinsey SQL interview questions in this article.

McKinsey SQL Interview Questions

11 McKinsey SQL Interview Questions

SQL Question 1: 3-Topping Pizzas

You’re a consultant for a major pizza chain that will be running a promotion where all 3-topping pizzas will be sold for a fixed price, and are trying to understand the costs involved.

Given a list of pizza toppings, consider all the possible 3-topping pizzas, and print out the total cost of those 3 toppings. Sort the results with the highest total cost on the top followed by pizza toppings in ascending order.

Break ties by listing the ingredients in alphabetical order, starting from the first ingredient, followed by the second and third.

Table:

Column NameType
topping_namevarchar(255)
ingredient_costdecimal(10,2)

Example Input:

topping_nameingredient_cost
Pepperoni0.50
Sausage0.70
Chicken0.55
Extra Cheese0.40

Example Output:

pizzatotal_cost
Chicken,Pepperoni,Sausage1.75
Chicken,Extra Cheese,Sausage1.65
Extra Cheese,Pepperoni,Sausage1.60
Chicken,Extra Cheese,Pepperoni1.45

Answer:


To see not just one but TWO additional solutions try out this McKinsey SQL Interview Question on DataLemur!

McKinsey SQL Interview Question

SQL Question 2: Second Highest Salary

Given a table of McKinsey employee salary information, write a SQL query to find the 2nd highest salary at the company.

McKinsey Example Input:

employee_idsalary
12500
2800
31000
41200

Example Output:

second_highest_salary
1200

Code your solution to this question directly within the browser on DataLemur:

2nd Highest Salary SQL Interview Question

Answer:


You can find a step-by-step solution here: 2nd Highest Salary

SQL Question 3: Can you describe the difference between a correlated and a non-correlated sub-query?

A correlated sub-query is a sub-query that depends on the outer query and cannot be executed separately. It uses the outer query to filter or transform data by referencing a column from it, and the outer query uses the results of the inner query. In contrast, a non-correlated sub-query is independent of the outer query and can be executed on its own. It does not reference any columns from the outer query and is used to retrieve additional data needed by the outer query.

Here is an example of a correlated sub-query:


This query selects the and total sales of all McKinsey customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().

Here is an example of a non-correlated sub-query:


The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.

Performance-wise, correlated sub-queries are generally slower to execute, as they have to be re-evaluated for each row of the outer query, while non-correlated sub-queries are faster, as they only have to be executed once.

McKinsey SQL Interview Questions

SQL Question 4: Average Rating Product Analysis

As an analyst at McKinsey, you are asked to analyze McKinsey's hypothetical online platform where different products are reviewed by users. Your task is to write a SQL query using window functions that returns the average user rating of each product for each month.

Sample Input:
review_iduser_idsubmit_dateproduct_idstars
61711232022-06-08500014
78022652022-06-10698524
52933622022-06-18500013
63521922022-07-26698523
45179812022-07-05698522

Answer:


This query first groups the data by product and month with . Then, it uses the function to calculate the average number of stars for each product each month. The function truncates the submission date to the month (making all days and lower units same within each month) making it easier to group by month. The clause orders the result by month and product for better readability.

Example Output:
mthproductavg_stars
2022-06-01 00:00:00500013.50
2022-06-01 00:00:00698524.00
2022-07-01 00:00:00698522.50

p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur

SQL Interview Questions on DataLemur

SQL Question 5: How do you identify duplicated data in a table?

One creative way is to use the window function ROW_NUMBER() and partition by whichver columns you are loooking for duplicates in. If any row has a row_number() more than 1, then it's a duplicate!


You could also use COUNT(DISTINCT col1) vs. COUNT(*) and see if they return different numbers, indicating the prescence of duplicates!

SQL Question 6: Analyze Consulting Projects

McKinsey, being a global management consulting firm, conducts multiple consulting projects in various industries for many clients worldwide. The firm wishes to analyze its project data to guide decision-making processes. It aims to answer questions such as - What is the total revenue earned from different industries? Which clients have been with us for the longest duration and what is their total spending?

Assume that the McKinsey database has two tables:

  • (project_id, consultant_id, client_id, startDate, endDate, revenue)
  • (client_id, name, industry)

Here is the sample data for our problem:

Example Input:
project_idconsultant_idclient_idstartDateendDaterevenue
10011356512019-01-152019-04-1550000
10022461932020-05-012020-08-3040000
10031356512020-09-152021-01-1555000
10042461932021-02-012021-07-0142000
10051359102021-07-152022-01-1548000
Example Input:
client_idnameindustry
651Company AFinance
193Company BTechnology
910Company CHealthcare

Write a SQL query to determine the total revenue from each industry.

Answer:

The PostgreSQL query to solve this problem would be:


This query first joins the and tables on . Then it groups the result by and computes the sum of the for each industry. The total revenue from each industry is returned by this query.

SQL Question 7: In the context of a database transaction, what does ACID mean?

A transaction is a one or more SQL commands which are executed as a singular unit if the transaction -commits- (or no execution happens if the transaction -aborts-).

For transactions, a DBMS is supposed to enforce the following ACID properties: Atomicity, Consistency, Isolation, & Durability.

Here's what each one means:

  • Atomicity: the transaction is completed in an all-or-nothing way (no partial commits)
  • Consistency: the transaction is valid and follows all constraints and restrictions
  • Isolation: the transaction doesn't affect another transaction
  • Durability: the committed transactions is stored permanently in the DB (it doesn't disappear!)

As you can see, it's pretty important for the multiple databases where McKinsey store's it's data to be ACID-compliant!

SQL Question 8: Filter Customers Based on Revenue and Location

To optimize the targeting of marketing campaigns, you're interested in the customers who have spent over $10000 in total and are located either in the 'New York' or 'Los Angeles' area. Write a SQL query to find these customers from the and tables.

The table has these columns: , , , , and . The table has these columns: , , , , , and .

Example Input:

customer_idcustomer_namecitystatezipcode
1John DoeNew YorkNY10001
2Jane SmithLos AngelesCA90001

Example Input:

order_idcustomer_idproduct_idquantityunit_priceorder_date
110015200006/08/2022 00:00:00
210022300006/10/2022 00:00:00
320011500006/18/2022 00:00:00
420023150007/26/2022 00:00:00

Answer:


This SQL query first joins the and tables on the field. Then it filters the results to include only those customers located in 'New York' or 'Los Angeles'. Finally, it aggregates the total amount spent by each customer and only includes those who have spent over $10000.

SQL Question 9: Average Consultation Ratings by Sector

As an analytics data analyst at McKinsey, we are given a database with consultation details where each consultation has a rating (on a scale of 1 to 5) by the client. Each consultancy project is associated with a specific sector of service offered by McKinsey like 'Finance', 'Healthcare', 'Government', 'Retail' etc. Write a SQL query to find the average rating of the consultancies by sector.

Example Input:
consultation_idproject_idconsultant_idclient_idend_daterating
1001150012001300106/06/20225
1002150022002300206/06/20224
1003150012001300306/08/20224
1004150032003300406/10/20225
1005150022002300507/08/20223
Example Input:
project_idsector
15001Finance
15002Healthcare
15003Government
Example Output:
sectoraverage_rating
Finance4.50
Healthcare3.50
Government5.00

Answer:


This query first joins the and tables on . It then groups the rows by and calculates the average for each sector.

SQL Question 10: What's the purpose of the function in SQL?

The function is used to remove NULLs, and replace them with another value.

For example, say you were a Data Analyst at McKinsey and were doing a customer analytics project to see who the most marketing-engaged customers were, and had access to the below table.

mckinsey_customers:

customer_idemail_engagementsms_engagement
101very_activenot_opted_in
201un-subscribedNULL
301NULLnot_opted_in
401not_activevery_active
501very_activemildly_active
303NULLNULL

Before you could procede, you had to remove the NULLs, and replace them with the default value (not_active for email, and not_opted_in for SMS). To do this, you'd run the following query:


This would get you the following output:

customer_idemail_engagementsms_engagement
101very_activenot_opted_in
201un-subscribednot_opted_in
301not_activenot_opted_in
401not_activevery_active
501very_activemildly_active
303not_activenot_opted_in

SQL Question 11: Finding McKinsey Clients in a Specific Industry

As a business analyst at McKinsey, you are often required to filter down the client records database to find records that match a specific pattern or string. For instance, you might be tasked with finding all clients who are from the 'Finance' industry. Write an SQL query which will extract all client records where the column contains the string 'Finance'.

Example Input:
client_idclient_nameindustryregionaccount_manager
3721Goldman SachsFinanceNorth AmericaAlex
7865AmazonE-commerceNorth AmericaMaria
6373Chase BankFinance and BankingNorth AmericaAlex
5412Alibaba GroupE-commerceAsiaSophie
7853CitigroupFinanceNorth AmericaAlex
Example Output:
client_idclient_nameindustryregionaccount_manager
3721Goldman SachsFinanceNorth AmericaAlex
6373Chase BankFinance and BankingNorth AmericaAlex
7853CitigroupFinanceNorth AmericaAlex

Answer:


This SQL query uses the LIKE keyword in combination with the wildcard character (%) to filter for all client records where the column contains the string 'Finance'. As the % symbol is used both before and after 'Finance', this will ensure that the query returns any values that contain 'Finance', irrespective of what comes before or after it.

How To Prepare for the McKinsey SQL Interview

The key to acing a McKinsey SQL interview is to practice, practice, and then practice some more! Beyond just solving the above McKinsey SQL interview questions, you should also solve the 200+ SQL Interview Questions on DataLemur which come from companies like Google, Uber, and Microsoft.

DataLemur SQL Interview Questions

Each interview question has multiple hints, detailed solutions and best of all, there is an interactive coding environment so you can right in the browser run your query and have it graded.

Learn how McKinsey's Digital & Analytics is helping governments innovate, transform, and thrive in a rapidly changing world!

To prep for the McKinsey SQL interview you can also be wise to solve interview questions from other management consulting companies like:

However, if your SQL skills are weak, forget about going right into solving questions – improve your SQL foundations with this SQL interview tutorial.

DataLemur SQL tutorial

This tutorial covers things like math functions like ROUND()/CEIL() and Subqueries – both of these show up frequently in SQL job interviews at McKinsey.

McKinsey Data Science Interview Tips

What Do McKinsey Data Science Interviews Cover?

In addition to SQL query questions, the other types of problems to practice for the McKinsey Data Science Interview are:

McKinsey Data Scientist

How To Prepare for McKinsey Data Science Interviews?

I believe the optimal way to study for McKinsey Data Science interviews is to read the book I wrote: Ace the Data Science Interview.

It has 201 data interview questions sourced from companies like Google, Tesla, & Goldman Sachs. The book's also got a crash course on Product Analytics, SQL & ML. And finally it's helped thousands of people land their dream job in data, which is why it's earned 1000+ reviews on Amazon and a 4.5-star rating.

Nick Singh author of the book Ace the Data Science Interview

While the book is more technical in nature, it's also important to prepare for the McKinsey behavioral interview. Start by reading the company's values and mission.