McKesson employees rely on SQL to extract and analyze patient healthcare data, which helps them understand trends and improve patient outcomes. They also use SQL to keep track of pharmaceutical inventory across various locations, ensuring that the right medications are available when needed, this is why McKesson includes SQL problems in their interviews for Data Analytics, Data Science, and Data Engineering positions.
So, to help you practice for the McKesson SQL interview, here’s 8 McKesson SQL interview questions in this article.
McKesson is a multi-national company in the healthcare sector, which trades in pharmaceuticals and medical products along with providing related services. They require regular analysis of their sales data for decision making. For this exercise, we are going to analyze the average monthly sales per product using a window function.
Let's say the data they have is in the following table, , which logs each purchase event:
sale_id | date_of_sale | product_id | units_sold | sale_price_per_unit |
---|---|---|---|---|
1001 | 01/05/2022 | A123 | 20 | 10 |
1002 | 01/05/2022 | B456 | 30 | 20 |
1003 | 01/18/2022 | A123 | 15 | 10 |
1004 | 02/05/2022 | B456 | 25 | 20 |
1005 | 02/18/2022 | A123 | 25 | 10 |
1006 | 03/05/2022 | B456 | 40 | 20 |
Our task is to write a SQL query that calculates the average monthly sales per product.
The PostgreSQL query to solve the above question would be:
This query uses the window function to calculate the average monthly sales for each product. The clause is used to divide the sales data into partitions based on the product_id and the month of the sale. The function is used to truncate the date to the nearest month, which allows us to calculate a separate average for each month. The results are then ordered by the month and product_id for easy visualization.
To solve a similar window function interview problem which uses RANK() on DataLemur's free interactive SQL code editor, try this Amazon SQL Interview Question:
Explore McKesson's technology solutions for specialty practices to learn how they are enhancing healthcare delivery through innovative solutions! Understanding McKesson's initiatives can provide valuable insights into the intersection of technology and healthcare.
Given a table of McKesson employee salary data, write a SQL query to find the top 3 highest paid employees in each department.
employee_id | name | salary | department_id |
---|---|---|---|
1 | Emma Thompson | 3800 | 1 |
2 | Daniel Rodriguez | 2230 | 1 |
3 | Olivia Smith | 2000 | 1 |
4 | Noah Johnson | 6800 | 2 |
5 | Sophia Martinez | 1750 | 1 |
8 | William Davis | 6800 | 2 |
10 | James Anderson | 4000 | 1 |
department_id | department_name |
---|---|
1 | Data Analytics |
2 | Data Science |
department_name | name | salary |
---|---|---|
Data Analytics | James Anderson | 4000 |
Data Analytics | Emma Thompson | 3800 |
Data Analytics | Daniel Rodriguez | 2230 |
Data Science | Noah Johnson | 6800 |
Data Science | William Davis | 6800 |
Code your solution to this interview question directly within the browser on DataLemur:
We use the DENSE_RANK() window function to generate unique ranks for each employee's salary within their department, with higher salaries receiving lower ranks. Then, we wrap this up in a CTE and filter the employees with a ranking of 3 or lower.
If the solution above is confusing, you can find a detailed solution with hints here: Top 3 Department Salaries.
Both window functions are used to find rows at a given offset from the current row. However, will give you the rows AFTER the current row you. On the other hand, will give you the rows BEFORE the current row.
Suppose you are a DBA (database administrator) for McKesson, a pharmaceutical distribution company. You are tasked with designing a database to track product shipments. Your system should be able to handle the storage of several entities: products, distribution centers, shipments, and customers. Consider the following requirements:
Design an efficient SQL database schema that can handle these requirements. Also, create a query that can retrieve the total quantity shipped of a specific product in a specific month.
product_id | name | description |
---|---|---|
1 | Drug A | Product description |
2 | Drug B | Product description |
center_id | location |
---|---|
1 | Chicago |
2 | Los Angeles |
shipment_id | product_id | center_id | customer_id | date_shipped | quantity |
---|---|---|---|---|---|
1 | 1 | 1 | 1 | 06/01/2022 | 500 |
2 | 1 | 1 | 2 | 06/10/2022 | 200 |
3 | 2 | 2 | 1 | 06/15/2022 | 300 |
4 | 1 | 2 | 1 | 07/01/2022 | 700 |
customer_id | name | address |
---|---|---|
1 | John Doe | 123 Elm St |
2 | Jane Doe | 456 Oak St |
The SQL query to retrieve the total quantity shipped of a specific product (here, product_id = 1) in a specific month (here, June 2022) would be as follows:
This query uses the aggregate function to sum up the total quantity of the specified product_id shipped within the specified month and year. The function is used to get the month and year components from the shipment date for the condition in the statement.
While knowing this answer is beyond the scope of most Data Analyst & Data Science interviews, Data Engineers at McKesson should be at least aware of SQL vs. NoSQL databases.
Relational databases and non-relational (NoSQL) databases have some key differences, particularly in terms of how data is stored. Whereas relational databases have tables, made up of rows and columns, NoSQL databases use various data models like:
This added flexibility makes NoSQL databases well-suited for handling non-tabular data or data with a constantly changing format. However, this flexibility comes at the cost of ACID compliance, which is a set of properties (atomic, consistent, isolated, and durable) that ensure the reliability and integrity of data in a database. While most relational databases are ACID-compliant, NoSQL databases may not provide the same level of guarantees.
McKesson is a healthcare company that offers a number of products and services. To maximize their outreach, they have run several online advertising campaigns. Each campaign showed a specific ad, and when the ad was clicked, it redirected the user to McKesson's website where they could view the product and possibly add it to their cart.
As a data analyst, you have been asked to compute the click-through rate, which is the ratio of users who clicked a specific ad to the total users who viewed the ad. You've been also tasked to compute the conversion rate for each product, which is defined as the ratio of the users who added a product to their cart to the total users who clicked the associated ad.
Here are some example datasets:
ad_id | product_id |
---|---|
1 | 101 |
2 | 102 |
3 | 103 |
user_id | ad_id | view_date |
---|---|---|
123 | 1 | 2022-07-07 |
124 | 2 | 2022-07-07 |
125 | 3 | 2022-07-07 |
126 | 1 | 2022-07-07 |
127 | 2 | 2022-07-07 |
user_id | ad_id | click_date |
---|---|---|
123 | 1 | 2022-07-08 |
124 | 2 | 2022-07-08 |
125 | 3 | 2022-07-08 |
126 | 1 | 2022-07-08 |
user_id | product_id | add_to_cart_date |
---|---|---|
123 | 101 | 2022-07-08 |
124 | 102 | 2022-07-08 |
125 | 103 | 2022-07-08 |
Here is a PostgreSQL solution to calculate the click-through rate and the conversion rate for each product:
This query first computes the click-through and conversion rates independently of each other (using the WITH clauses creating and ), and then merges these two results using a LEFT JOIN. The conversion and click-through rates are computed by counting the unique users that performed the positive event (clicking an ad or adding a product to cart) and dividing this by the count of unique users that had the opportunity to perform that action (view an ad or click an ad).
To practice a similar problem on DataLemur's free online SQL coding environment, solve this SQL interview question asked by Facebook:
A foreign key is a field in a database table that links to the primary key of another table, establishing a connection between the two tables. For example, let's analyze McKesson's Google Ads campaigns data:
ad_id | campaign_id | keyword | click_count |
---|---|---|---|
1 | 201 | McKesson reviews | 120 |
2 | 202 | McKesson pricing | 150 |
3 | 101 | buy McKesson | 65 |
4 | 101 | McKesson alternatives | 135 |
is a foreign key that connects to the of the corresponding Google Ads campaign. This establishes a relationship between the ads and their campaigns, enabling easy querying to find which ads belong to a specific campaign or which campaigns a specific ad belongs to.
The table may also have multiple foreign keys that reference primary keys in other tables. For example, and foreign keys could be used to link each ad to its ad group and the Google Ads account that the campaigns belong to, respectively.
Suppose you are working as a Data Analyst in McKesson, a pharmaceutical and medical supplies company. You have been tasked with finding a list of clients in the company's database whose first names start with 'John'. This list should include the clients' full names, addresses, and emails.
For the purpose of this task, assume that the client information is stored in a single table, .
customerId | firstName | lastName | address | |
---|---|---|---|---|
001 | John | Doe | 1234 Broadway St. | john.doe@example.com |
002 | Johnny | Appleseed | 5678 Main St. | johnny.appleseed@example.com |
003 | Jane | Doe | 9012 Park Ave. | jane.doe@example.com |
004 | Jonathan | Smith | 3456 Oak St. | jonathan.smith@example.com |
005 | Jack | Brown | 7890 Elm St. | jack.brown@example.com |
In PostgreSQL, you would write the following query to accomplish this task:
This query uses the keyword to filter the rows where the column starts with 'John'. The '%' character is a wildcard that matches any sequence of characters, so 'John%' will match any string that starts with 'John'. The filtered result will consist of all the columns specified within the statement for the matching records.
firstName | lastName | address | |
---|---|---|---|
John | Doe | 1234 Broadway St. | john.doe@example.com |
Johnny | Appleseed | 5678 Main St. | johnny.appleseed@example.com |
Jonathan | Smith | 3456 Oak St. | jonathan.smith@example.com |
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. In addition to solving the earlier McKesson SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like FAANG tech companies and tech startups.
Each interview question has hints to guide you, step-by-step solutions and most importantly, there's an interactive SQL code editor so you can easily right in the browser your SQL query and have it checked.
To prep for the McKesson SQL interview you can also be useful to practice interview questions from other healthcare and pharmaceutical companies like:
In case your SQL coding skills are weak, forget about jumping right into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.
This tutorial covers SQL topics like filtering data with WHERE and GROUP BY – both of these show up often in McKesson SQL assessments.
In addition to SQL query questions, the other topics to prepare for the McKesson Data Science Interview are:
I think the best way to prep for McKesson Data Science interviews is to read the book I wrote: Ace the Data Science Interview.
It solves 201 interview questions sourced from Facebook, Google & startups. The book's also got a refresher covering Stats, ML, & Data Case Studies. And finally it's helped a TON of people, which is how it's gotten to be an Amazon best-seller with over 1000+ 5-star reviews.
While the book is more technical, it's also key to prepare for the McKesson behavioral interview. A good place to start is by reading the company's culture and values.