Looking to ace your next SQL interview with confidence? Test yourself with these 20 advanced SQL interview questions, each accompanied by detailed answers to help you impress recruiters and land that dream job.
These questions cover everything from complex queries to data manipulation and optimization techniques, and equip you with the knowledge you need to excel in any SQL interview scenario.
How many questions can you solve?
Interview questions that test practical and conceptual SQL skills are to test the candidate's ability to translate knowledge into effective solutions. The following questions are 20 SQL interview questions pulled from REAL interviews, that helped managers assess candidates' SQL skills.
While we have included sample answers for each question, remember that these are not the only correct answers. There are often multiple ways to achieve the same goal. Focus on writing clean and functional code.
Stripe asked this tricky SQL interview question, about identifying any payments made at the same merchant with the same credit card for the same amount within 10 minutes of each other and reporting the count of such repeated payments.
Example Input:
transaction_id | merchant_id | credit_card_id | amount | transaction_timestamp |
---|---|---|---|---|
1 | 101 | 1 | 100 | 09/25/2022 12:00:00 |
2 | 101 | 1 | 100 | 09/25/2022 12:08:00 |
3 | 101 | 1 | 100 | 09/25/2022 12:28:00 |
4 | 102 | 2 | 300 | 09/25/2022 12:00:00 |
6 | 102 | 2 | 400 | 09/25/2022 14:00:00 |
Example Output:
payment_count |
---|
1 |
You can solve this Stripe SQL problem interactively and get a full solution explanation:
Recursive queries in SQL enable iteration over hierarchical data structures like organizational charts, bill-of-materials, or nested comments. For example, you could use a recursive query to traverse a hierarchical data structure and retrieve all descendants of a specific node.
Google’s Marketing Team needed to add a simple statistic to their upcoming Superbowl Ad: the median number of searches made per year. You were given a summary table that tells you the number of searches made last year, write a query to report the median searches made per user.
Table:
Column Name | Type |
---|---|
searches | integer |
num_users | integer |
Example Input:
searches | num_users |
---|---|
1 | 2 |
2 | 2 |
3 | 3 |
4 | 1 |
Example Output:
median |
---|
2.5 |
You can solve this Google SQL problem interactively, and get the full solution explanation.
Enjoyed this one? p.s. here's more Google SQL Interview Questions to practice!
ACID properties are:
Say you have access to all the transactions for a given merchant account. Write a query to print the cumulative balance of the merchant account at the end of each day, with the total balance reset back to zero at the end of the month. Output the transaction date and cumulative balance.
Table:
Column Name | Type |
---|---|
transaction_id | integer |
type | string ('deposit', 'withdrawal') |
amount | decimal |
transaction_date | timestamp |
Example Input:
transaction_id | type | amount | transaction_date |
---|---|---|---|
19153 | deposit | 65.90 | 07/10/2022 10:00:00 |
53151 | deposit | 178.55 | 07/08/2022 10:00:00 |
29776 | withdrawal | 25.90 | 07/08/2022 10:00:00 |
16461 | withdrawal | 45.99 | 07/08/2022 10:00:00 |
77134 | deposit | 32.60 | 07/10/2022 10:00:00 |
Example Output:
transaction_date | balance |
---|---|
07/08/2022 12:00:00 | 106.66 |
07/10/2022 12:00:00 | 205.16 |
You can solve this Visa SQL problem interactively, and get the full solution explanation.
Correlated subqueries are executed once for each row of the outer query and depend on the outer query for results. For example, you could use a correlated subquery to retrieve data from one table according to conditions in a separate table.
Fleets of servers power Amazon Web Services (AWS). Senior management has requested data-driven solutions to optimize server usage.
Write a query that calculates the total time that the fleet of servers was running. The output should be in units of full days.
Assumptions:
Table:
Column Name | Type |
---|---|
server_id | integer |
status_time | timestamp |
session_status | string |
Example Input:
server_id | status_time | session_status |
---|---|---|
1 | 08/02/2022 10:00:00 | start |
1 | 08/04/2022 10:00:00 | stop |
2 | 08/17/2022 10:00:00 | start |
2 | 08/24/2022 10:00:00 | stop |
Example Output:
total_uptime_days |
---|
21 |
You can solve this Amazon SQL problem interactively, and get the full solution explanation.
Looking for more Amazon SQL Interview Questions? Check out our Amazon SQL Interview Guide for access to even more questions and solutions!
Indexing in a relational database involves creating data structures (indexes) that allow for faster data retrieval by providing direct access to rows based on the indexed columns. The benefits of indexing include improved query performance and faster data retrieval, especially for large datasets. However, indexing can also introduce overhead during data modification operations (such as INSERT, UPDATE, and DELETE) and increased storage requirements.
As a Data Analyst on the People Operations team at Accenture, you are tasked with understanding how many consultants are staffed to each client, and how many consultants are exclusively staffed to a single client.
Write a query that displays the outputs of client name and the number of uniquely and exclusively staffed consultants ordered by client name.
Table:
Column Name | Type |
---|---|
employee_id | integer |
engagement_id | integer |
Example Input:
employee_id | engagement_id |
---|---|
1001 | 1 |
1001 | 2 |
1002 | 1 |
1003 | 3 |
1004 | 4 |
Table:
Column Name | Type |
---|---|
engagement_id | integer |
project_name | string |
client_name | string |
Example Input:
engagement_id | project_name | client_name |
---|---|---|
1 | SAP Logistics Modernization | Department of Defense |
2 | Oracle Cloud Migration | Department of Education |
3 | Trust & Safety Operations | |
4 | SAP IoT Cloud Integration |
Example Output:
client_name | total_staffed | exclusive_staffed |
---|---|---|
Department of Defense | 2 | 1 |
Department of Education | 1 | 0 |
2 | 2 |
You can solve this Accenture SQL problem interactively, and get the full solution explanation.
Stored procedures and user-defined functions (UDFs) are both reusable pieces of code in SQL, but they have some key differences. Stored procedures can perform DML operations and transaction management, while UDFs are limited to returning values and cannot perform DML operations. Stored procedures are typically used for procedural tasks and complex business logic, while UDFs are often used for calculations and data transformations.
Facebook wants to recommend new friends to people who show interest in attending 2 or more of the same private events.
Sort your results in order of user_a_id and user_b_id (refer to the Example Output below). Table:
Column Name | Type |
---|---|
user_a_id | integer |
user_b_id | integer |
status | enum ('friends', 'not_friends') |
Each row of this table indicates the status of the friendship between user_a_id and user_b_id.
Example Input:
user_a_id | user_b_id | status |
---|---|---|
111 | 333 | not_friends |
222 | 333 | not_friends |
333 | 222 | not_friends |
222 | 111 | friends |
111 | 222 | friends |
333 | 111 | not_friends |
Table:
Column Name | Type |
---|---|
user_id | integer |
event_id | integer |
event_type | enum ('public', 'private') |
attendance_status | enum ('going', 'maybe', 'not_going') |
event_date | date |
Example Input:
user_id | event_id | event_type | attendance_status | event_date |
---|---|---|---|---|
111 | 567 | public | going | 07/12/2022 |
222 | 789 | private | going | 07/15/2022 |
333 | 789 | private | maybe | 07/15/2022 |
111 | 234 | private | not_going | 07/18/2022 |
222 | 234 | private | going | 07/18/2022 |
333 | 234 | private | going | 07/18/2022 |
Example Output:
user_a_id | user_b_id |
---|---|
222 | 333 |
333 | 222 |
Users 222 and 333 who are not friends have shown interest in attending 2 or more of the same private events.
You can solve this Facebook SQL problem interactively, and get the full solution explanation.
Looking for more Facebook SQL Interview Questions? Check out our Facebook SQL Interview Guide for access to even more questions and solutions!
The MERGE statement performs INSERT, UPDATE, or DELETE operations on a target table based on the results of a join with a source table.
Example:
You’re a consultant for a major pizza chain that will be running a promotion where all 3-topping pizzas will be sold for a fixed price, and are trying to understand the costs involved.
Given a list of pizza toppings, consider all the possible 3-topping pizzas, and print out the total cost of those 3 toppings. Sort the results with the highest total cost on the top followed by pizza toppings in ascending order.
Break ties by listing the ingredients in alphabetical order, starting from the first ingredient, followed by the second and third.
** Table:**
Column Name | Type |
---|---|
topping_name | varchar(255) |
ingredient_cost | decimal(10,2) |
Example Input:
topping_name | ingredient_cost |
---|---|
Pepperoni | 0.50 |
Sausage | 0.70 |
Chicken | 0.55 |
Extra Cheese | 0.40 |
Example Output:
pizza | total_cost |
---|---|
Chicken,Pepperoni,Sausage | 1.75 |
Chicken,Extra Cheese,Sausage | 1.65 |
Extra Cheese,Pepperoni,Sausage | 1.60 |
Chicken,Extra Cheese,Pepperoni | 1.45 |
You can solve this McKinsey SQL problem interactively AND see alternate solutions, and get the full solution explanation.
A clustered index determines the physical order of data in a table, while a non-clustered index does not. A table can have only one clustered index but multiple non-clustered indexes.
The Apple retention team needs your help to investigate buying patterns. Write a query to determine the percentage of buyers who bought AirPods directly after they bought iPhones. Round your answer to a percentage (i.e. 20 for 20%, 50 for 50) with no decimals.
Table:
Column Name | Type |
---|---|
transaction_id | integer |
customer_id | integer |
product_name | varchar |
transaction_timestamp | datetime |
Example Input:
transaction_id | customer_id | product_name | transaction_timestamp |
---|---|---|---|
1 | 101 | iPhone | 08/08/2022 00:00:00 |
2 | 101 | AirPods | 08/08/2022 00:00:00 |
5 | 301 | iPhone | 09/05/2022 00:00:00 |
6 | 301 | iPad | 09/06/2022 00:00:00 |
7 | 301 | AirPods | 09/07/2022 00:00:00 |
Example Output:
follow_up_percentage |
---|
50 |
You can solve this Apple SQL problem interactively, and get the full solution explanation.
Looking for more Apple SQL Interview Questions? Check out our Apple SQL Interview Guide for access to even more questions and solutions!
The LAG and LEAD functions allow accessing data from previous or subsequent rows in a result set.
Example:
As a Data Analyst on Snowflake's Marketing Analytics team, your objective is to analyze customer relationship management (CRM) data and identify contacts that satisfy two conditions:
Marketing touches, also known as touch points, represent the interactions or points of contact between a brand and its customers.
Your goal is to generate a list of email addresses for these contacts.
Table:
Column Name | Type |
---|---|
event_id | integer |
contact_id | integer |
event_type | string ('webinar', 'conference_registration', 'trial_request') |
event_date | date |
Example Input:
event_id | contact_id | event_type | event_date |
---|---|---|---|
1 | 1 | webinar | 4/17/2022 |
2 | 1 | trial_request | 4/23/2022 |
3 | 1 | whitepaper_download | 4/30/2022 |
4 | 2 | handson_lab | 4/19/2022 |
5 | 2 | trial_request | 4/23/2022 |
6 | 2 | conference_registration | 4/24/2022 |
7 | 3 | whitepaper_download | 4/30/2022 |
8 | 4 | trial_request | 4/30/2022 |
9 | 4 | webinar | 5/14/2022 |
Table:
Column Name | Type |
---|---|
contact_id | integer |
string |
Example Input:
contact_id | |
---|---|
1 | andy.markus@att.net |
2 | rajan.bhatt@capitalone.com |
3 | lissa_rogers@jetblue.com |
4 | kevinliu@square.com |
Example Output:
andy.markus@att.net |
You can solve this Snowflake SQL problem interactively, and get the full solution explanation.
Looking for more Snowflake SQL Interview Questions? Check out these 10 Snowflake SQL Interview Questions for access to even more questions and solutions!
The TRUNCATE statement removes all rows from a table but does not log individual row deletions, making it faster than DELETE, which removes rows one by one and logs each deletion. However, TRUNCATE cannot be rolled back, and it resets identity columns.
The Growth Team at DoorDash wants to ensure that new users, who make orders within their first 14 days on the platform, have a positive experience. However, they have noticed several issues with deliveries that result in a bad experience.
These issues include:
Write a query that calculates the bad experience rate for new users who signed up in June 2022 during their first 14 days on the platform. The output should include the percentage of bad experiences, rounded to 2 decimal places. Table:
Column Name | Type |
---|---|
order_id | integer |
customer_id | integer |
trip_id | integer |
status | string ('completed successfully', 'completed incorrectly', 'never received') |
order_timestamp | timestamp |
Example Input:
order_id | customer_id | trip_id | status | order_timestamp |
---|---|---|---|---|
727424 | 8472 | 100463 | completed successfully | 06/05/2022 09:12:00 |
242513 | 2341 | 100482 | completed incorrectly | 06/05/2022 14:40:00 |
141367 | 1314 | 100362 | completed incorrectly | 06/07/2022 15:03:00 |
582193 | 5421 | 100657 | never_received | 07/07/2022 15:22:00 |
253613 | 1314 | 100213 | completed successfully | 06/12/2022 13:43:00 |
Table:
Column Name | Type |
---|---|
dasher_id | integer |
trip_id | integer |
estimated_delivery_timestamp | timestamp |
actual_delivery_timestamp | timestamp |
Example Input:
dasher_id | trip_id | estimated_delivery_timestamp | actual_delivery_timestamp |
---|---|---|---|
101 | 100463 | 06/05/2022 09:42:00 | 06/05/2022 09:38:00 |
102 | 100482 | 06/05/2022 15:10:00 | 06/05/2022 15:46:00 |
101 | 100362 | 06/07/2022 15:33:00 | 06/07/2022 16:45:00 |
102 | 100657 | 07/07/2022 15:52:00 | - |
103 | 100213 | 06/12/2022 14:13:00 | 06/12/2022 14:10:00 |
Table:
Column Name | Type |
---|---|
customer_id | integer |
signup_timestamp | timestamp |
Example Input:
customer_id | signup_timestamp |
---|---|
8472 | 05/30/2022 00:00:00 |
2341 | 06/01/2022 00:00:00 |
1314 | 06/03/2022 00:00:00 |
1435 | 06/05/2022 00:00:00 |
5421 | 06/07/2022 00:00:00 |
Example Output:
bad_experience_pct |
---|
75.00 |
You can solve this DoorDash SQL problem interactively, and get the full solution explanation.
Looking for more DoorDash SQL Interview Questions? Check out these 8 DoorDash SQL Interview Questions for access to even more questions and solutions!
A view is a virtual table based on the result of a SELECT query, while a materialized view is a physical copy of the result set of a query that is stored and updated periodically.
The best way to prepare for the SQL interview is to practice, practice, practice. Besides solving the earlier SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like FAANG tech companies and tech startups.
Each exercise has hints to guide you, step-by-step solutions and best of all, there is an interactive coding environment so you can right online code up your SQL query and have it checked. However, if your SQL skills are weak, forget about going right into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.
To prepare for the Data Science interviews read the book Ace the Data Science Interview because it's got:
A great way to review is by checking out sql cheat sheeets! Look at this free SQL Interview Cheat Sheet and see what functions you need to nail down before the interview.