6 REAL Amazon SQL Interview Questions (Updated 2023)

The Amazon interview process is notoriously challenging, featuring multiple SQL rounds for almost all Data Analyst, Data Science, Data Engineering, and Business Intelligence roles. And while the work-life balance at Amazon isn't to die for, and the Amazon perks lack severely compared to other FAANG companies because Bezos is a notorious cheapskate, here's the silver lining: the compensation IS good at Amazon.

Real good, once you realize how much of your salary comes in Amazon stock, which has appreciated greatly (and will continue to appreciate until Amazon is the only store in the universe).

That's why it's worth your time to practice these 6 real Amazon SQL interview questions so you can ace the SQL interview and land your dream job at Amazon, where you'll crush local small businesses one data-driven insight at a time!

Amazon SQL Interview Guide

6 Real Amazon SQL Interview Questions

1. Write a SQL query to get the average review ratings for every product every month.

Given the reviews table, write a query to get the average stars for each product every month.

The output should include the month in numerical value, product id, and average star rating rounded to two decimal places. Sort the output based on month followed by the product id.

Example Input:

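| review_id | user_id | submit_date | product_id | stars |
|---|---|---|---|---|
| 6171 | 123 | 06/08/2022 00:00:00 | 50001 | 4 |
| 7802 | 265 | 06/10/2022 00:00:00 | 69852 | 4 |
| 5293 | 362 | 06/18/2022 00:00:00 | 50001 | 3 |
| 6352 | 192 | 07/26/2022 00:00:00 | 69852 | 3 |
| 4517 | 981 | 07/05/2022 00:00:00 | 69852 | 2 |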

Example Output:

| mth | product | avg_stars |
|---|---|---|
| 6 | 50001 | 3.50 |
| 6 | 69852 | 4.00 |
| 7 | 69852 | 2.50 |

Before peeking at the solution, you can try this real Amazon SQL interview question online in our interactive SQL code editor:

Amazon Data Analyst SQL Interview Question: Average Review Ratings

Amazon Review Ratings Solution:

As we can see, there is no month column in the table. First, we have to extract the month from the submit_date column.

There is a simple function to extract the month from a date. In PostgreSQL, the syntax is EXTRACT(MONTH FROM column_name).

You can look at this page for more explanation on the function.

After extracting the month as a number, take the average of the star ratings and round it to two decimal places. This can be achieved using the functions AVG() and ROUND(). Please refer to [1] & [2] for some reading on the functions.

Solution:
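The original solution query did not survive on this page, so here is a minimal sketch of the approach described above rather than the official answer. To keep it runnable anywhere, it executes the SQL against an in-memory SQLite database from Python. That choice is an assumption of this sketch: SQLite has no EXTRACT, so the month is pulled out with strftime('%m', ...), where PostgreSQL would use EXTRACT(MONTH FROM submit_date). The dates are also rewritten in ISO format so SQLite can parse them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE reviews (
        review_id   INTEGER,
        user_id     INTEGER,
        submit_date TEXT,      -- ISO 'YYYY-MM-DD HH:MM:SS' so SQLite date functions work
        product_id  INTEGER,
        stars       INTEGER
    )
""")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?, ?)",
    [
        (6171, 123, "2022-06-08 00:00:00", 50001, 4),
        (7802, 265, "2022-06-10 00:00:00", 69852, 4),
        (5293, 362, "2022-06-18 00:00:00", 50001, 3),
        (6352, 192, "2022-07-26 00:00:00", 69852, 3),
        (4517, 981, "2022-07-05 00:00:00", 69852, 2),
    ],
)

# SQLite dialect: strftime('%m', ...) stands in for PostgreSQL's EXTRACT(MONTH FROM ...).
avg_rating_query = """
    SELECT
        CAST(strftime('%m', submit_date) AS INTEGER) AS mth,
        product_id                                   AS product,
        ROUND(AVG(stars), 2)                         AS avg_stars
    FROM reviews
    GROUP BY mth, product
    ORDER BY mth, product
"""
avg_rating_rows = conn.execute(avg_rating_query).fetchall()

for row in avg_rating_rows:
    print(row)
```

On the example input this reproduces the three rows of the example output (month 6 for products 50001 and 69852, month 7 for product 69852).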


2. Amazon databases are HUGE. How do you optimize a slow SQL query?

  • SELECT fields instead of using SELECT *
  • Avoid SELECT DISTINCT
  • Create joins with INNER JOIN (not WHERE)
  • Avoid JOINs in general (maybe try de-normalization)
  • Add indexes to your database
  • Examine the SQL query execution plan
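To make the last two tips concrete, here is a small sketch that inspects a query plan before and after adding an index. It uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module purely so it can run anywhere; in PostgreSQL you would reach for EXPLAIN or EXPLAIN ANALYZE instead. The table, the index name, and the generated data are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_quantity INTEGER)"
)
# Hypothetical data: 1,000 orders spread over 100 customers.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 100, 1) for i in range(1000)],
)


def query_plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


lookup = "SELECT order_id, order_quantity FROM orders WHERE customer_id = 42"

plan_before = query_plan(lookup)   # no index yet: SQLite has to scan the whole table
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
plan_after = query_plan(lookup)    # the planner can now search via the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan of orders and the second should mention idx_orders_customer_id. Being able to read that difference is the kind of detail an interviewer tends to probe for.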

Note: the interviewer will likely push you for more detail, or ask for a real example of a time you had to make one of these optimizations yourself. While this question might be out-of-scope for Data Analysts and Data Scientists, Amazon expects people interviewing for Data Engineering and Business Intelligence roles to know how databases work internally, and best practices for database design. If you aren't familiar with these concepts, check out my article on how to prep for database design interviews.

3. What are SQL constraints, and what are example constraints?

Constraints are simply rules for what data goes into your database. Some SQL constraints are:

  • NOT NULL - Stops NULL values from being inserted
  • UNIQUE - Ensures unique values are inserted
  • INDEX - Speeds up querying by optimizing for a specific column that's often used to look up records (strictly speaking an index is not a constraint, but it is often listed alongside them)
  • PRIMARY KEY - Uniquely identifies each record
  • FOREIGN KEY - Ensures referential integrity from one record to another in a different table
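Here is a short sketch of these constraints rejecting bad data. It runs against an in-memory SQLite database from Python, and the customers and orders tables are hypothetical ones invented for this example. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, whereas PostgreSQL enforces them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: FK checks are off by default
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
    );
    INSERT INTO customers VALUES (1, 'a@example.com');
""")


def violates_constraint(sql):
    """Run one INSERT and report whether the database rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


constraint_results = {
    # NULL email breaks NOT NULL
    "NOT NULL": violates_constraint("INSERT INTO customers VALUES (2, NULL)"),
    # duplicate email breaks UNIQUE
    "UNIQUE": violates_constraint("INSERT INTO customers VALUES (3, 'a@example.com')"),
    # reusing customer_id 1 breaks PRIMARY KEY
    "PRIMARY KEY": violates_constraint("INSERT INTO customers VALUES (1, 'b@example.com')"),
    # customer 999 does not exist, so this breaks FOREIGN KEY
    "FOREIGN KEY": violates_constraint("INSERT INTO orders VALUES (10, 999)"),
    # a well-formed row passes every constraint
    "valid row": violates_constraint("INSERT INTO orders VALUES (11, 1)"),
}

for name, rejected in constraint_results.items():
    print(f"{name}: {'rejected' if rejected else 'accepted'}")
```

Each of the four bad inserts is rejected, while the final, valid order is accepted.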

4. Write a SQL query to find the highest-grossing items.

Assume you are given a table containing information on Amazon customers and their spending on products in various categories. Identify the top two highest-grossing products within each category in 2022. Output the category, product, and total spend.

Example Input:

| category | product | user_id | spend | transaction_date |
|---|---|---|---|---|
| appliance | refrigerator | 165 | 246.00 | 12/26/2021 12:00:00 |
| appliance | refrigerator | 123 | 299.99 | 03/02/2022 12:00:00 |
| appliance | washing machine | 123 | 219.80 | 03/02/2022 12:00:00 |
| electronics | vacuum | 178 | 152.00 | 04/05/2022 12:00:00 |
| electronics | wireless headset | 156 | 249.90 | 07/08/2022 12:00:00 |
| electronics | vacuum | 145 | 189.00 | 07/15/2022 12:00:00 |

Example Output:

| category | product | total_spend |
|---|---|---|
| appliance | refrigerator | 299.99 |
| appliance | washing machine | 219.80 |
| electronics | vacuum | 341.00 |
| electronics | wireless headset | 249.90 |

Before reading the solution, why not try the Highest-Grossing Items Amazon SQL Interview Question yourself?

Amazon SQL Interview Question: Highest Grossing Items

Amazon Highest Grossing SQL Query Solution

To find the highest-grossing products, we must find the total spend by category and product. Note that we must filter by transactions in 2022.


| category | product | total_spend |
|---|---|---|
| electronics | wireless headset | 447.90 |
| appliance | refrigerator | 299.99 |
| appliance | washing machine | 439.80 |
| electronics | computer mouse | 45.00 |
| electronics | vacuum | 486.66 |

The output represents the total spend by category (electronics, appliance) and product.

Then, we reuse the query as a CTE or subquery (in this case, we are using a CTE) and use a ranking window function such as RANK() to calculate the ranking by total spend, partitioning by category and ordering by total spend in descending order.


| category | product | total_spend | ranking |
|---|---|---|---|
| appliance | washing machine | 439.80 | 1 |
| appliance | refrigerator | 299.99 | 2 |
| electronics | vacuum | 486.66 | 1 |
| electronics | wireless headset | 447.90 | 2 |
| electronics | computer mouse | 45.00 | 3 |

Finally, we use this result and filter for a rank less than or equal to 2 as the question asks for top two highest-grossing products only.
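The three steps above (aggregate for 2022, rank within each category, keep ranks 1 and 2) can be sketched as a single query. This is a reconstruction under stated assumptions, not the original solution: the table name product_spend is hypothetical, RANK() is one reasonable choice of ranking function, and the SQL runs on an in-memory SQLite database (version 3.25 or later for window functions) loaded with the small example input, so the totals match the Example Output rather than the larger intermediate tables shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_spend (   -- hypothetical table name
        category         TEXT,
        product          TEXT,
        user_id          INTEGER,
        spend            REAL,
        transaction_date TEXT      -- ISO format so string comparison orders dates correctly
    )
""")
conn.executemany(
    "INSERT INTO product_spend VALUES (?, ?, ?, ?, ?)",
    [
        ("appliance", "refrigerator", 165, 246.00, "2021-12-26 12:00:00"),
        ("appliance", "refrigerator", 123, 299.99, "2022-03-02 12:00:00"),
        ("appliance", "washing machine", 123, 219.80, "2022-03-02 12:00:00"),
        ("electronics", "vacuum", 178, 152.00, "2022-04-05 12:00:00"),
        ("electronics", "wireless headset", 156, 249.90, "2022-07-08 12:00:00"),
        ("electronics", "vacuum", 145, 189.00, "2022-07-15 12:00:00"),
    ],
)

top_two_query = """
    WITH product_totals AS (        -- step 1: total spend per category and product in 2022
        SELECT category, product, SUM(spend) AS total_spend
        FROM product_spend
        WHERE transaction_date >= '2022-01-01'
          AND transaction_date <  '2023-01-01'
        GROUP BY category, product
    ),
    ranked AS (                     -- step 2: rank products inside each category
        SELECT category, product, total_spend,
               RANK() OVER (PARTITION BY category ORDER BY total_spend DESC) AS ranking
        FROM product_totals
    )
    SELECT category, product, total_spend   -- step 3: keep only the top two
    FROM ranked
    WHERE ranking <= 2
    ORDER BY category, ranking
"""
top_two_rows = conn.execute(top_two_query).fetchall()

for row in top_two_rows:
    print(row)
```

Note that the 2021 refrigerator purchase is filtered out before aggregation, which is why the refrigerator total is 299.99 and not 545.99.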


5. What's the difference between RANK() and DENSE_RANK()?

Essentially RANK() is to SELECT what DENSE_RANK() is to SELECT DISTINCT.

RANK() gives you the ranking within your ordered partition. Ties have the same rank, with the next ranking(s) skipped. So, if you have 4 items at rank 2, the next rank listed would be ranked 6.

DENSE_RANK() also ranks within your ordered partition, BUT the ranks are consecutive. This means no ranks are skipped if there are ranks with multiple items, and the rank order depends on your ORDER BY clause.
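A quick side-by-side makes the difference easy to see. The sketch below uses a made-up scores table with four items tied in second place, mirroring the "4 items at rank 2" case above, and runs both functions over it in an in-memory SQLite database (3.25 or later). The window function syntax shown is the same in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (item TEXT, score INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 100), ("b", 90), ("c", 90), ("d", 90), ("e", 90), ("f", 80)],
)

rank_query = """
    SELECT item,
           score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,        -- skips ranks after ties
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk   -- never skips ranks
    FROM scores
    ORDER BY score DESC, item
"""
rank_rows = conn.execute(rank_query).fetchall()

for row in rank_rows:
    print(row)
```

Items b through e all get rank 2 under both functions. The last item, f, is ranked 6 by RANK() but 3 by DENSE_RANK().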

6. Amazon Orders SQL Technical Assessment

This question about Amazon orders comes from a real Amazon Data Analyst SQL assessment. It’s a multi-part SQL question, similar to how take-home SQL challenges are structured, and asks increasingly complex questions about the Amazon orders.

Amazon Orders Data

You're given an ORDERS table:

ORDERS

  • order_id (composite primary key)
  • customer_id (integer)
  • order_datetime (timestamp)
  • item_id (composite primary key)
  • order_quantity (integer)

Here’s some sample data from ORDERS:

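| order_id | customer_id | order_datetime | item_id | order_quantity |
|---|---|---|---|---|
| O-001 | 42489 | 2023-06-15 04:35:22 | C004 | 3 |
| O-005 | 11733 | 2023-01-12 11:48:35 | C005 | 1 |
| O-005 | 11733 | 2023-01-12 11:48:35 | C008 | 1 |
| O-006 | 83167 | 2023-01-16 02:52:07 | C012 | 2 |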

You are also given an ITEMS table:

ITEMS

  • item_id (primary key)
  • item_category (string)

Here’s some sample data from ITEMS:

| item_id | item_category |
|---|---|
| C004 | Books |
| C005 | Books |
| C006 | Apparel |
| C007 | Electronics |
| C008 | Electronics |

Amazon SQL Assessment Questions

  • How many units were ordered yesterday? Hint: In PostgreSQL, yesterday’s date can be found with an expression such as CURRENT_DATE - 1.
  • In the last 7 days (including today), how many units were ordered in each category? Hint: You need to consider ALL categories, even those with zero orders!
  • Write a query to get the earliest order for every customer for each date they placed an order. Hint: customers can place multiple orders on a single day!
  • Write a query to find the second earliest order for each customer for each date they placed two or more orders.

We’ve left the answers to this Amazon take-home challenge to the reader, but you can find a similar multi-part SQL assessment from CVS Health on Pharmacy Analytics which comes with full solutions.

How To Prep for Amazon SQL Interviews

Besides solving the Amazon SQL interview questions above, we recommend practicing the bigger list of SQL interview questions from comparable companies like Facebook and Google, which ask tricky SQL problems. You can also learn what SQL interviews generally cover, and how to best prepare for them, in our 6,000-word SQL interview guide.

What Else Do Amazon Data Science Interviews Cover

For data roles like data science and data analytics, Amazon interviews also ask statistics interview questions, Machine Learning interview questions, product-sense interview questions, and of course coding interview questions.

We recommend the book Ace the Data Science Interview because it has multiple Amazon Data Science Interview questions with solutions in it!

You can also expect to get asked Amazon bar-raiser behavioral interview questions, which usually cover Amazon's 16 leadership principles in depth. You can also watch my mock Amazon behavioral interview video for help with the behavioral aspect!