At AMD, SQL is used quite frequently for analyzing performance data for processor enhancements, and managing datasets related to semiconductor manufacturing processes. They even make solutions to speed up SQL queries. Because of this, AMD frequently asks SQL coding questions in interviews for Data Analyst, Data Science, and BI jobs.
So, if you're studying for a SQL Interview, we've collected 10 AMD SQL interview questions to practice – how many can you solve?
AMD produces computer chips and related technologies for business and consumer markets. As an interviewee, you are given a dataset of reviews for AMD products.
The table contains the following columns: (integer), (integer), (timestamp), (integer), and (integer ranging from 1 to 5). is the date and time when a review was submitted. is the number of stars given by a user.
Write a SQL query to calculate the average ratings (stars) of each product, grouped by month. The results should be ordered by the month and product_id.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 00:00:00 | 50001 | 4 |
7802 | 265 | 2022-06-10 00:00:00 | 69852 | 4 |
5293 | 362 | 2022-06-18 00:00:00 | 50001 | 3 |
6352 | 192 | 2022-07-26 00:00:00 | 69852 | 3 |
4517 | 981 | 2022-07-05 00:00:00 | 69852 | 2 |
This query uses the function to get the month from the column. We then group by and the extracted month by using the clause inside the window function. This will calculate the average stars for each group of and month. Finally, we order the result by and to get the desired result.
p.s. Window functions show up super often during SQL interviews, so practice the 27+ window function questions on DataLemur
You are a data analyst at AMD, a company that designs and builds processors for computers. AMD has a variety of processors which they sell on their e-commerce platform, and they want to build a report that displays the total sales volume per product type.
Given the following and tables:
order_id | product_id | quantity | order_date |
---|---|---|---|
1001 | 2001 | 3 | 2020-01-01 |
1002 | 2002 | 2 | 2020-02-15 |
1003 | 2003 | 1 | 2020-03-01 |
1004 | 2002 | 5 | 2020-03-20 |
1005 | 2001 | 2 | 2020-04-10 |
product_id | product_name | product_type |
---|---|---|
2001 | Ryzen 7 | Desktop Processor |
2002 | Ryzen 5 | Desktop Processor |
2003 | Threadripper | High Performance Desktop Processor |
Write a SQL query that returns a table that displays the total quantity sold for each product type in the year 2020.
Given the sample input above, the query first joins the and tables on the field. Then, it only considers orders that were placed in the year 2020. Finally, it groups the results by the field and calculates the total quantity sold for each product type.
product_type | total_quantity_sold |
---|---|
Desktop Processor | 10 |
High Performance Desktop Processor | 1 |
undefined |
Denormalization is the practice of altering a database schema in a way that breaks the normalization rules (1st, 2nd, 3rd normal forms).
Denormalization is done to improve the performance of the database when it is being used for online analytics processing (OLAP), rather than online transaction processing (OLTP) use cases.
For these OLAP use cases, you're bottleneck frequently is joining multiple tables, but de-normalizing your database cuts these outs.
Denormalization can improve query performance by reducing the number of expensive joins required to retrieve data, but it comes with its own drawbacks too. By adding redundant columns, you incur more data storage costs, and there's need for more complex update and delete operations in order to maintain data integrity across all the duplicated data. Thus, it's important to carefully consider the trade-offs involved before implementing denormalization.
For AMD, they may ask you to find the average units sold per month for a specific processor category (Let's say "Ryzen") in the past year. This information will help management to understand the sales trend and make decisions for future production and inventory management. {#Question-4}
sale_id | product_category | product_id | sale_date | units_sold |
---|---|---|---|---|
1001 | Ryzen | R7-3800X | 2021-06-05 | 20 |
1023 | Ryzen | R5-5600X | 2021-07-20 | 15 |
1109 | Threadripper | TR-3990X | 2021-06-27 | 7 |
1156 | Ryzen | R7-3800X | 2021-07-28 | 12 |
1234 | Threadripper | TR-3990X | 2021-07-03 | 8 |
1290 | Ryzen | R5-5600X | 2021-07-30 | 18 |
1310 | Ryzen | R7-3800X | 2021-07-15 | 22 |
Month | avg_units_sold |
---|---|
June | 13.5 |
July | 16.8 |
Using PostgreSQL, the SQL query for the problem would be:
This SQL query uses the function to calculate the average units sold per month in the past year for 'Ryzen' category. The clause filters for records where is 'Ryzen' and the sales happened in the past year. function is used to convert the sale date to month. is used to separate the data into different groups of 'Month'. Finally, the clause orders the results by 'Month'.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for aggregating sales for products or this Wayfair Y-on-Y Growth Rate Question which is similar for analyzing sales over time.
Denormalization is the process of modifying a database schema in a way that deviates from the typical rules of normalization (1NF, 2NF, 3NF, etc.). There's a few reasons to denormalize a database:
Improved performance: Joins are slow AF, especially when dealing with the massive datasets that are typically used at AMD. Denormalization can improve the performance of a database by reducing the number of joins that are required to retrieve data. This can be particularly useful when the database is being used for querying and reporting purposes, as joins can be expensive and slow.
Scalability: By reducing the amount of data that needs to be read and processed to execute a query, denormalization can enhance the scalability of a database. This can be useful when the database is anticipated to handle a large number of read-only queries (such as in OLAP use cases).
Ease of use: Denormalization can also make it easier for users to work with a database by providing them with a more intuitive and straightforward data model.
Because denormalization can create more complex update and delete operations, and pose potential data integrity issues, consider denormalization only if joins are causing performance bottlenecks.
AMD is interested in understanding the click-through conversion rates for their digital product advertisements that lead users to add the product into their cart. Assuming we have two tables:
The first table, , contains information about users who clicked on the ads.
click_id | user_id | click_timestamp | product_id |
---|---|---|---|
101 | 17 | 2022-03-01 08:15:20 | A1 |
102 | 20 | 2022-03-01 08:46:49 | A2 |
103 | 24 | 2022-03-02 10:02:15 | A1 |
104 | 17 | 2022-03-02 14:20:37 | A3 |
105 | 20 | 2022-03-03 07:55:42 | A2 |
The second table, , contains information about users who added the product from the ad into their cart.
action_id | user_id | action_timestamp | product_id |
---|---|---|---|
501 | 17 | 2022-03-01 09:10:10 | A1 |
502 | 20 | 2022-03-01 22:46:37 | A2 |
503 | 24 | 2022-03-03 11:05:12 | A1 |
They want to know the click-through conversion rate for each product based on these actions.
The click-through conversion rate for a product is calculated as the percentage of ad clicks that result in a product being added to the cart. A same user may click the same product ad multiple times and may also add the same product to the cart multiple times.
This query first joins the table and table on the basis of and . The conversion rate is then computed as the ratio of distinct to distinct (representing the number of items added to cart and the number of ads clicked, respectively) for each product. The function is used to protect against division by zero error. The results are then sorted in descending order of conversion rate.
To solve another question about calculating rates, try this TikTok SQL Interview Question on DataLemur's online SQL code editor:
A database index is a data structure that improves the speed of data retrieval operations on a database table.
There are few different types of indexes that can be used in a database:
For a concrete example, say you had a table of AMD customer payments with the following columns:
Here's what a clustered index on the column would look like:
A clustered index on the column would determine the physical order of the records in the table based on the . This means that the records with the earliest values would be stored together physically in the table, followed by records with later payment_date values. This speeds up queries that filter or sort the data based on the , as the records are already physically sorted in the table. For example, if you want to retrieve all the payments made in the month of June, the database can use the clustered index to quickly locate and retrieve the desired records without having to search through the entire table.
Given our product and sales data, can you determine the yearly average sales prices for each of the AMD products?
product_id | product_name |
---|---|
1 | AMD Ryzen 7 5800X |
2 | AMD Radeon RX 6800 |
3 | AMD Ryzen 5 3600 |
sale_id | product_id | sale_date | sale_price |
---|---|---|---|
1 | 1 | 2022-02-05 | 329.99 |
2 | 2 | 2022-03-28 | 579.99 |
3 | 1 | 2022-05-17 | 339.99 |
4 | 3 | 2022-07-13 | 199.99 |
5 | 2 | 2022-08-25 | 599.99 |
6 | 3 | 2023-01-01 | 209.99 |
7 | 1 | 2023-02-05 | 329.99 |
8 | 2 | 2023-03-28 | 589.99 |
9 | 1 | 2023-05-17 | 339.99 |
10 | 3 | 2023-07-13 | 209.99 |
year | product_name | avg_sale_price |
---|---|---|
2022 | AMD Ryzen 7 5800X | 334.99 |
2022 | AMD Radeon RX 6800 | 589.99 |
2022 | AMD Ryzen 5 3600 | 199.99 |
2023 | AMD Ryzen 7 5800X | 334.99 |
2023 | AMD Radeon RX 6800 | 589.99 |
2023 | AMD Ryzen 5 3600 | 209.99 |
The query joins the 'sales' and 'products' tables on 'product_id' to bring the product name and sale details together. It then uses the PostgreSQL function EXTRACT to isolate the year from the sale date. It groups the data by the extracted year and product name, and calculates the average sale price of each product for each year. It rounds this average to two decimal places for readability. undefined
You are given a table for a company like AMD. This table includes a attribute that encompasses both column first name and last name, separated by a space.
The company is starting a campaign targeting customers with some specific pattern in their name. Write a SQL query that will help the marketing team find all the customers whose first name starts with 'J' and last name ends with 'son'.
customer_id | customer_name |
---|---|
125 | John Johnson |
837 | Jane Anderson |
353 | Samuel Jackson |
257 | Deborah Samuelson |
761 | Jacob Wilson |
customer_id | customer_name |
---|---|
125 | John Johnson |
761 | Jacob Wilson |
This query uses the operator to filter the customers based on the pattern provided. The '%' is a wildcard character that represents any number of characters. So 'J% son' will match any customer name that starts with 'J' and ends with 'son', and the space before 'son' ensures that 'son' is at the end of the last name, not just part of it. This will give us all customers whose first name starts with 'J' and last name ends with 'son'. undefined
For a tangible example of in PostgreSQL, suppose you were doing an HR Analytics project for AMD, and had access to AMD's contractors and employees data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables.
You could use operator to find all contractors who never were a employee using this query:
Note that is available in PostgreSQL and SQL Server, while is the equivalent operator which is available in MySQL and Oracle (but don't worry about knowing which RDBMS supports which exact commands since AMD interviewers aren't trying to trip you up on memorizing SQL syntax).
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the AMD SQL interview is to solve as many practice SQL interview questions as you can! In addition to solving the above AMD SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Netflix, Airbnb, and Amazon.
Each DataLemur SQL question has multiple hints, step-by-step solutions and most importantly, there's an online SQL code editor so you can right online code up your query and have it graded.
To prep for the AMD SQL interview it is also wise to practice interview questions from other tech companies like:
But if your SQL query skills are weak, forget about going right into solving questions – improve your SQL foundations with this interactive SQL tutorial.
This tutorial covers things like math functions like CEIL()/FLOOR() and GROUP BY – both of these come up routinely in AMD SQL interviews.
For the AMD Data Science Interview, beyond writing SQL queries, the other types of questions to prepare for are:
The best way to prepare for AMD Data Science interviews is by reading Ace the Data Science Interview. The book's got: