At Sanmina, SQL is used often for "analyzing manufacturing processes data, and managing supply chain databases." So, it shouldn't surprise you that Sanmina frequently asks SQL query questions in interviews for Data Science, Analytics, and & Data Engineering jobs.
Thus, to help you ace the Sanmina SQL interview, we'll cover 10 Sanmina SQL interview questions – how many can you solve?
Sanmina Corporation is a worldwide electronics manufacturing services (EMS) provider. Assume you are an analyst of Sanmina, and you are required to analyze the monthly sales performance of each product in the company. Specifically, write a SQL query to provide a breakdown of the monthly average sales of each product with a record of each product's monthly sales performance.
In the sales table, each row records a sale including the sale_id, product_id, the date of the sale (sale_date), the quantity of the item sold, and the price of one item.
In PostgreSQL, we can use the function to get the year and month detail and the function to get the average. Also, you can use the clause to group the results by year, month, and product_id.
In the SQL statement above, for each unique combination of year, month, and product_id, we apply the function to the total sales (which is computed by multiplying the quantity by the price). The result provides us with the average monthly sales for each product. Lastly, we order the results by year, month, and product_id.
To solve a similar window function interview problem which uses RANK() on DataLemur's free online SQL coding environment, solve this Amazon SQL Interview Question:
As a data analyst at Sanmina, you are asked to filter down the customer records based on specific conditions. You need to retrieve all customers of Sanmina who are based in the USA, have placed at least one order in 2022, and have spent more than $5000 in total.
Please note that Sanmina operates globally, and customers could place orders multiple times in a year, each order having a different order amount.
The and tables have the following columns:
order_id | customer_id | order_date | order_amount |
---|---|---|---|
3452 | 72 | 11/25/2021 00:00:00 | 1500 |
5930 | 88 | 01/20/2022 00:00:00 | 650 |
8932 | 220 | 06/10/2022 00:00:00 | 4500 |
2401 | 72 | 03/14/2022 00:00:00 | 4000 |
6712 | 156 | 07/05/2022 00:00:00 | 100 |
customer_id | name | country |
---|---|---|
72 | John Doe | USA |
88 | Amanda Smith | USA |
220 | Wang Li | China |
156 | Mohamed Ali | Egypt |
This query first identifies all the customers who have placed orders in 2022 and their total order amount is more than $5000. It then filters out customers who are based in the USA. The result will be the and of the customers who meet all these conditions.
In SQL, a join retrieves rows from multiple tables and combines them into a single result set.
Four JOIN types can be found in SQL. For an example of each one, suppose you had a table of Sanmina orders and Sanmina customers.
INNER JOIN: Rows from both tables are retrieved when there is a match in the shared key or keys. An between the Orders and Customers tables would return only rows where the in the Orders table matches the in the Customers table.
LEFT JOIN: A retrieves all rows from the left table (in this case, the Orders table) and any matching rows from the right table (the Customers table). If there is no match in the right table, NULL values will be returned for the right table's columns.
RIGHT JOIN: A combines all rows from the right table (in this case, the Customers table) and any matching rows from the left table (the Orders table). If there is no match in the left table, NULL values will be displayed for the left table's columns.
FULL OUTER JOIN: A combines all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be displayed for the columns of the non-matching table.
Assume you are working as a Data Analyst at Sanmina, a company that provides integrated manufacturing solutions, components, products, and repair, logistics, and after-sales services. The Shipments Team has approached you for help. They would like to know the average quantity of items shipped per order to better understand the order size and resources required.
Use the and tables below for this problem.
shipment_id | order_id | product_id | quantity |
---|---|---|---|
1 | 1001 | 101 | 3 |
2 | 1001 | 102 | 2 |
3 | 1002 | 101 | 4 |
4 | 1003 | 102 | 1 |
5 | 1003 | 103 | 2 |
order_id | customer_id | order_date |
---|---|---|
1001 | 1 | 2022-06-01 |
1002 | 2 | 2022-06-02 |
1003 | 3 | 2022-06-03 |
The output should return the average quantity of items shipped per order.
order_id | avg_quantity |
---|---|
1001 | 2.5 |
1002 | 4.0 |
1003 | 1.5 |
Here is the SQL query to achieve this:
In this query, we are grouping the shipment data by each order. The AVG function is used to calculate the average quantity of items in each order. The result is a list of order id's with their corresponding average item quantities. This information helps the Shipments Team estimate the average size of each order for resource planning.
A correlated sub-query is one that is linked to the outer query and cannot be executed on its own. It uses the outer query to filter or transform data by referencing a column from the outer query, while the outer query uses the results of the inner query. On the other hand, a non-correlated sub-query is independent of the outer query and can be executed on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query.
Correlated sub-queries are slower to execute, as they have to be re-evaluated for each row of the outer query, while non-correlated sub-queries are faster, as they only have to be executed once.
Sanmina, a major provider of electronic manufacturing services, utilizes digital ads to reach potential customers. They look at the click-through rates (CTR) of each ad to evaluate their effectiveness. Given two tables and , please calculate the CTR for each ad on a daily basis.
The table logs all instances where an ad was displayed to a user. The table logs all instances where an ad was clicked by a user.
Example Input:
impression_id | ad_id | impression_date |
---|---|---|
1001 | 10 | 2022-08-01 |
1002 | 10 | 2022-08-01 |
1003 | 20 | 2022-08-02 |
1004 | 30 | 2022-08-03 |
1005 | 30 | 2022-08-03 |
Example Input:
click_id | ad_id | click_date |
---|---|---|
2001 | 10 | 2022-08-01 |
2002 | 20 | 2022-08-02 |
This query first aggregates the clicks data by ad_id and date, it then left joins this aggregated data to the impressions data (also grouped by ad_id and date), and finally calculates the CTR by dividing the count of clicks by the count of impressions. The function is used to replace any NULL values with 0, in case an ad was displayed but not clicked on a particular day.
To practice a similar SQL problem on DataLemur's free online SQL coding environment, solve this Facebook SQL Interview question:
Think of SQL constraints like the rules of a game. Just like a game needs rules to keep things fair and fun, a database needs constraints to keep things organized and accurate.
There are several types of SQL constraints like:
NOT NULL: This constraint is like a bouncer at a nightclub - it won't let anything NULL through the door. UNIQUE: This constraint is like a VIP list - only special, one-of-a-kind values get in. PRIMARY KEY: This constraint is like an elected official - it's made up of NOT NULL and UNIQUE values and helps identify each row in the table. FOREIGN KEY: This constraint is like a diplomatic ambassador - it helps establish relationships between tables. CHECK: This constraint is like a referee - it makes sure everything follows the rules. DEFAULT: This constraint is like a backup plan - it provides a default value if no other value is specified.
So, whether you're playing a game or organizing a database, constraints are an important part of the process!
Sanmina is a global electronics manufacturing services provider. Let's say they want to keep track of their performance by reviewing the average cost of production of their products every month. Our task is to find the average cost of products manufactured by Sanmina each month.
Here is an example of the data structure you would use to answer this question:
production_id | product_id | manufacturing_date | cost |
---|---|---|---|
1001 | 20001 | 06/08/2022 | 500 |
2002 | 30002 | 06/10/2022 | 250 |
3003 | 20001 | 06/15/2022 | 600 |
4004 | 30002 | 07/18/2022 | 300 |
5005 | 30002 | 07/26/2022 | 350 |
mth | product_id | avg_cost |
---|---|---|
6 | 20001 | 550 |
6 | 30002 | 250 |
7 | 30002 | 325 |
This SQL query calculates the average cost of different products for each month. It first extracts the month from the manufacturing_date column using the EXTRACT function, then groups the result by the month and product_id. Lastly, it calculates the average cost using the AVG function. The ORDER BY clause is used to sort the result by month and product_id in ascending order.
Sanmina would like to understand its customer purchasing behavior a bit better. Specifically, they would like to know, per month, who are the top 3 paying customers.
The first table, , has one row for each product purchased, including the ID of the customer who made the purchase, the product ID, the date of the purchase, and the price paid. Note that the customer could have made multiple purchases in a single day.
The second table, , has one row for each customer, including their ID and name.
Here is a sample representation of these tables:
purchase_id | customer_id | product_id | date | price_paid |
---|---|---|---|---|
1 | 1001 | 2001 | 04/01/2022 | 1000.00 |
2 | 1002 | 2002 | 04/01/2022 | 800.00 |
3 | 1003 | 2003 | 04/01/2022 | 400.00 |
4 | 1001 | 2003 | 04/02/2022 | 1000.00 |
5 | 1002 | 2001 | 04/02/2022 | 900.00 |
customer_id | name |
---|---|
1001 | John |
1002 | Emily |
1003 | David |
The task is to write a SQL query in Postgres to return the names of the three customers who have spent the most for each month.
month | name | total_amount |
---|---|---|
04 | John | 2000.00 |
04 | Emily | 1700.00 |
04 | David | 400.00 |
This query works by first calculating the total amount spent by each customer per month, then ranking customers within each month by this total amount, and finally selecting the top 3 customers for each month. The customers' names come from joining the with the table.
Since join questions come up so often during SQL interviews, take a stab at this Snapchat Join SQL question:
Stored procedures are like functions in Python – they can accept input params and return values, and are used to encapsulate complex logic.
For example, if you worked as a Data Analyst in support of the Marketing Analytics team at Sanmina, a common task might be to find the conversion rate for your ads given a specific time-frame. Instead of having to write this query over-and-over again, you could write a stored procedure like the following:
To call this stored procedure, you'd execute the following query:
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Sanmina SQL interview is to solve as many practice SQL interview questions as you can! Besides solving the above Sanmina SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like Netflix, Google, and Amazon.
Each exercise has hints to guide you, full answers and crucially, there's an online SQL coding environment so you can right online code up your query and have it executed.
To prep for the Sanmina SQL interview it is also useful to solve interview questions from other tech companies like:
But if your SQL query skills are weak, forget about diving straight into solving questions – improve your SQL foundations with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers SQL concepts such as LEAD/LAG and CASE/WHEN statements – both of which pop up often during Sanmina interviews.
In addition to SQL interview questions, the other types of questions to prepare for the Sanmina Data Science Interview are:
The best way to prepare for Sanmina Data Science interviews is by reading Ace the Data Science Interview. The book's got: