At TSMC, SQL is used day-to-day for analyzing semiconductor production data and enhancing wafer yield prediction models, as part of their intelligent foundry initiative.
Because of this, TSMC almost always evaluates jobseekers on SQL query questions during interviews for Data Analytics, Data Science, and Data Engineering jobs. To help you prepare, we've curated 8 TSMC SQL interview questions to practice – able to answer them all?
TSMC is a semiconductor manufacturing company. They distinguish their high-value customers, or "power" users, as those who have large volume orders very frequently.
Consider the database containing two tables: and .
The table keeps track of all orders with their volumes of chips ordered, order date, and the customer_id.
Also, there's the table, which contains details about the customers, including the customer_id, customer name, and their company name.
order_id | customer_id | order_date | volume |
---|---|---|---|
123 | 1 | 01/01/2022 00:00:00 | 5000 |
124 | 2 | 01/03/2022 00:00:00 | 2000 |
125 | 1 | 01/05/2022 00:00:00 | 6000 |
126 | 3 | 01/05/2022 00:00:00 | 3000 |
127 | 1 | 01/07/2022 00:00:00 | 7000 |
customer_id | customer_name | company_name |
---|---|---|
1 | John Doe | CompanyA |
2 | Jane Doe | CompanyB |
3 | Richard Roe | CompanyC |
The problem is to write a SQL query that will identify the customers who have a total volume of 10,000 or more in the last 30 days.
You should return the customer_name, company_name, and total volume ordered in the past 30 days.
The query joins the and tables, sums the volume of chips ordered in the last 30 days and groups by customer name and company name. It finally restricts the selection to those customers who have a total volume equal to or greater than 10,000.
To work on another SQL customer analytics question where you can code right in the browser and have your SQL code automatically checked, try this Walmart Labs SQL Interview Question:
You are provided with the data of daily chip production at TSMC which consists of the , and . The task is to write a SQL query that calculates the monthly average quantity produced for each chip type using a SQL window function.
production_id | chip_id | production_date | quantity_produced |
---|---|---|---|
1 | C1 | 2022-03-01 | 3000 |
2 | C2 | 2022-03-01 | 4500 |
3 | C1 | 2022-03-02 | 3200 |
4 | C2 | 2022-03-03 | 5000 |
5 | C1 | 2022-04-01 | 3600 |
6 | C2 | 2022-04-02 | 4700 |
mth | chip | avg_quantity_produced |
---|---|---|
3 | C1 | 3100 |
3 | C2 | 4750 |
4 | C1 | 3600 |
4 | C2 | 4700 |
In this PostgreSQL query, we're using the window function to calculate the average quantity produced for each chip type for each month. The clause is used to divide the data into partitions by the and the month of the . is used to truncate the date to month level. The result is ordered by month and chip type.
To practice a related window function SQL problem on DataLemur's free interactive coding environment, try this Amazon SQL question asked in a BI Engineer interview:
In database schema design, a one-to-one relationship is when each entity is associated with only one instance of the other. For instance, a US citizen's relationship with their social-security number (SSN) is one-to-one because each citizen can only have one SSN, and each SSN belongs to one person.
A one-to-many relationship, on the other hand, is when one entity can be associated with multiple instances of the other entity. An example of this is the relationship between a person and their email addresses - one person can have multiple email addresses, but each email address only belongs to one person.
TSMC (Taiwan Semiconductor Manufacturing Company) is the world's largest dedicated independent (pure-play) semiconductor foundry. Let's imagine that TSMC is facing a business problem where management needs insight from raw production data. Specifically, they want to detect if there are any products whose production quantity in a month drops by more than 25% from the previous month. This may indicate an issue in the production line that needs to be addressed.
Therefore, you have been tasked to create an SQL query that will detect such a drop in monthly production. You are given the table where each row represents the quantity of a product produced each day.
id | product_id | production_date | quantity |
---|---|---|---|
1 | 100 | 2022-06-01 | 200 |
2 | 100 | 2022-06-02 | 220 |
3 | 100 | 2022-06-30 | 150 |
4 | 100 | 2022-07-01 | 100 |
5 | 100 | 2022-07-31 | 80 |
6 | 200 | 2022-06-01 | 500 |
7 | 200 | 2022-06-30 | 520 |
8 | 200 | 2022-07-01 | 490 |
9 | 200 | 2022-07-31 | 510 |
With the above query, we created two CTEs (Common Table Expressions or temporary views): and . In , we aggregated the total amount of each product produced in each month. Then in , we compared the current month's total quantity to the previous month's total quantity for each product. Finally, we selected the results where the percent change in total quantity was less than -25%, i.e., a drop by more than 25%.
One way to find duplicatesis to use a clause and then use to find groups
You could also use the operator:
You are a data analyst at TSMC and your marketing team wants to determine the effectiveness of several digital ad campaigns. They are particularly interested in understanding the click-through rates for each campaign. A click-through rate (CTR) is the ratio of users who clicked on an ad to the number of total users who viewed the ad.
Consider the following tables and .
ad_id | view_date | user_id |
---|---|---|
1 | 2022-06-08 | 1001 |
2 | 2022-06-10 | 2001 |
1 | 2022-07-18 | 3001 |
2 | 2022-07-26 | 4001 |
1 | 2022-07-05 | 5001 |
ad_id | click_date | user_id |
---|---|---|
1 | 2022-06-08 | 1001 |
1 | 2022-07-18 | 3001 |
2 | 2022-07-26 | 4001 |
You will need to join the two tables on ad_id, aggregate on a per-ad basis and then divide the count of clicks by the count of views.
The SQL query would be:
This will return the count of views, the count of clicks and the click-through rate (ctr) for each ad_id.
Please note that the type conversion to float is necessary for the division as otherwise integer division would return 0's for everything but 100% CTR.
To practice a similar SQL interview question on DataLemur's free interactive coding environment, solve this Facebook SQL Interview question:
If you have two tables and want to retrieve only the rows that are present in both tables, just use the operator!
For example, let's use to find all of TSMC's Facebook video ads that are also being run on YouTube:
As an analyst for TSMC, a global semiconductor company, you have been given two tables - a 'Customers' table and a 'Purchases' table. The Customers table has a row for each customer with the customer's id, name, and the country they are from. The Purchases table has a row for each purchase with the id of the customer who made the purchase, the id of the item they purchased, and the date of the purchase.
Write a SQL query that gives each customer's name, the total number of purchases they have made, and the names of the unique countries from which customers have purchased items.
customer_id | name | country |
---|---|---|
10000 | John | USA |
10001 | Emma | Canada |
10002 | Gary | Taiwan |
purchase_id | customer_id | item_id | purchase_date |
---|---|---|---|
21000 | 10000 | 65000 | 2022-08-01 |
21001 | 10000 | 65002 | 2022-08-03 |
21002 | 10001 | 65000 | 2022-08-05 |
The SQL query is as follows:
This query joins the Customers and Purchases tables on the 'customer_id' field, counts the total number of purchases for each customer and groups the results by customer name and country to give each customer's name, their total number of purchases, and the names of the unique countries from which customers have purchased items.
Since joins come up so often during SQL interviews, try this interactive Snapchat JOIN SQL interview question:
The key to acing a TSMC SQL interview is to practice, practice, and then practice some more! In addition to solving the earlier TSMC SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like FAANG and tech startups.
Each SQL question has multiple hints, full answers and crucially, there's an online SQL coding environment so you can right online code up your query and have it executed.
To prep for the TSMC SQL interview it is also wise to practice SQL questions from other semiconductor companies like:
However, if your SQL query skills are weak, forget about going right into solving questions – improve your SQL foundations with this DataLemur SQL tutorial.
This tutorial covers topics including filtering data with WHERE and math functions like ROUND()/CEIL() – both of which pop up frequently in TSMC SQL assessments.
For the TSMC Data Science Interview, besides SQL questions, the other types of questions which are covered:
To prepare for TSMC Data Science interviews read the book Ace the Data Science Interview because it's got:
You should also learn how TSMC is poised to leverage AI, so you can speak more intellignetly to how TSMC and Data Science/AI are related.