At Tokyo Electron, SQL is frequently used to analyze semiconductor manufacturing data for process optimization and to manage quality control data for improved device production. Because of this, Tokyo Electron almost always evaluates jobseekers with SQL query questions during interviews for Data Analytics, Data Science, and Data Engineering roles.
So, if you're preparing for a SQL interview, we've collected 9 practice SQL interview questions similar to those commonly asked at Tokyo Electron – can you solve them?
Tokyo Electron is a global leader in the development and manufacturing of semiconductor production equipment. Given a table production_history consisting of records of the different semiconductor types produced, their production dates, and the count of items produced, write a SQL query to calculate a running monthly total for each semiconductor type.

Assume our table, production_history, looks like this:
production_history Example Input:

| production_id | product_type | production_date | product_count |
|---|---|---|---|
| 1 | A | 2022-03-01 | 150 |
| 2 | A | 2022-03-15 | 200 |
| 3 | B | 2022-03-15 | 100 |
| 4 | A | 2022-04-01 | 150 |
| 5 | B | 2022-04-10 | 200 |
The expected output is a table that shows a running sum for each product type, calculated monthly. For instance, the March running total for product_type A is 350 (which is 150+200), and for product_type B is 100.
| month | product_type | running_total |
|---|---|---|
| 03 | A | 350 |
| 03 | B | 100 |
| 04 | A | 500 |
| 04 | B | 300 |
Here is the SQL query that would provide the answer:

```sql
SELECT
  TO_CHAR(production_date, 'MM') AS month,
  product_type,
  SUM(SUM(product_count)) OVER (
    PARTITION BY product_type
    ORDER BY TO_CHAR(production_date, 'MM')
  ) AS running_total
FROM production_history
GROUP BY TO_CHAR(production_date, 'MM'), product_type
ORDER BY month, product_type;
```
The query applies a PostgreSQL window function on top of a GROUP BY. Grouping by month and product_type first, SUM(product_count) gives the number of items produced for each product type in each month. The outer SUM(...) OVER then turns those monthly totals into a running total: the PARTITION BY clause keeps a separate running sum for each product type, and ordering the window by month makes the sum accumulate from one month to the next. The TO_CHAR function is used to format the production_date as a month ('MM').
To practice another window function question on DataLemur's free interactive SQL code editor, try this Amazon SQL Interview Question:
Tokyo Electron is a leading global manufacturer of semiconductor production equipment. Suppose you are given two tables: one describes the semiconductors produced (semiconductors) and the other describes the production machines (machines).
The performance of the production machines significantly affects the quality of the semiconductors produced. Your task is to determine the average quality rating of the semiconductors produced by each machine.
semiconductors Example Input:

| semiconductor_id | machine_id | production_date | quality_rating |
|---|---|---|---|
| 111 | 333 | 06/08/2022 | 3.9 |
| 122 | 444 | 06/10/2022 | 4.2 |
| 133 | 555 | 06/15/2022 | 3.8 |
| 144 | 333 | 07/26/2022 | 3.7 |
| 155 | 444 | 07/05/2022 | 4.0 |
machines Example Input:

| machine_id | manufacturer | model_number |
|---|---|---|
| 333 | Tokyo Electron | AXI930 |
| 444 | Tokyo Electron | FXI800 |
| 555 | Tokyo Electron | EXI500 |
The semiconductors table includes fields for a unique semiconductor ID, the ID of the machine that produced it, the production date, and the quality rating given to the semiconductor (on a scale of 1.0 to 5.0). The machines table includes fields for a unique machine ID, the manufacturer of the machine, and the model number of the machine.

To answer this question, you can join the semiconductors and machines tables on machine_id and calculate the average quality rating for each machine:
```sql
SELECT
  m.manufacturer,
  m.model_number,
  AVG(s.quality_rating) AS avg_quality_rating
FROM semiconductors s
JOIN machines m
  ON m.machine_id = s.machine_id
GROUP BY m.manufacturer, m.model_number;
```
| manufacturer | model_number | avg_quality_rating |
|---|---|---|
| Tokyo Electron | AXI930 | 3.8 |
| Tokyo Electron | FXI800 | 4.1 |
| Tokyo Electron | EXI500 | 3.8 |
This SQL query groups by manufacturer and model_number to give us the average quality_rating for each machine. For example, machine 333 (model AXI930) produced semiconductors rated 3.9 and 3.7, so its average quality rating is (3.9 + 3.7) / 2 = 3.8.
What is the difference between the BETWEEN and IN operators?
The BETWEEN operator is used to select rows that fall within a certain range of values, while the IN operator is used to select rows that match values in a specified list.
For example, suppose you are a data analyst at Tokyo Electron and have a table of advertising campaign data. To find campaigns with a spend between $1,000 and $5,000, you could use BETWEEN:
```sql
SELECT *
FROM tokyo_electron_ad_campaigns
WHERE spend BETWEEN 1000 AND 5000;
```
To find advertising campaigns that were video and image based (as opposed to text or billboard ads), you could use the IN operator:
```sql
SELECT *
FROM tokyo_electron_ad_campaigns
WHERE media_type IN ('video', 'image');
```
Tokyo Electron Ltd. is a leading global supplier of semiconductor and flat panel display (FPD) production equipment. Their machines often use different noble gases such as Helium, Neon, or Argon. To manage these resources, you are given a table named usage, which records the noble gas used, the year, and the agreement rate. The agreement rate is the percentage of time the machines effectively made use of the noble gases supplied to them.
The "usage" table has the following schema:
usage
Example Input:
usage_id | year | noble_gas | agreement_rate(%) |
---|---|---|---|
1 | 2022 | Helium | 95 |
2 | 2022 | Argon | 85 |
3 | 2021 | Neon | 89 |
4 | 2021 | Helium | 92 |
5 | 2022 | Neon | 88 |
The task is to write a SQL query that calculates the average agreement rate per year for each noble gas, rounded to two decimal places. Filter out any averages that are below 90%.

Using the example usage input above, the desired output is:

| year | noble_gas | avg_agreement_rate (%) |
|---|---|---|
| 2022 | Helium | 95.00 |
| 2021 | Helium | 92.00 |
```sql
SELECT
  year,
  noble_gas,
  ROUND(AVG(agreement_rate), 2) AS avg_agreement_rate
FROM usage
GROUP BY year, noble_gas
HAVING AVG(agreement_rate) >= 90;
```
This query first groups the usage table by the year and noble_gas columns. Afterwards, it calculates the average agreement_rate for each group. The HAVING keyword is then used to filter out the groups where the average agreement rate is below 90%. Finally, the result is rounded to two decimal places for readability.
What is database normalization?
Database normalization is the process of breaking down a table into smaller and more specific tables and defining relationships between them via foreign keys. This minimizes redundancy, and creates a database that's more flexible, scalable, and easier to maintain. It also helps to ensure the integrity of the data by minimizing the risk of data inconsistencies and anomalies.
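As a rough sketch (the table and column names below are invented for illustration, not taken from any question in this article), a denormalized orders table that repeats customer details on every row could be split into two related tables joined by a foreign key:

```sql
-- Customer details live in one place instead of being repeated on every order row.
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name VARCHAR(128) NOT NULL,
    email         VARCHAR(128) NOT NULL
);

-- Each order references its customer via a foreign key.
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
    order_date  DATE NOT NULL,
    amount      DECIMAL(12,2) NOT NULL
);
```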
Tokyo Electron is a leading global company in the electronics and semiconductor industry. Suppose you are given a table of customer data. Can you write a SQL query to find the email addresses of all customers whose addresses contain the string "Tokyo"?
customer_data Example Input:

| customer_id | first_name | last_name | email | address |
|---|---|---|---|---|
| 1 | John | Doe | john.doe@example.com | 123 Tokyo St, Tokyo |
| 2 | Jane | Smith | jane.smith@example.com | 456 Osaka St, Osaka |
| 3 | Jake | Williams | jake.williams@example.com | 789 Kyoto Ave, Kyoto |
| 4 | Jim | Brown | jim.brown@example.com | 321 Nihonbashi, Tokyo |
| 5 | Jill | Johnson | jill.johnson@example.com | 654 Umeda, Osaka |
```sql
SELECT email
FROM customer_data
WHERE address LIKE '%Tokyo%';
```
This SQL query uses the LIKE keyword in combination with '%' as a wildcard to search for any customers whose address contains the string "Tokyo". It will return the email addresses of customers John Doe and Jim Brown, as they are the ones with "Tokyo" in their addresses.
How does the CHECK constraint function, and in what scenarios might it be useful?

The CHECK constraint is used to enforce rules on the data in a specific column. If a row is inserted or updated with data that does not follow the CHECK constraint's rule, the operation will fail.
For example, say you had a marketing analytics database that stores ad campaign data from Tokyo Electron's Google Analytics account.
Here's what some constraints could look like:
```sql
CREATE TABLE ad_campaigns (
    ad_id INTEGER PRIMARY KEY,
    ad_name VARCHAR(128) NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL,
    budget DECIMAL(12,2) NOT NULL CHECK (budget > 0),
    cost_per_click DECIMAL(12,2) NOT NULL CHECK (cost_per_click > 0)
);
```
The CHECK constraint is used in the above example to make sure that the "budget" and "cost_per_click" fields contain only positive values. This helps to maintain the validity and integrity of the data in the context of ad campaigns.
The CHECK constraint can also be used to enforce other specific conditions on data. For example, you can use it to ensure that the "start_date" for each ad campaign is earlier than the "end_date."
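For instance, sketching against the ad_campaigns table defined above (the constraint name here is chosen purely for illustration), that date rule could be added as a table-level CHECK:

```sql
-- Table-level CHECK constraint: a campaign must start before it ends
ALTER TABLE ad_campaigns
    ADD CONSTRAINT campaign_dates_valid CHECK (start_date < end_date);
```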
Tokyo Electron has a database containing sales and customer information. The sales table includes fields such as sale_id, product_id, price, and sale_date. The customers table includes customer_id, first_name, last_name, and email.
Write a SQL query that lists each customer with the total number of unique products they have purchased, the total amount spent, and their personal information. Only include customers who have made at least one purchase.
sales Example Input:

| sale_id | customer_id | product_id | price | sale_date |
|---|---|---|---|---|
| 1 | 100 | 101 | 200 | 2022-01-10 |
| 2 | 100 | 102 | 300 | 2022-02-20 |
| 3 | 101 | 101 | 200 | 2022-02-25 |
| 4 | 102 | 103 | 400 | 2022-03-10 |
| 5 | 100 | 101 | 200 | 2022-03-15 |
customers Example Input:

| customer_id | first_name | last_name | email |
|---|---|---|---|
| 100 | John | Doe | john.doe@example.com |
| 101 | Jane | Smith | jane.smith@example.com |
| 102 | Bob | Marley | bob.marley@example.com |
```sql
SELECT
  c.first_name,
  c.last_name,
  c.email,
  COUNT(DISTINCT s.product_id) AS total_products,
  SUM(s.price) AS total_spent
FROM sales s
JOIN customers c
  ON s.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name, c.email
HAVING COUNT(DISTINCT s.product_id) > 0;
```
This query first joins the sales and customers tables using the customer_id field. Then, for each customer, it counts the number of unique products (COUNT(DISTINCT s.product_id)) and sums the sale prices (SUM(s.price)). The HAVING clause ensures we only include customers who have purchased at least one product.
Since joins come up frequently during SQL interviews, try an interactive Spotify JOIN SQL question:
Given a database that stores the maintenance sessions scheduled for each piece of Tokyo Electron production equipment, find the number of maintenance sessions by month, as well as the average downtime each piece of equipment incurred due to maintenance.
equipment_maintenance Example Input:

| maintenance_id | equipment_id | start_time | end_time |
|---|---|---|---|
| 1001 | E1 | 01/29/2021 09:00:00 | 01/29/2021 12:00:00 |
| 1002 | E2 | 02/15/2021 13:00:00 | 02/15/2021 16:00:00 |
| 1003 | E1 | 03/02/2021 10:00:00 | 03/02/2021 14:00:00 |
| 1004 | E2 | 03/22/2021 11:00:00 | 03/22/2021 14:00:00 |
| 1005 | E1 | 04/05/2021 09:00:00 | 04/05/2021 12:00:00 |
Example Output:

| month | equipment_id | maintenance_count | avg_downtime_hours |
|---|---|---|---|
| 1 | E1 | 1 | 3.00 |
| 2 | E2 | 1 | 3.00 |
| 3 | E1 | 1 | 4.00 |
| 3 | E2 | 1 | 3.00 |
| 4 | E1 | 1 | 3.00 |
```sql
SELECT
  EXTRACT(MONTH FROM start_time) AS month,
  equipment_id,
  COUNT(*) AS maintenance_count,
  AVG(EXTRACT(EPOCH FROM (end_time - start_time)) / 3600) AS avg_downtime_hours
FROM equipment_maintenance
GROUP BY month, equipment_id
ORDER BY month, equipment_id;
```
This query uses the EXTRACT function to isolate the month from the start_time timestamp, and groups by equipment_id and that derived month. It counts the rows in each group to get the monthly number of maintenance sessions (maintenance_count).

To calculate the average downtime (avg_downtime_hours), it subtracts start_time from end_time, which gives an interval; the EXTRACT function with the EPOCH keyword converts that interval to seconds, and dividing by 3600 converts the seconds to hours. The AVG function then averages these hours. The results are ordered by month and equipment_id to present a clearer view of the information.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Tokyo Electron SQL interview is to solve as many practice SQL interview questions as you can!
Beyond just solving the above Tokyo Electron SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Microsoft, Google, and Facebook.
Each exercise has multiple hints, step-by-step solutions, and crucially, an interactive SQL code editor so you can write your SQL query answer right in the browser and have it executed.
To prep for the Tokyo Electron SQL interview it is also a great idea to solve SQL questions from other tech companies like:
But if your SQL skills are weak, forget about jumping right into solving questions – go learn SQL with this interactive SQL tutorial.
This tutorial covers topics including using LIKE and finding NULLs – both of these come up frequently in Tokyo Electron interviews.
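As a quick refresher (reusing the equipment_maintenance table from earlier purely as an illustration), those two topics look like this:

```sql
-- Pattern matching with LIKE: equipment IDs that start with 'E'
SELECT * FROM equipment_maintenance WHERE equipment_id LIKE 'E%';

-- Finding NULLs: sessions with no recorded end_time (use IS NULL, never "= NULL")
SELECT * FROM equipment_maintenance WHERE end_time IS NULL;
```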
For the Tokyo Electron Data Science Interview, in addition to SQL query questions, the other types of questions covered are:
To prepare for Tokyo Electron Data Science interviews read the book Ace the Data Science Interview because it's got: