At MediaTek, SQL is crucial for analyzing customer data to forecast demand and for managing databases to ensure the efficient storage and retrieval of large volumes of data. Unsurprisingly, this is why MediaTek LOVES to ask SQL query questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.
So, to help you prepare for the MediaTek SQL interview, here are 8 MediaTek SQL interview questions – can you answer them all?
As a data analyst at MediaTek, you are asked to analyze the feedback received for various products over time. From the reviews table, write a SQL query to find the average star rating of each product on a monthly basis.

Assume we have the reviews table below, where:

- review_id is the id of the review
- user_id is the id of the user who submitted the review
- submit_date is the date when the review was submitted
- product_id is the id of the product for which the review is given
- stars is the rating given to the product by the user, on a scale of 1-5
reviews Example Input:

| review_id | user_id | submit_date | product_id | stars |
|---|---|---|---|---|
| 6171 | 123 | 2022-06-08 | 50001 | 4 |
| 7802 | 265 | 2022-06-10 | 69852 | 4 |
| 5293 | 362 | 2022-06-18 | 50001 | 3 |
| 6352 | 192 | 2022-07-26 | 69852 | 3 |
| 4517 | 981 | 2022-07-05 | 69852 | 2 |
Your result should return a table with columns:

- mth: the month in which the review was submitted
- product: the id of the product
- avg_stars: the average rating of this product in this month, rounded to two decimal places

Example Output:

| mth | product | avg_stars |
|---|---|---|
| 6 | 50001 | 3.50 |
| 6 | 69852 | 4.00 |
| 7 | 69852 | 2.50 |
SELECT EXTRACT(MONTH FROM submit_date) as mth, product_id as product, ROUND(AVG(stars), 2) as avg_stars FROM reviews GROUP BY mth, product ORDER BY mth, product;
This query uses EXTRACT(MONTH FROM submit_date) to pull out the month from each review's submission date, and AVG(stars) to calculate the monthly average rating for each product. The GROUP BY clause groups the data by month and product id, and ORDER BY sorts the result by month and product id. The AVG(stars) expression is wrapped in ROUND(..., 2) to round the average rating to two decimal places. The month is extracted because we only want to display the month, not the complete date.
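As a quick aside (not part of the required solution), you can see what EXTRACT returns for a sample date, and compare it with DATE_TRUNC, which keeps the year and avoids mixing the same month of different years if the data ever spans multiple years:

```sql
-- EXTRACT returns the numeric month of a date/timestamp
SELECT EXTRACT(MONTH FROM DATE '2022-06-08');    -- 6

-- DATE_TRUNC keeps the year as well (useful for multi-year data)
SELECT DATE_TRUNC('month', DATE '2022-06-08');   -- 2022-06-01 00:00:00
```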
To solve another window function question on DataLemur's free interactive SQL code editor, try this Amazon BI Engineer interview question:
MediaTek, being a global fabless semiconductor company, produces a huge variety of products such as chipsets for different devices. Consider this scenario: MediaTek maintains customer purchase information in one table and product details in another table. In addition to this, MediaTek maintains a table for the different regions where it sells its products. MediaTek wants to analyze chipset sales data for a specific region and calculate the total sales per chipset for the current year.
Create database tables for this scenario and write a SQL query that shows the total sales per chipset for the region named "North America" for the current year. (A sketch of possible table definitions follows the example data below.)
sales Example Input:

| sales_id | customer_id | product_id | region_id | sales_date | sales_amount |
|---|---|---|---|---|---|
| 1 | 1001 | 2001 | 1 | 01/02/2022 00:00:00 | 200 |
| 2 | 1002 | 2002 | 2 | 01/05/2022 00:00:00 | 300 |
| 3 | 1001 | 2001 | 1 | 02/10/2022 00:00:00 | 250 |
| 4 | 1002 | 2002 | 1 | 02/05/2022 00:00:00 | 350 |
| 5 | 1001 | 2001 | 2 | 03/08/2022 00:00:00 | 450 |
products Example Input:

| product_id | product_name | product_price |
|---|---|---|
| 2001 | Chipset A | 50 |
| 2002 | Chipset B | 70 |
regions Example Input:

| region_id | region_name |
|---|---|
| 1 | North America |
| 2 | Asia |
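Since the question also asks you to create the tables, here is a minimal sketch of the schema implied by the sample data (the column types and constraints are assumptions, written for PostgreSQL):

```sql
-- Hypothetical DDL inferred from the example data; adjust types as needed
CREATE TABLE regions (
    region_id   INT PRIMARY KEY,
    region_name TEXT NOT NULL
);

CREATE TABLE products (
    product_id    INT PRIMARY KEY,
    product_name  TEXT NOT NULL,
    product_price NUMERIC NOT NULL
);

CREATE TABLE sales (
    sales_id     INT PRIMARY KEY,
    customer_id  INT NOT NULL,
    product_id   INT NOT NULL REFERENCES products (product_id),
    region_id    INT NOT NULL REFERENCES regions (region_id),
    sales_date   TIMESTAMP NOT NULL,
    sales_amount NUMERIC NOT NULL
);
```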
SELECT p.product_id, p.product_name, SUM(s.sales_amount) AS total_sales FROM sales AS s JOIN products AS p ON s.product_id = p.product_id JOIN regions AS r ON s.region_id = r.region_id WHERE r.region_name = 'North America' AND EXTRACT(YEAR FROM s.sales_date) = EXTRACT(YEAR FROM NOW()) GROUP BY p.product_id, p.product_name;
This SQL query joins the three tables: sales, products, and regions. It filters for sales that occurred in the current year in the "North America" region. The SUM function calculates the total sales per product (chipset), and the GROUP BY clause groups the sales by product so that SUM produces a separate total for each one.
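One design note: applying EXTRACT directly to the sales_date column typically prevents a plain index on that column from being used. A sketch of an equivalent, index-friendly filter (assuming PostgreSQL) looks like this:

```sql
-- Same result, but sales_date is compared directly,
-- so an index on sales_date can be used
SELECT p.product_id,
       p.product_name,
       SUM(s.sales_amount) AS total_sales
FROM sales AS s
JOIN products AS p ON s.product_id = p.product_id
JOIN regions  AS r ON s.region_id  = r.region_id
WHERE r.region_name = 'North America'
  AND s.sales_date >= DATE_TRUNC('year', CURRENT_DATE)
  AND s.sales_date <  DATE_TRUNC('year', CURRENT_DATE) + INTERVAL '1 year'
GROUP BY p.product_id, p.product_name;
```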
Database normalization has several benefits:
Reduces Redundancy: Normalization can minimize redundancy by breaking down a larger, general table into smaller, more granular tables. This often reduces the amount of data that needs to be accessed for particular queries, since some duplicated columns can be removed.
Improves Data Integrity: Normalization can help to ensure the integrity of the data by minimizing the risk of data inconsistencies and errors. By establishing clear relationships between the tables via primary and foreign keys, and enforcing these constraints, you can have more reliable records and relationships stored in your DB.
Improves Performance: Normalization can improve the performance of the database by reducing the amount of data that needs to be stored and accessed in a single table. This results in faster query times and better overall performance.
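To make the idea concrete, here is a small, made-up example (the table and column names are hypothetical): a single wide orders table that repeats customer details on every row can be split into two related tables linked by a foreign key:

```sql
-- Denormalized: customer name and city are repeated on every order row
CREATE TABLE orders_denormalized (
    order_id      INT PRIMARY KEY,
    customer_name TEXT,
    customer_city TEXT,
    order_total   NUMERIC
);

-- Normalized: customer details are stored once and referenced by a key
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_city TEXT NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT NOT NULL REFERENCES customers (customer_id),
    order_total NUMERIC NOT NULL
);
```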
Given two tables, ads and clicks, where the ads table contains information about the different ads shown to users and the clicks table contains information about the clicks made by users on those ads, calculate the click-through-rate (CTR) for each ad. The click-through-rate (CTR) is calculated as number of clicks / number of ad impressions.
ads Sample Input:

| ad_id | date | impressions |
|---|---|---|
| 101 | 12/01/2022 | 1000 |
| 102 | 12/01/2022 | 2000 |
| 103 | 12/02/2022 | 1500 |
| 104 | 12/02/2022 | 2500 |
clicks Sample Input:

| click_id | ad_id | date | user_id |
|---|---|---|---|
| 201 | 101 | 12/01/2022 | 123 |
| 202 | 101 | 12/01/2022 | 456 |
| 203 | 102 | 12/01/2022 | 789 |
| 204 | 102 | 12/01/2022 | 123 |
| 205 | 103 | 12/02/2022 | 456 |
| 206 | 103 | 12/02/2022 | 789 |
| 207 | 104 | 12/02/2022 | 123 |
| 208 | 104 | 12/02/2022 | 456 |
Here is the SQL query that would solve for CTR:
SELECT a.ad_id, a.date, ((count(c.click_id)::decimal / a.impressions::decimal) * 100) as ctr FROM ads as a JOIN clicks as c ON a.ad_id = c.ad_id AND a.date = c.date GROUP BY a.ad_id, a.date, a.impressions ORDER BY a.ad_id;
This SQL query joins the ads and clicks tables on ad_id and date, groups the rows by ad_id, date, and impressions, counts the number of times each ad was clicked, divides that count by the number of times the ad was shown (impressions), and multiplies by 100 to get the CTR. The result is ordered by ad_id for easier reading.
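The ::decimal casts matter here: in PostgreSQL, dividing one integer by another performs integer division and truncates the result. A quick illustration (separate from the solution):

```sql
-- Integer division truncates the fractional part
SELECT 2 / 1000;                      -- 0

-- Casting either operand to decimal gives the exact ratio
SELECT 2::decimal / 1000;             -- 0.002
SELECT (2::decimal / 1000) * 100;     -- 0.2, i.e. a 0.2% CTR
```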
To solve a similar problem on DataLemur's free interactive SQL code editor, attempt this Meta SQL interview question:
A join in SQL combines rows from two or more tables based on a shared column or set of columns. To demonstrate the difference between a LEFT JOIN and a RIGHT JOIN, say you had a table of MediaTek orders and a table of MediaTek customers.

LEFT JOIN: A LEFT JOIN retrieves all rows from the left table (in this case, the Orders table) and any matching rows from the right table (the Customers table). If there is no match in the right table, NULL values are returned for the right table's columns.

RIGHT JOIN: A RIGHT JOIN retrieves all rows from the right table (in this case, the Customers table) and any matching rows from the left table (the Orders table). If there is no match in the left table, NULL values are returned for the left table's columns.
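A minimal sketch of both joins, assuming hypothetical orders and customers tables that share a customer_id column:

```sql
-- All orders, plus customer details where a matching customer exists;
-- orders with no matching customer still appear, with NULLs for c.*
SELECT o.order_id, o.customer_id, c.customer_name
FROM orders AS o
LEFT JOIN customers AS c ON o.customer_id = c.customer_id;

-- All customers, plus order details where a matching order exists;
-- customers with no orders still appear, with NULLs for o.*
SELECT o.order_id, c.customer_id, c.customer_name
FROM orders AS o
RIGHT JOIN customers AS c ON o.customer_id = c.customer_id;
```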
As a Data Analyst at MediaTek, a leading chip design company, your manager asks you to compute the average chip rating for each product for every month in the year 2021.
The chip_reviews table maintains the product review history and is formatted as follows:
chip_reviews Example Input:

| review_id | user_id | submit_date | product_id | rating |
|---|---|---|---|---|
| 1001 | 1035 | 2021-01-25 | 40001 | 4 |
| 1002 | 2085 | 2021-02-14 | 40001 | 5 |
| 1003 | 3078 | 2021-02-28 | 40001 | 4 |
| 1004 | 4089 | 2021-03-08 | 40002 | 3 |
| 1005 | 5093 | 2021-03-21 | 40002 | 2 |
| 1006 | 6015 | 2021-04-16 | 40003 | 5 |
| 1007 | 7019 | 2021-04-27 | 40003 | 4 |
| 1008 | 8099 | 2021-05-13 | 40003 | 3 |
The chip_reviews table includes the columns:

- review_id: the review's unique identifier
- user_id: the unique identifier of the user who gave the review
- submit_date: the date when the review was submitted (format is 'YYYY-MM-DD')
- product_id: the product's unique identifier
- rating: the rating given by the user to the product (range is from 1 to 5)

Return a table with the Month, Product ID, and the Average Rating for that product in that particular month.
chip_reviews Example Output:

| Month | Product | Avg Rating |
|---|---|---|
| 01 | 40001 | 4.00 |
| 02 | 40001 | 4.50 |
| 03 | 40002 | 2.50 |
| 04 | 40003 | 4.50 |
| 05 | 40003 | 3.00 |
Here is the SQL code that would solve the problem:
SELECT TO_CHAR(submit_date, 'MM') AS Month, product_id AS Product, ROUND(AVG(rating), 2) AS "Avg Rating" FROM chip_reviews WHERE submit_date >= '2021-01-01' AND submit_date < '2022-01-01' GROUP BY Month, Product ORDER BY Month, Product;
This SQL query first filters for reviews submitted in 2021. The TO_CHAR function extracts the month part from submit_date, the AVG function calculates the average rating for each product_id in each month, and the ROUND function rounds the average rating to 2 decimal places. The results are then ordered by Month and Product in ascending order.
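As a small illustration (separate from the solution), TO_CHAR returns the month as zero-padded text, which is why the output shows '01', '02', and so on, whereas EXTRACT would return a plain number:

```sql
-- TO_CHAR formats a date as text using a pattern
SELECT TO_CHAR(DATE '2021-02-14', 'MM');       -- '02'

-- EXTRACT returns a number instead, without the leading zero
SELECT EXTRACT(MONTH FROM DATE '2021-02-14');  -- 2
```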
What does DISTINCT do?

The DISTINCT keyword removes duplicate rows from the results of a SELECT query.
Suppose you had a table of MediaTek customers, and wanted to figure out which cities the customers lived in, but didn't want duplicate results.
mediatek_customers table:

| name | city |
|---|---|
| Akash | SF |
| Brittany | NYC |
| Carlos | NYC |
| Diego | Seattle |
| Eva | SF |
| Faye | Seattle |
You could write a query like this to filter out the repeated cities:
SELECT DISTINCT city FROM mediatek_customers;
Your result would be:
| city |
|---|
| SF |
| NYC |
| Seattle |
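DISTINCT also works inside aggregate functions; for example, to count how many distinct cities appear in the mediatek_customers table:

```sql
-- Counts each city once, no matter how many customers live there
SELECT COUNT(DISTINCT city) AS num_cities
FROM mediatek_customers;
```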
Given a list of MediaTek's customer records, can you write a SQL query to find all the customers who are based in North America and whose company name contains the term 'Tech'?
For this problem, consider the following customers table structure:
customers Example Input:

| customer_id | customer_name | region | creation_date |
|---|---|---|---|
| 5689 | BigTech Labs | North America | 08/05/2018 |
| 4132 | FutureTech Inc | North America | 12/21/2019 |
| 9564 | Global Innovations | Asia | 03/11/2020 |
| 6812 | TechEvolution | Europe | 06/25/2016 |
| 3456 | ProgressiveTech | North America | 01/15/2017 |
In this case, you are expected to return only the customers that are based in North America and whose names include the pattern 'Tech'.
SELECT * FROM customers WHERE region LIKE 'North America' AND customer_name LIKE '%Tech%';
This SQL query uses the LIKE keyword to filter customers based in North America and whose names contain the string 'Tech'. The '%' character is a wildcard that matches any sequence of characters; by placing it before and after 'Tech', the query matches any customer_name where 'Tech' appears anywhere in the string. Since the region filter contains no wildcard, region LIKE 'North America' behaves like an equality check, so region = 'North America' would work just as well.
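For reference, here is how a few LIKE patterns behave against the example customer_name values above:

```sql
-- 'Tech' anywhere in the name: matches BigTech Labs, FutureTech Inc,
-- TechEvolution, and ProgressiveTech
SELECT * FROM customers WHERE customer_name LIKE '%Tech%';

-- 'Tech' only at the start of the name: matches TechEvolution
SELECT * FROM customers WHERE customer_name LIKE 'Tech%';

-- Case-insensitive matching (PostgreSQL): ILIKE also matches 'tech', 'TECH', etc.
SELECT * FROM customers WHERE customer_name ILIKE '%tech%';
```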
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the MediaTek SQL interview is to solve as many practice SQL interview questions as you can!
In addition to solving the above MediaTek SQL interview questions, you should also solve the 200+ tricky SQL questions on DataLemur which come from companies like Netflix, Airbnb, and Amazon.
Each exercise has hints to guide you and detailed solutions, and best of all, there is an interactive SQL code editor so you can write your query right in the browser and have it executed.
To prep for the MediaTek SQL interview, it can also be wise to practice SQL questions from other tech companies like:
In case your SQL skills are weak, forget about diving straight into solving questions – go learn SQL with this SQL tutorial for Data Analytics.
This tutorial covers things like handling strings and handling timestamps – both of which pop up frequently during MediaTek SQL interviews.
In addition to SQL interview questions, the other topics tested in the MediaTek Data Science Interview are:
The best way to prepare for MediaTek Data Science interviews is by reading Ace the Data Science Interview. The book's got: