At eMemory Technology, SQL is used frequently for analyzing semiconductor manufacturing data for operational optimization. Because of this, eMemory loves to ask SQL query questions during interviews for Data Science, Analytics, and Data Engineering jobs.
As such, to help you prepare for the eMemory SQL interview, we've collected 8 eMemory Technology SQL interview questions – can you solve them?
You are given a table that records the daily memory usage of various products. Your task is to write a SQL query that, for each day present in the data, calculates the average memory usage of each product over the last 7 days.
day | product_id | vendor_id | usage_mb |
---|---|---|---|
2022-07-01 | 125 | 301 | 2000 |
2022-07-01 | 157 | 302 | 2500 |
2022-07-02 | 125 | 301 | 2100 |
2022-07-02 | 157 | 302 | 2600 |
2022-07-03 | 125 | 301 | 1950 |
2022-07-03 | 157 | 302 | 2550 |
2022-07-04 | 125 | 301 | 2200 |
2022-07-04 | 157 | 302 | 2700 |
2022-07-05 | 125 | 301 | 2150 |
2022-07-05 | 157 | 302 | 2750 |
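A PostgreSQL query along these lines would work (the table name `memory_usage` is an assumption, since the prompt does not specify it):

```sql
-- Table name memory_usage is assumed; columns match the sample data
SELECT
  day,
  product_id,
  AVG(usage_mb) OVER (
    PARTITION BY product_id
    ORDER BY day
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS avg_usage_7d
FROM memory_usage;
```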
This query uses the AVG() window function with a ROWS BETWEEN 6 PRECEDING AND CURRENT ROW frame to compute the average memory usage over the last 7 days for each day in the data, partitioned by product_id. The ROWS BETWEEN clause specifies the window frame, i.e., the rows to consider for the calculation. In this case, the frame is the current row and the 6 preceding rows, making it a total of 7 days.
For more window function practice, try this Uber SQL Interview Question on DataLemur's online SQL coding environment:
You have been given two tables, sales and products. The sales table tracks the sale of memory components by product_id and the quantity sold, along with the sale_date. The products table holds information about each product: its product_id, its name, and its category.
Your task is to write a SQL query that returns the total quantity of each product sold in the year 2021 by category.
sale_id | product_id | sale_date | quantity |
---|---|---|---|
1 | 9001 | 2021-01-15 | 150 |
2 | 9012 | 2021-03-18 | 200 |
3 | 9001 | 2021-04-10 | 120 |
4 | 9050 | 2021-06-20 | 270 |
5 | 9012 | 2021-12-15 | 400 |
product_id | name | category |
---|---|---|
9001 | Memory Chip A | Memory Chip |
9012 | Memory Chip B | Memory Chip |
9050 | Memory Card A | Memory Card |
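One possible PostgreSQL solution, assuming the tables are named `sales` and `products` as described above:

```sql
SELECT
  p.category,
  p.name,
  SUM(s.quantity) AS total_quantity
FROM sales s
JOIN products p
  ON s.product_id = p.product_id
WHERE EXTRACT(YEAR FROM s.sale_date) = 2021
GROUP BY p.category, p.name;
```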
This SQL query uses a JOIN clause to combine rows from the two tables based on a related column between them, which in this case is product_id. The query also uses the SUM() function to get the total quantity of each product sold. The GROUP BY statement groups rows that have the same values in the category and name columns. The WHERE clause, combined with the EXTRACT() function, limits the results to the year 2021.
Here is an example of a clustered index on a table of eMemory customer transactions:
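A sketch in SQL Server syntax (PostgreSQL does not support a CLUSTERED keyword); the table and column names here are illustrative, not given by the source:

```sql
-- transactions table and transaction_date column are assumed names
CREATE CLUSTERED INDEX ix_transactions_date
ON transactions (transaction_date);
```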
This creates a clustered index, which will determine the physical order of the data rows in the table.
Here is an example of a non-clustered index on the same table:
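Again in SQL Server syntax, with illustrative names:

```sql
-- customer_id is an assumed column name
CREATE NONCLUSTERED INDEX ix_transactions_customer
ON transactions (customer_id);
```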
This will create a non-clustered index, which will not affect the physical order of the data rows in the table.
In terms of query performance, a clustered index is usually faster for searches that return a large number of records, while a non-clustered index is faster for searches that return a small number of records. However, updates to a clustered index are slower, as they require the data rows to be physically rearranged, while updates to a non-clustered index are faster, as they only require the index data structure to be updated.
Suppose you are a database analyst at eMemory, a company that provides a web-based application for users to store and organize digital content. You are tasked with finding the average time a user spends on the application.
The company has a table of user sessions, where each entry records when a user opened and closed the application:
session_id | user_id | start_time | end_time |
---|---|---|---|
101 | 1 | 2022-07-15 08:00:00 | 2022-07-15 08:30:00 |
102 | 1 | 2022-07-15 09:00:00 | 2022-07-15 09:30:00 |
103 | 2 | 2022-07-15 08:15:00 | 2022-07-15 08:45:00 |
104 | 2 | 2022-07-15 09:00:00 | 2022-07-15 09:30:00 |
105 | 3 | 2022-07-15 08:50:00 | 2022-07-15 09:10:00 |
You should return a table with the average session times, in minutes, for each user:
user_id | avg_session_time |
---|---|
1 | 30 |
2 | 30 |
3 | 20 |
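A PostgreSQL query that produces this output (the table name `sessions` is assumed, since the prompt does not specify it):

```sql
-- sessions table name is assumed; columns match the sample data
SELECT
  user_id,
  ROUND(AVG(EXTRACT(EPOCH FROM (end_time - start_time)) / 60)::numeric, 2)
    AS avg_session_time
FROM sessions
GROUP BY user_id
ORDER BY user_id;
```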
In the given SQL query, we extract each user and their average time spent on the web application. We use EXTRACT(EPOCH FROM ...) on the difference between end_time and start_time to get the session length in seconds, which we then divide by 60 to convert to minutes. The AVG() function is applied to these durations, giving us the average session time for each user. ROUND() rounds the output to two decimal places, and the GROUP BY clause calculates these aggregates for each distinct user_id.
To practice a very similar question try this interactive Snapchat Sending vs. Opening Snaps Question which is similar for analyzing time spent on app usage or this Amazon Server Utilization Time Question which is similar for calculating total application run time.
A cross-join, also known as a cartesian join, is like a mad scientist's laboratory experiment gone wild. It takes two tables and mixes them together to create a crazy new table with every possible combination of rows from the original tables.
Here's an example:
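For instance, pairing every product with every color (both tables are hypothetical, for illustration only):

```sql
-- products and colors are assumed tables
SELECT p.product_name, c.color_name
FROM products p
CROSS JOIN colors c;
```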
If you have 20 products and 10 colors, that's 200 rows right there! Cross-joins are great for generating all possible combinations, but they can also create really big tables if you're not careful. Just like a mad scientist, use your powers wisely!
As a data analyst for eMemory, you want to analyze the success of certain digital campaigns. eMemory has been running various campaigns and you want to calculate the click-through conversion rate of these campaigns. The click-through conversion rate is the proportion of users who add a product to their cart after seeing it on a campaign.
Two tables are provided: impressions and interactions.
The impressions table records details of each campaign impression (i.e., each time the promotional materials were displayed). The interactions table tracks user actions, including when a user adds a product to their cart.
Example Input (impressions):
impression_id | campaign_id | user_id | impression_time |
---|---|---|---|
1 | 1 | 100 | 2021-07-01 12:00:00 |
2 | 2 | 101 | 2021-07-02 13:00:00 |
3 | 1 | 102 | 2021-07-03 14:00:00 |
4 | 2 | 101 | 2021-07-04 15:00:00 |
5 | 1 | 103 | 2021-07-05 16:00:00 |
Example Input (interactions):
interaction_id | user_id | interaction_time | action |
---|---|---|---|
1 | 100 | 2021-07-01 12:30:00 | add_to_cart |
2 | 101 | 2021-07-02 14:30:00 | add_to_cart |
3 | 102 | 2021-07-03 15:00:00 | view |
4 | 101 | 2021-07-04 16:00:00 | add_to_cart |
5 | 103 | 2021-07-05 16:30:00 | view |
Calculate the click-through conversion rate for each campaign for the month of July 2021. Assume that a conversion is counted if a user adds a product to their cart within an hour of seeing a campaign.
Using PostgreSQL and assuming that we want the conversion rate to be a percentage:
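One way to write it, assuming the tables are named `impressions` and `interactions` as above:

```sql
SELECT
  i.campaign_id,
  ROUND(
    100.0 * COUNT(DISTINCT CASE
      WHEN a.action = 'add_to_cart'
       AND a.interaction_time BETWEEN i.impression_time
           AND i.impression_time + INTERVAL '1 hour'
      THEN i.impression_id
    END)
    / NULLIF(COUNT(DISTINCT i.impression_id), 0),
    2
  ) AS conversion_rate_pct
FROM impressions i
LEFT JOIN interactions a
  ON i.user_id = a.user_id
WHERE i.impression_time >= '2021-07-01'
  AND i.impression_time <  '2021-08-01'
GROUP BY i.campaign_id;
```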
This query first identifies which impressions resulted in a conversion, and computes the total number of unique impressions. Then, it calculates the click-through conversion rate as the number of conversions divided by the number of impressions, for each campaign. The NULLIF() function is used to avoid division by zero in case there were campaigns that had no impressions. A LEFT JOIN ensures that we still get data for campaigns that had impressions but no conversions.
To solve another question about calculating rates, solve this TikTok SQL Interview Question within DataLemur's online SQL code editor:
The COALESCE() function can take in multiple parameters, and returns the first input parameter that is not null. If all arguments are null, the COALESCE() function will return null too.
Suppose you have a table of eMemory salespeople and the number of deals they closed. If a salesperson didn't close any deals, the data from the 3rd-party CRM system exports a NULL value.
sales_person | closed_deals |
---|---|
Jason Wright | NULL |
Drew Jackson | 3 |
Chris Ho | NULL |
Adam Cohen | 2 |
Samantha Perez | 4 |
To change these NULLs to zeros, you can use the COALESCE() function in the following way:
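For example (the table name `salespeople` is assumed):

```sql
-- salespeople table name is assumed; columns match the sample data
SELECT
  sales_person,
  COALESCE(closed_deals, 0) AS closed_deals
FROM salespeople;
```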
This would result in the following data:
sales_person | closed_deals |
---|---|
Jason Wright | 0 |
Drew Jackson | 3 |
Chris Ho | 0 |
Adam Cohen | 2 |
Samantha Perez | 4 |
Assuming that eMemory sells various memory-related products, a natural question is: "What is the average price of each product sold per month?"
We will need two tables: 'sales' and 'products'. The 'sales' table has information about each transaction, and the 'products' table has details about each product.
product_id | product_name | product_type |
---|---|---|
50001 | Memory Unit A | RAM |
69852 | Memory Unit B | SSD |
10012 | Memory Unit C | RAM |
98562 | Memory Unit D | HDD |
sale_id | user_id | sale_date | product_id | sale_price |
---|---|---|---|---|
1 | 123 | 2022-07-18 | 50001 | 150 |
2 | 265 | 2022-07-16 | 69852 | 200 |
3 | 362 | 2022-06-15 | 50001 | 180 |
4 | 192 | 2022-07-09 | 69852 | 210 |
5 | 981 | 2022-06-23 | 69852 | 190 |
We want to find the average sale price across the months for each product.
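A PostgreSQL sketch using the `sales` and `products` tables above (TO_CHAR() is one way to get the month name; the exact formatting function used is an assumption):

```sql
SELECT
  TO_CHAR(s.sale_date, 'FMMonth') AS sale_month,
  p.product_name,
  AVG(s.sale_price) AS average_price
FROM sales s
JOIN products p
  ON s.product_id = p.product_id
GROUP BY
  TO_CHAR(s.sale_date, 'FMMonth'),
  EXTRACT(MONTH FROM s.sale_date),
  p.product_name
ORDER BY
  EXTRACT(MONTH FROM s.sale_date),
  average_price DESC;
```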
The above PostgreSQL query first joins the sales table with the products table using the common column product_id. It then groups by the sale month and the product name while calculating the average sale_price. The result is sorted by sale_month and then by average_price in descending order. This gives us the average price of each product sold per month.
Please make sure the date format in your database matches the format expected by the date-formatting function you use (such as TO_CHAR()); if it doesn't match, you need to adjust it accordingly.
sale_month | product_name | average_price |
---|---|---|
June | Memory Unit B | 190 |
June | Memory Unit A | 180 |
July | Memory Unit B | 205 |
July | Memory Unit A | 150 |
The key to acing an eMemory SQL interview is to practice, practice, and then practice some more! Besides solving the earlier eMemory SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Netflix, Google, and Amazon.
Each exercise has multiple hints, full answers and crucially, there is an interactive coding environment so you can instantly run your SQL query answer and have it checked.
To prep for the eMemory SQL interview, it can also be useful to solve SQL questions from other tech companies like:
However, if your SQL query skills are weak, forget about jumping right into solving questions – go learn SQL with this SQL tutorial for Data Analytics.
This tutorial covers SQL topics like transforming strings with CONCAT()/LOWER()/TRIM() and UNION vs. UNION ALL – both of these come up frequently in eMemory interviews.
In addition to SQL query questions, the other types of questions to practice for the eMemory Data Science Interview are:
The best way to prepare for eMemory Data Science interviews is by reading Ace the Data Science Interview. The book's got: