logo

10 Coupang SQL Interview Questions (Updated 2024)

Updated on

February 14, 2024

At Coupang, SQL is used across the company for analyzing customer purchasing patterns for targeted marketing campaigns and optimizing inventory management by predicting product demand. That's why Coupang almost always evaluates jobseekers on SQL coding questions during interviews for Data Analytics, Data Science, and Data Engineering jobs.

To help you prepare for the Coupang SQL interview, we've collected 10 Coupang SQL interview questions – able to answer them all?

10 Coupang SQL Interview Questions

SQL Question 1: Calculate monthly average review score per product

At Coupang, customer feedback via product reviews is an essential indicator of product quality and customer satisfaction. Your task is to write a SQL query to calculate the monthly average review score for each product. You will use the table named , which has the following columns:

  • review_id: A unique identifier for each review
  • user_id: The unique identifier of the user who left the review
  • submit_date: The date when the review was submitted
  • product_id: The identifier for the product
  • stars: The number of stars given for the product in the review.

Sample table:

review_iduser_idsubmit_dateproduct_idstars
61711232022-06-08500014
78022652022-06-10698524
52933622022-06-18500013
63521922022-07-26698523
45179812022-07-05698522

Answer:


This PostgreSQL query groups reviews by month of submission and product_id, then applies the AVG function to compute the average stars for each month-product group. The result is ordered by month and product_id.

SQL Question 2: Filter Customers Based on Purchase History and Location

For Coupang, an important task might be to identify customers who have made purchases in certain product categories and who live in particular regions. This information can be used for targeted promotions, personalized recommendations and more. Imagine the company has a specific promotion for customers in the Seoul region who have purchased electronics in the past.

Given two tables, and , write a SQL query that returns the customer_id, customer_name, and region for customers who live in Seoul and have purchased electronics.

The table has the following structure:

Example Input:
customer_idcustomer_nameregion
1John DoeSeoul
2Jane SmithBusan
3Will KimDaegu
4Sarah ParkSeoul

The table has the following structure:

Example Input:
purchase_idcustomer_idproduct_category
50011electronics
50022books
50033electronics
50044health & beauty
50051books

Answer:


This PostgreSQL query joins the and tables on the field. It then filters for customers in the Seoul region who have made a purchase in the electronics product category. Such a SQL query could be helpful in the scenario described above, where targeted promotions for certain regions and product categories are being planned.

SQL Question 3: What are the different normal forms (NF)?

Normalization is the process of organizing fields and tables of a database to minimize redundancy and dependency. While there are technically 5 levels (normal forms), the 3 most important normal forms you need to know about for SQL interviews at Coupang are:

  1. First Normal Form (1NF): This should fix remove a table's duplicate columns. Also, each column should contain only a single value (no lists or containers of data), and finally each row of table should have a unique identifier as well.
  2. Second Normal Form (2NF): A table is in its second normal form if it meets all requirements of the first normal form and places the subsets of columns in separate tables. The relationships between tables are created using primary/foreign keys.
  3. Third Normal Form (3NF): The table should be in the second normal form. There should be no dependency on another non-key attribute (meaning a primary key should be the only thing required to identify the row).

Coupang SQL Interview Questions

SQL Question 4: Analyzing Click-through Conversion Rates

Coupang, a large e-commerce company, runs various ads campaign to boost its sales. Users click on these ads, view the products and then may add those products into their cart. The company wants to measure the efficiency of these campaigns by analyzing their click-through conversion rates, specifically from viewing a product to adding that product to the cart.

Auxiliary tables have been created to help in this analysis. The table contains a record each time a user clicks on an ad. The table records an entry each time a user views a product from the ad click. The table records an entry whenever a product is added to the cart.

Example Input:
ad_iduser_idclicked_atproduct_id
10011012022-06-01 10:05:455001
10021022022-06-01 11:15:365002
10031012022-06-01 12:45:505001
10041022022-06-01 13:20:455002
Example Input:
view_iduser_idviewed_atproduct_id
20101012022-06-01 10:10:455001
20201022022-06-01 11:20:365002
Example Input:
addition_iduser_idadded_atproduct_id
40011012022-06-01 12:05:455001
40021022022-06-01 14:20:365002

The task is to write a SQL query that will give the number of ads clicked, products viewed, and products added to the cart for each product.

Answer:


The above query first joins with on and to get number of ad clicks and product views for each product. It then uses a subquery to count the number of additions for each product from table. This count function in the subquery and the main query helps in counting the number of distinct ads clicked, products viewed and added to the cart. The COALESCE function ensures that if there are no cart additions for a given product_id, the num_products_added is returned as 0.

To practice a similar problem about calculating rates, try this TikTok SQL question on DataLemur's online SQL coding environment: SQL interview question from TikTok

SQL Question 5: How does an inner join differ from a full outer join?

An inner join only includes rows from both tables that satisfy the join condition, whereas a full outer join includes all rows from both tables, regardless of whether they match the join condition or not.

For example, suppose you had a table of Coupang orders and Coupang customers.

INNER JOIN: When there is a match in the shared key or keys, rows from both tables are retrieved. In this example, an between the Orders and Customers tables would retrieve rows where the in the Orders table matches the in the Customers table.

FULL OUTER JOIN: A retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.

SQL Question 6: Average Daily Sales of Products

As a data analyst for Coupang, we have a table named 'sales' that logs every sale made on our platform. Each row represents a transaction with columns for the transaction_id, product_id, customer_id, amount, and transaction_date.

In an effort to understand our daily sales performance better, we want to write a query that gives the average amount of sales of each product on each day.

Example Input:
transaction_idcustomer_idtransaction_dateproduct_idamount
10011112020-10-200001100
10022222020-10-200002150
10031112020-10-21000150
10042222020-10-230002200
10053332020-10-250001150
Example Output:
transaction_dateproduct_idavg_sales
2020-10-200001100.00
2020-10-200002150.00
2020-10-21000150.00
2020-10-230002200.00
2020-10-250001150.00

Answer:


This query groups the sales data first by the transaction_date then the product_id. The AVG aggregate function is then used to compute the average amount of sales for each product on each day.

SQL Question 7: What is a foreign key?

A foreign key is a column or group of columns in a table that refers to the primary key in another table. The foreign key constraint helps maintain referential integrity between the two tables.

Let's examine employee data from Coupang's HR database:

:

+------------+------------+------------+------------+ | employee_id| first_name | last_name | manager_id | +------------+------------+------------+------------+ | 1 | Aubrey | Graham | 3 | | 2 | Marshal | Mathers | 3 | | 3 | Dwayne | Carter | 4 | | 4 | Shawn | Carter | | +------------+------------+------------+------------+

In this table, serves as the primary key and functions as a foreign key because it links to the of the employee's manager. This establishes a relationship between Coupang employees and their managers, allowing for easy querying to find an employee's manager or see which employees report to a specific manager.

The table may also have multiple foreign keys that reference primary keys in other tables. For example, and foreign keys could be used to connect each employee to their respective department and location.

SQL Question 8: Find Customers With Unusual Names

Suppose Coupang is interested in discovering more about their customer base. A unique marketing initiative is being launched to reach out to customers with unusual names. For this purpose, they are interested in finding any customer whose name contains the word 'star'.

Provide a list of customer IDs and names from their customer database who satisfy this criterion.

Example Input:
customer_idname
1John Doe
2Starla Bright
3Sarah Star
4Michael Jones
5Starson Richards
Example Output:
customer_idname
2Starla Bright
3Sarah Star
5Starson Richards

Answer:


In this query, we're making use of the keyword in SQL and the wildcard character. The character is used to represent any number of characters (including zero characters). Therefore, will match any string that contains 'star' at any position.

This query searches through the column in the table, and selects the and for any name that contains the string 'star'. The names are not case-sensitive. If you need the query to be case-sensitive, use instead of .

SQL Question 9: Calculate Discount Impact on Sales

In Coupang, the Pricing team would like to understand the impact of discount on sales. They have a table named that records the original price, the discount applied to it (if any), and the quantity sold for each product for each day. They would like you to calculate the total revenue without discount and with discount, then calculate the percentage of revenue retained after applying the discount. Round off the percentage to 2 decimal places.

Example Input:
dateproduct_idoriginal_pricediscountquantity_sold
2022-06-010011002050
2022-06-020011003075
2022-06-030011001025
2022-06-0100220010100
2022-06-0200220020150
2022-06-030022003050
Example Output:
product_idrevenue_without_discountrevenue_with_discountpercentage_revenue_retained
00112500.009900.0079.20
00258000.0046000.0079.31

Answer:


The above SQL block groups the data by product_id. For each product, it calculates the total revenue without discount by multiplying the original price per unit with the quantity sold. It then calculates the revenue after applying the discount by multiplying the original price per unit (after adjusting for discount) with the quantity sold. The percentage of revenue retained is then calculated by dividing the revenue after discount by the revenue before discount and multiplied by 100. The ROUND() function is used to limit the result to 2 decimal places.

To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating top grossing items> or this Wayfair Y-on-Y Growth Rate Question which is similar for calculating yearly growth rate sales.

SQL Question 10: How does and differ?

The clause is used to filter the groups created by the clause. It's similar to the clause, but it is used to specify conditions on the groups created by the clause, rather than on the individual rows of the table.

For example, say you were analyzing salaries for analytics employees at Coupang:


This query retrieves the total salary for each Analytics department at Coupang and groups the rows by the specific department (i.e. "Marketing Analytics", "Business Analytics", "Sales Analytics" teams).

The clause then filters the groups to include only Coupang departments where the total salary is greater than $1 million

Coupang SQL Interview Tips

Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Coupang SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the earlier Coupang SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like FAANG and tech startups. DataLemur SQL Interview Questions

Each SQL question has hints to guide you, step-by-step solutions and best of all, there is an interactive coding environment so you can right online code up your SQL query and have it checked.

To prep for the Coupang SQL interview you can also be useful to practice SQL questions from other tech companies like:

In case your SQL coding skills are weak, don't worry about jumping right into solving questions – go learn SQL with this DataLemur SQL tutorial.

SQL interview tutorial

This tutorial covers SQL concepts such as LAG window function and math functions – both of these pop up frequently in Coupang SQL assessments.

Coupang Data Science Interview Tips

What Do Coupang Data Science Interviews Cover?

Beyond writing SQL queries, the other types of problems covered in the Coupang Data Science Interview are:

Coupang Data Scientist

How To Prepare for Coupang Data Science Interviews?

To prepare for Coupang Data Science interviews learn more about their internal AI initiatives from the Coupang company blog. Also, read the book Ace the Data Science Interview because it's got:

  • 201 interview questions taken from tech companies like Google & Microsoft
  • a crash course on SQL, Product-Sense & ML
  • over 900+ reviews on Amazon & 4.5-star rating

Ace the Data Science Interview Book on Amazon