logo

8 Mercari SQL Interview Questions (Updated 2024)

Updated on

March 3, 2024

At Mercari, SQL is used often for extracting and analyzing sales data for market trends, and cleaning and managing product listing data for accurate recommendations. For this reason Mercari asks SQL coding questions during interviews for Data Science, Analytics, and & Data Engineering jobs.

Thus, to help you practice for the Mercari SQL interview, we've collected 8 Mercari SQL interview questions – able to answer them all?

8 Mercari SQL Interview Questions

SQL Question 1: Identifying Power Users in the Mercari Platform

As a data analyst for Mercari, you are required to analyze the user activity on the platform. Power users or VIP uses are individuals who often carry out crucial activities for the business, such as posting listings, making purchases, and providing reviews. Consequently, your task is to identify users who have listed more than 50 items and made more than 20 purchases within the last year.

For this SQL question, you are given two tables: and .

listing_iduser_idlisting_dateproduct_id
617112306/08/202150001
780212306/10/202169852
529312306/18/202150001
635219207/26/202169852
451798107/05/202169852

purchase_iduser_idpurchase_dateproduct_id
891012306/18/202150001
473219207/26/202169852
236598107/05/202169852
784512306/10/202169852
693212306/08/202150001

Note: Here is a unique identifier associated with each user. and are the dates when a user listed a product and made a purchase, respectively.

Answer:


This query first creates two subqueries: one to count the number of products listed by each user in the past year (), and another to count the number of products purchased by each user in the past year (). The main query then joins these two subqueries on and filters for users who have listed over 50 items and purchased over 20 items in the past year.

To solve a related super-user data analysis question on DataLemur's free online SQL code editor, try this recently asked Microsoft SQL interview question: Microsoft SQL Interview Question: Teams Super User

SQL Question 2: Ranking Products with Average Revenue

You have been assigned the task of creating an analysis report for Mercari, the largest community-powered shopping app in Japan. Your boss wants to see a monthly ranking of the products based on total revenue, and how the revenue changes compared with the previous month.

You are given the table "sales", including the columns "sale_id"(unique identifier for the sales), "product_id"(id of the product sold), "user_id"(id of the user who purchased the product), "sale_date"(the date when the sale took place), and "revenue"(the revenue generated from the sale).

Example Input:
sale_iduser_idsale_dateproduct_idrevenue
100132106/01/202288102500
100265406/10/202288103200
100398706/15/202239101000
100432107/01/202288102700
100544407/10/202239101200
Example Output:
year_monthproduct_idrevenueprev_month_revenuerevenue_change
2022-0688105700nullnull
2022-0639101000nullnull
2022-07881027005700-3000
2022-07391012001000200

Answer:


This query first calculates the monthly total revenue for each product in a CTE (Common Table Expression), then it calculates the revenue of the previous month for each product with the LAG window function. In the final SELECT statement, it calculates the change in revenue from the previous month and returns the result. The COALESCE function is used to deal with the null value for the first month's data, replacing it with 'null'.

Pro Tip: Window functions are a popular SQL interview topic, so practice every window function problem on DataLemur

DataLemur SQL Questions

SQL Question 3: Would a UNION ALL and a FULL OUTER JOIN produce the same result?

No, in 99% of normal cases a and do NOT produce the same result.

You can think of more as set addition when it combines the results of two tables.

You can think of a more as set multiplication, producing all combos of table 1 with combos of table 2.

Mercari SQL Interview Questions

SQL Question 4: Mercari User Transaction Analysis

You are working as a data scientist at Mercari, where you are tasked with analyzing data related to user transactions. The user transaction table includes details such as User ID, Product ID, Transaction time and Transaction price.

Your task is to find out the total transaction price for each user in the month of June.

Example Input
transaction_iduser_idtransaction_timeproduct_idtransaction_price
11112306/01/2022 00:00:005000150
11226506/05/2022 00:00:0069852100
11312306/10/2022 00:00:005000175
11419207/20/2022 00:00:0069852150
11598107/05/2022 00:00:0069852200

Answer:


This query works by extracting the month from the transaction time and only considering the records where the month of the transaction is June. It groups by the user_id and sums up the transaction_price to get the total transaction price for each user for June.

Expected output, given the above table as input:

user_idmonthtotal_price
1236125
2656100

Then the question assesses the candidate's ability to extract meaningful information from transaction data, a common task for data scientists working in e-commerce companies. It also tests the candidate's knowledge of SQL functions such as , , and attention to date-based queries.

SQL Question 5: What are the similarities and difference between relational and NoSQL databases?

A non-relational (NoSQL) database is any database that does not use the typical tabular format of rows and columns like in relational databases.

While knowing the four different types of NoSQL databases is beyond the scope of most Data Analyst & Data Science interviews, Data Engineers at Mercari should vaguely refresh these concepts:

  • Document Databases – this database is designed for storing and querying retrieving document data (where each key is associated with a flexible document)
  • Key-Value Stores – these databases uses keys where each key is associated with only one value in a collection (similar to a Python dictionary data structure!)
  • Wide-Column Stores – this database uses tables, rows, and columns, but unlike a relational database, the names and format of the columns can vary from row to row within the same table. Remember: "-With great flexibility comes great responsiblity-" – Batman if he was learning about NoSQL databases!
  • Graph Stores – represents data in terms of the graph data structure, with nodes and edges between entities

SQL Question 6: Filter Customers at Mercari

Given the Mercari customer database, write a SQL query to filter out customers who made purchases in the last 6 months and spent over 1000 USD in total. Also, the returned results should include only those customers who have given an average review rating of at least 4 for their purchased products.

Example Input:
customer_idfirst_namelast_nameemailjoin_date
3221JohnDoejohn.doe@email.com06/13/2020
9783JaneSmithjane.smith@email.com02/07/2019
4532EmilyJohnsonemily.johnson@email.com11/21/2021
Example Input:
purchase_idcustomer_idproduct_idpurchase_dateamount_spent
110132215890104/12/2022200.00
187297833784205/16/2022800.00
213432214012706/05/2022890.00
Example Input:
review_idcustomer_idproduct_idreview_datestars
413232215890104/20/20225
789032214012706/12/20224
567897833784205/23/20223

Answer:


This query first filters customers from the table who made a purchase within the last 6 months and spent more than 1000 USD in total. Then from this filtered list, it further filters customers whose average review rating is at least 4. Finally, these filtered customers are returned from the table.

SQL Question 7: What is denormalization?

Denormalization is the practice of altering a database schema in a way that breaks the normalization rules (1st, 2nd, 3rd normal forms).

Denormalization is done to improve the performance of the database when it is being used for online analytics processing (OLAP), rather than online transaction processing (OLTP) use cases.

For these OLAP use cases, you're bottleneck frequently is joining multiple tables, but de-normalizing your database cuts these outs.

Denormalization can improve query performance by reducing the number of expensive joins required to retrieve data, but it comes with its own drawbacks too. By adding redundant columns, you incur more data storage costs, and there's need for more complex update and delete operations in order to maintain data integrity across all the duplicated data. Thus, it's important to carefully consider the trade-offs involved before implementing denormalization.

SQL Question 8: Calculate The Average Price Of Products Sold Each Month

Assume we're working with a platform like Mercari which is an online marketplace for buying and selling items. You want to answer the following question: What was the average price of the products sold each month?

Example Input:
sale_idproduct_idsell_dateprice
1015506/01/2022700
2027506/02/2022150
3036707/20/2022950
4048907/25/2022500
5054508/03/20221200
Example Output:
monthavg_price
6425
7725
81200

Answer:


This query first extracts the month from the using the function. It then groups by the extracted month to calculate the average for each group which is each month. The is used to sort the results by the month.

Preparing For The Mercari SQL Interview

The best way to prepare for a Mercari SQL interview is to practice, practice, practice. Beyond just solving the earlier Mercari SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Google, Facebook, Microsoft and Amazon. DataLemur Question Bank

Each problem on DataLemur has hints to guide you, step-by-step solutions and best of all, there's an online SQL code editor so you can instantly run your query and have it executed.

To prep for the Mercari SQL interview you can also be wise to practice SQL problems from other tech companies like:

But if your SQL foundations are weak, forget about going right into solving questions – improve your SQL foundations with this DataLemur SQL tutorial.

SQL tutorial for Data Analytics

This tutorial covers things like handling timestamps and rank window functions – both of these come up frequently in Mercari interviews.

Mercari Data Science Interview Tips

What Do Mercari Data Science Interviews Cover?

In addition to SQL query questions, the other topics to prepare for the Mercari Data Science Interview are:

  • Statistics and Probability Questions
  • Python or R Programming Questions
  • Business Sense and Product-Sense Questions
  • ML Interview Questions
  • Behavioral Interview Questions

Mercari Data Scientist

How To Prepare for Mercari Data Science Interviews?

To prepare for Mercari Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from companies like Microsoft, Google & Amazon
  • a crash course covering SQL, AB Testing & ML
  • over 900+ 5-star reviews on Amazon

Ace the DS Interview