9 Bukalapak SQL Interview Questions (Updated 2024)

Updated on June 23, 2024

Data Science, Data Engineering and Data Analytics employees at Bukalapak write ad-hoc SQL queries all the damn time for work. At Bukalapak, SQL is used for analyzing user shopping patterns and optimizing inventory management algorithms in its dynamic e-commerce marketplace. That's why Bukalapak asks prospective hires SQL interview questions.

As such, to help you prep for the Bukalapak SQL interview, we'll cover 9 Bukalapak SQL interview questions in this blog.

9 Bukalapak SQL Interview Questions

Note that the values in the sample data are created for the sole purpose of this question and do not represent real-world statistics or trends.

SQL Question 1: Identify VIP users based on purchase quantity

As a keen data scientist at Bukalapak, your task is to identify the premium users, dubbed 'VIP users'. These are users who frequently make large orders. Precisely, a user is considered a 'VIP user' if they have placed at least 5 orders of 10 or more items each in the past two months.

Example Input:
order_id | user_id | order_date | items_quantity
1 | 100 | 2022-08-10 | 12
2 | 200 | 2022-09-15 | 8
3 | 100 | 2022-09-20 | 15
4 | 300 | 2022-07-25 | 10
5 | 100 | 2022-09-28 | 13
6 | 100 | 2022-08-03 | 11
7 | 200 | 2022-08-18 | 2
8 | 100 | 2022-09-01 | 9
9 | 300 | 2022-09-17 | 14
10 | 100 | 2022-08-24 | 10

Answer:

As for the solution, please use the following PostgreSQL query:
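A minimal sketch of such a query, assuming the input table is named orders (the table name is not given above) and that order_date is a DATE column:

```sql
-- Assumption: the input table is called "orders"; the name is hypothetical.
SELECT user_id
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '2 months'  -- orders from the past two months
  AND items_quantity >= 10                               -- only large orders count
GROUP BY user_id
HAVING COUNT(*) >= 5;                                    -- at least 5 such orders
```

Because the sample rows are dated 2022, running this today returns no rows. Swapping CURRENT_DATE for a fixed reference date such as DATE '2022-10-01' makes user 100 the only VIP in the sample data.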


This query filters the orders made in the past 2 months and counts the number of orders from each user where the item quantity is 10 or more. It then outputs the user_id of users who have at least 5 such orders, thus identifying the VIP users.

To practice a similar power-user data analysis question on DataLemur's free interactive SQL code editor, try this recently asked Microsoft SQL interview question: Teams Super User

SQL Question 2: Top 3 Department Salaries

Imagine you had a table of Bukalapak employee salary data. Write a SQL query to find the top 3 highest paid employees in each department.

Bukalapak Example Input:

employee_id | name | salary | department_id
1 | Emma Thompson | 3800 | 1
2 | Daniel Rodriguez | 2230 | 1
3 | Olivia Smith | 2000 | 1
4 | Noah Johnson | 6800 | 2
5 | Sophia Martinez | 1750 | 1
8 | William Davis | 6800 | 2
10 | James Anderson | 4000 | 1

Example Input:

department_id | department_name
1 | Data Analytics
2 | Data Science

Example Output:

department_name | name | salary
Data Analytics | James Anderson | 4000
Data Analytics | Emma Thompson | 3800
Data Analytics | Daniel Rodriguez | 2230
Data Science | Noah Johnson | 6800
Data Science | William Davis | 6800

Write a SQL query for this question directly within the browser on DataLemur:

Top 3 Department Salaries

Answer:

We use the DENSE_RANK() window function to rank each employee's salary within their department, with higher salaries receiving lower rank numbers and tied salaries sharing the same rank. Then, we wrap this up in a CTE and keep the employees with a rank of 3 or lower.
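A minimal sketch of this approach, assuming the two input tables are named employee and department (both names are assumptions, since the text does not name them):

```sql
-- Assumption: tables are called "employee" and "department"; names are hypothetical.
WITH ranked_salaries AS (
  SELECT
    d.department_name,
    e.name,
    e.salary,
    -- Ties share a rank, so two employees on 6800 both get rank 1.
    DENSE_RANK() OVER (
      PARTITION BY e.department_id
      ORDER BY e.salary DESC
    ) AS salary_rank
  FROM employee AS e
  JOIN department AS d
    ON d.department_id = e.department_id
)
SELECT department_name, name, salary
FROM ranked_salaries
WHERE salary_rank <= 3
ORDER BY department_name, salary DESC, name;
```

DENSE_RANK() is used rather than ROW_NUMBER() so that employees with equal salaries are not arbitrarily cut off at the third row.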


If the code above is tough, you can find a detailed solution here: Top 3 Department Salaries.

SQL Question 3: How do you identify duplicated data in a table?

One way to find duplicates is to use a GROUP BY clause and then use HAVING to find groups that contain more than one row:
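A minimal sketch of the GROUP BY and HAVING approach, using a hypothetical users table with id and email columns (none of these names come from the article):

```sql
-- Hypothetical table: users(id, email).
-- Each returned row is an email value that appears more than once.
SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
```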


You could also use the EXISTS operator:
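One operator that fits here is EXISTS. The sketch below assumes the same hypothetical users table with id and email columns, where id is unique:

```sql
-- Hypothetical table: users(id, email), with id unique per row.
-- Returns every row whose email also appears on a different row.
SELECT u1.*
FROM users AS u1
WHERE EXISTS (
  SELECT 1
  FROM users AS u2
  WHERE u2.email = u1.email
    AND u2.id <> u1.id
);
```

Unlike the grouped version, this returns the full duplicated rows rather than one summary row per duplicated value.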


SQL Question 4: Calculate Moving Average Ratings

At Bukalapak, customer reviews are very important for evaluating the performance of products. These reviews include star ratings, which are used to calculate the average rating of a product. The product team would like to know the moving average rating, defined as the average of the most recent three reviews, of each product by month. Could you write a SQL query to provide the needed information?

Here is the dataset:

Example Input:
review_id | user_id | submit_date | product_id | stars
6171 | 123 | 2022-05-18 | 50001 | 4
7802 | 265 | 2022-05-22 | 69852 | 4
5293 | 362 | 2022-05-28 | 50001 | 3
6352 | 192 | 2022-06-01 | 69852 | 3
4517 | 981 | 2022-06-05 | 69852 | 2
5241 | 569 | 2022-06-10 | 50001 | 2
9025 | 742 | 2022-06-15 | 50001 | 4
6213 | 379 | 2022-06-20 | 69852 | 5

Answer:

Here is a possible PostgreSQL query:
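A minimal sketch, assuming the input table is named reviews (the name is not given above). It orders each product's reviews by submit_date ascending, so that the window of the current row plus the two preceding rows covers the three most recent reviews at that point:

```sql
-- Assumption: the input table is called "reviews"; the name is hypothetical.
WITH moving AS (
  SELECT
    product_id,
    submit_date,
    -- Average of this review and the two reviews submitted just before it.
    AVG(stars) OVER (
      PARTITION BY product_id
      ORDER BY submit_date
      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg
  FROM reviews
)
SELECT
  DATE_TRUNC('month', submit_date)::date AS review_month,
  product_id,
  ROUND(AVG(moving_avg)::numeric, 2) AS avg_moving_rating
FROM moving
GROUP BY 1, 2
ORDER BY 1, 2;
```

For the first one or two reviews of a product the window holds fewer than three rows, so the average is taken over whatever is available.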


In this query, we first order the reviews by submission date within each product. Next, a moving average is calculated for each review from the stars of that review and the two reviews before it. Finally, we average those moving averages for each product per month and round the result to two decimal places.

p.s. Window functions show up super often during SQL interviews, so practice the 27+ window function questions on DataLemur

SQL Question 5: What's the difference between a correlated and non-correlated sub-query?

A correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query. On the other hand, a non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query.

An example correlated sub-query:
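A minimal sketch, assuming a table named bukalapak_employees with name, salary and department columns (the table name is an assumption; the e1 and e2 aliases follow the explanation that comes next):

```sql
-- Assumption: table "bukalapak_employees" (name, salary, department) is hypothetical.
SELECT e1.name, e1.salary
FROM bukalapak_employees AS e1
WHERE e1.salary > (
  -- Re-evaluated for each outer row, because it references e1.department.
  SELECT AVG(e2.salary)
  FROM bukalapak_employees AS e2
  WHERE e2.department = e1.department
);
```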


This correlated subquery retrieves the names and salaries of Bukalapak employees who make more than the average salary for their department. The subquery references the department column in the main query's FROM clause (e1.department) and uses it to filter the rows of the subquery's FROM clause (e2.department).

An example non-correlated sub-query:
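A minimal sketch, again assuming a hypothetical bukalapak_employees table with name, salary and department columns:

```sql
-- Assumption: table "bukalapak_employees" (name, salary, department) is hypothetical.
SELECT name, salary
FROM bukalapak_employees
WHERE salary > (
  -- Runs on its own: it references nothing from the outer query.
  SELECT AVG(salary)
  FROM bukalapak_employees
  WHERE department = 'Data Analytics'
);
```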


This non-correlated subquery retrieves the names and salaries of Bukalapak employees who make more than the average salary for the Data Analytics department (which honestly should be very few people since Data Analysts are awesome).

Anyway, the subquery is independent of the main query and can be considered a standalone query. Its output (the average salary for the Data Analytics department) is then used in the main query to filter the rows of the table.

SQL Question 6: Find the Average Transaction Value per Seller for a Particular Month

As a data analyst at Bukalapak, an Indonesian eCommerce platform, you are tasked with analyzing the transactions conducted by each seller. Please write a SQL query to find the average transaction value per seller for the month of June.

Assume there is a table with the following data for this problem:

Example Input:
transaction_id | seller_id | transaction_date | product_id | transaction_value
1 | 100 | 06/01/2022 | 1001 | 500
2 | 100 | 06/02/2022 | 1002 | 550
3 | 200 | 06/03/2022 | 2001 | 400
4 | 200 | 06/04/2022 | 2002 | 450
5 | 100 | 07/01/2022 | 1003 | 600
6 | 200 | 07/02/2022 | 2003 | 500

Answer:

In PostgreSQL, the query would be:
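A minimal sketch, assuming the table is named transactions (the name is not given above) and that transaction_date is stored as a DATE or TIMESTAMP:

```sql
-- Assumption: the input table is called "transactions"; the name is hypothetical.
SELECT
  seller_id,
  AVG(transaction_value) AS avg_transaction_value
FROM transactions
WHERE DATE_PART('month', transaction_date) = 6  -- keep June only
GROUP BY seller_id;
```

On the sample rows this gives averages of 525 for seller 100 and 425 for seller 200. Note that the filter matches June of any year; if the table spans several years, a year condition would also be needed.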


This query will calculate the average transaction value per seller only for transactions conducted in the month of June. The WHERE clause is used to filter out transactions from other months, and the DATE_PART function is used to extract the month from the transaction date. The GROUP BY clause is used to group the transactions by seller, and the AVG function is used to calculate the average transaction value for each seller.

To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for analyzing transactions grouped by sellers or this Stripe Repeated Payments Question which is similar for understanding repeated transactions details.

SQL Question 7: What are stored procedures, and why are they useful?

Stored procedures in SQL are like recipes in a cookbook. Just like a recipe tells you the ingredients and instructions for making a particular dish, a stored procedure tells the DBMS the logic/statements needed to perform a specific task. Just like you can use a recipe to make the same dish over and over again, you can use a stored procedure to repeat the same task multiple times with different input parameters (which is why stored procedures are so damn useful!).

Say you were a Data Analyst working on an HR analytics project. A common sub-task you might have to do is calculate the average salary for a given department at Bukalapak, which would be perfect for a stored procedure:
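A minimal sketch in PostgreSQL, where a routine that returns a value is written as a function rather than a procedure. The function name get_avg_salary, the table name bukalapak_employees, and its salary and department columns are assumptions for illustration:

```sql
-- Assumptions: get_avg_salary and bukalapak_employees are hypothetical names.
CREATE OR REPLACE FUNCTION get_avg_salary(dept_name TEXT)
RETURNS NUMERIC AS $$
BEGIN
  -- The parameter is named dept_name so it cannot clash with the department column.
  RETURN (
    SELECT AVG(salary)
    FROM bukalapak_employees
    WHERE department = dept_name
  );
END;
$$ LANGUAGE plpgsql;
```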


To call this stored procedure and find the average salary for the Data Science department you'd execute a query like this:
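A minimal sketch of the call, assuming the routine is a PostgreSQL function named get_avg_salary (a hypothetical name) that takes the department name as its only argument:

```sql
-- Assumption: a function get_avg_salary(TEXT) has already been created.
SELECT get_avg_salary('Data Science');
```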


SQL Question 8: Calculate the Click Through Conversion Rate for Bukalapak

You are a data analyst at Bukalapak. The Marketing Team wants to understand the effectiveness of digital advertising on the site. Specifically, they would like to know the click-through conversion rates from users viewing a product to adding a product to the cart.

Here are two tables:

Example Input:

view_id | user_id | view_date | product_id
6191 | 123 | 06/08/2022 00:00:00 | 50003
7815 | 265 | 06/11/2022 00:00:00 | 69876
5329 | 362 | 06/19/2022 00:00:00 | 50003
6365 | 192 | 07/27/2022 00:00:00 | 69876
4528 | 981 | 07/06/2022 00:00:00 | 69876

Example Input:

add_id | user_id | add_date | product_id
5120 | 123 | 06/08/2022 00:02:00 | 50003
6877 | 265 | 06/11/2022 00:03:00 | 69876
5091 | 192 | 07/26/2022 00:01:00 | 69876

Write a query that calculates the click-through conversion rate from users viewing a product to adding the product to cart in the month of June.

Answer:
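A minimal sketch, assuming the two input tables are named product_views and cart_adds (both names are hypothetical, since the text does not name them) and taking the year 2022 from the sample data:

```sql
-- Assumptions: "product_views" and "cart_adds" are hypothetical table names;
-- the year 2022 is taken from the sample rows.
WITH views AS (
  SELECT product_id, COUNT(DISTINCT user_id) AS total_views
  FROM product_views
  WHERE view_date >= DATE '2022-06-01'
    AND view_date <  DATE '2022-07-01'
  GROUP BY product_id
),
adds AS (
  SELECT product_id, COUNT(DISTINCT user_id) AS total_adds
  FROM cart_adds
  WHERE add_date >= DATE '2022-06-01'
    AND add_date <  DATE '2022-07-01'
  GROUP BY product_id
)
SELECT
  v.product_id,
  -- Cast to numeric so the division is not truncated to an integer.
  COALESCE(a.total_adds, 0)::numeric / v.total_views AS conversion_rate
FROM views AS v
LEFT JOIN adds AS a
  ON a.product_id = v.product_id;
```

The LEFT JOIN keeps products that were viewed but never added to a cart, reporting them with a rate of 0. On the sample rows the rate is 0.5 for product 50003 and 1 for product 69876.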


In this solution:

  • We create two CTEs (Common Table Expressions) named 'views' and 'adds'.
  • 'views' calculates how many distinct users viewed the products in June.
  • 'adds' calculates how many distinct users added the products to cart in June.
  • We then join these two CTEs on product_id and calculate the ratio of adds to views to get the conversion rate.

To solve a similar problem about calculating rates, try this TikTok SQL Interview Question within DataLemur's interactive coding environment: Signup Activation Rate SQL Question

SQL Question 9: Calculate Total Sales per Seller per Month

Bukalapak is an Indonesia-based e-commerce company. We have a table called 'sales' which records each sale transaction made on the platform. This table contains seller_id (the identifier of the seller), transaction_id (the unique identifier of the transaction), transaction_date (the date of the transaction) and transaction_amount (the amount of each transaction). Your task is to write a SQL query that outputs a table showing the total sales made by each seller per month for the year 2023.

Example Input:

transaction_id | transaction_date | seller_id | transaction_amount
A98 | 01/15/2023 | BX9 | 500
C43 | 01/22/2023 | AD5 | 600
Z45 | 02/12/2023 | BX9 | 700
S21 | 02/28/2023 | AD5 | 900
H65 | 03/11/2023 | AD5 | 400

Expected Output:

month | seller | total_sales
1 | BX9 | 500
1 | AD5 | 600
2 | BX9 | 700
2 | AD5 | 900
3 | AD5 | 400

Answer:

Here's a SQL query that would solve the problem:
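A minimal sketch against the sales table described above, assuming transaction_date is stored as a DATE or TIMESTAMP:

```sql
-- Assumption: transaction_date is a DATE or TIMESTAMP column.
SELECT
  EXTRACT(MONTH FROM transaction_date) AS month,
  seller_id AS seller,
  SUM(transaction_amount) AS total_sales
FROM sales
WHERE EXTRACT(YEAR FROM transaction_date) = 2023  -- keep 2023 only
GROUP BY EXTRACT(MONTH FROM transaction_date), seller_id
ORDER BY month;
```

The ORDER BY sorts by month only, so the order of sellers within a month is not guaranteed.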


This query first filters for rows from the year 2023. It then extracts the month from the transaction date and groups by it along with the seller_id, using the SUM function to calculate the total transaction amount for each group. The result is the total sales per seller per month in 2023.

How To Prepare for the Bukalapak SQL Interview

The best way to prepare for a Bukalapak SQL interview is to practice, practice, practice. Beyond just solving the earlier Bukalapak SQL interview questions, you should also solve the 200+ DataLemur interview questions which come from FAANG companies and tech startups.

Each DataLemur SQL question has hints to guide you, fully explained answers along with a discussion board to see how others solved it and best of all, there is an interactive coding environment so you can instantly run your SQL query and have it checked.

To prep for the Bukalapak SQL interview it is also useful to practice SQL questions from other tech companies like:

In case your SQL query skills are weak, forget about diving straight into solving questions – strengthen your SQL foundations with this DataLemur SQL tutorial.

Interactive SQL tutorial

This tutorial covers SQL concepts such as handling timestamps and filtering on multiple conditions using AND/OR/NOT – both of these come up often during SQL job interviews at Bukalapak.

Bukalapak Data Science Interview Tips

What Do Bukalapak Data Science Interviews Cover?

In addition to SQL interview questions, the other types of problems tested in the Bukalapak Data Science Interview are:

  • Probability & Stats Questions
  • Python or R Coding Questions
  • Analytics and Product-Metrics Questions
  • ML Interview Questions
  • Resume-Based Behavioral Questions

Bukalapak Data Scientist

How To Prepare for Bukalapak Data Science Interviews?

I'm a tad biased, but I think the best way to study for Bukalapak Data Science interviews is to read the book I wrote: Ace the Data Science Interview.

The book solves 201 data interview questions taken from Microsoft, Amazon & startups. It also has a crash course covering SQL, Product-Sense & ML. And finally it's helped thousands of people land their dream job in data, which is how it's gotten to be an Amazon best-seller with 1000+ 5-star reviews.

Ace the Data Science Interview