At Kakao, SQL is often used to analyze user behavior data for targeted marketing and to dig into KakaoTalk product analytics data. That's why Kakao frequently asks SQL query questions during interviews for Data Science and Data Engineering positions.
To help you prepare for the Kakao SQL interview, this blog covers 10 Kakao SQL interview questions – how many can you solve?
As a representative for Kakao, you have been supplied with a dataset providing detailed information about the usage habits of your users over the past month. The dataset is contained in a PostgreSQL table named 'user_activity'. Given below is a sample of the data:
activity_id | user_id | activity_date | service_id | duration_seconds |
---|---|---|---|---|
7241 | 1001 | 2022-08-01 08:30:00 | K01 | 1800 |
8347 | 1002 | 2022-08-01 09:00:00 | K02 | 2400 |
6025 | 1001 | 2022-08-01 15:30:00 | K01 | 1800 |
7218 | 1002 | 2022-08-02 08:00:00 | K02 | 3600 |
8360 | 1003 | 2022-08-02 10:00:00 | K01 | 1200 |
Your task is to write a SQL query that finds, for each service and each day, the top 3 users with the longest total usage duration. You will need to make use of SQL window functions to solve this problem.
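One way to write this in PostgreSQL is sketched below. It works against the user_activity table shown above and derives the day by casting the activity_date timestamp to a date:

```sql
WITH daily_totals AS (
    -- total duration per user, per service, per day
    SELECT DISTINCT
           activity_date::date AS usage_date,
           service_id,
           user_id,
           SUM(duration_seconds) OVER (
               PARTITION BY activity_date::date, service_id, user_id
           ) AS total_duration
    FROM user_activity
),
ranked AS (
    -- rank users within each (day, service) partition by total duration
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY usage_date, service_id
               ORDER BY total_duration DESC
           ) AS usage_rank
    FROM daily_totals
)
SELECT usage_date, service_id, user_id, total_duration
FROM ranked
WHERE usage_rank <= 3
ORDER BY usage_date, service_id, usage_rank;
```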
In the answer, we use window functions to compute the total duration for each user per day for each service. Each window is defined by the PARTITION BY clause, which splits the table into partitions of rows sharing the same activity date and service_id.
Then the SUM function computes the total duration of use for each user_id within each partition. The ROW_NUMBER function assigns a unique rank to each row within a partition, ordered by the total duration in descending order.
Finally, we filter rows with a rank no higher than 3 (top 3). The result will be the top 3 users with the longest total duration of usage for each service each day.
To practice a related window function SQL problem on DataLemur's free interactive coding environment, try this Amazon BI Engineer interview question:
Kakao Corporation provides many services like messaging (KakaoTalk), music streaming (KakaoMusic), ride-hailing (KakaoT), and a digital wallet (KakaoPay). These services can affect each other's engagement. For instance, users who chat about a newly released album on KakaoTalk might decide to listen to it later on KakaoMusic.
You've been given two tables: one that logs each user interaction with a service, and one that maps each service_id to a service name. Their schemas are shown below:
interaction_id | user_id | service_id | interaction_date | interaction_duration(mins) |
---|---|---|---|---|
001 | 123 | A | 06/08/2022 | 10 |
002 | 265 | B | 06/10/2022 | 5 |
003 | 362 | A | 06/18/2022 | 15 |
004 | 192 | C | 07/26/2022 | 20 |
005 | 981 | B | 07/05/2022 | 5 |
service_id | service_name |
---|---|
A | KakaoTalk |
B | KakaoMusic |
C | KakaoT |
D | KakaoPay |
Assume all interactions are unique. Write a SQL query to find the average interaction duration for each service.
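A minimal sketch in PostgreSQL, assuming the two tables are named interactions and services and that the duration column is simply named interaction_duration:

```sql
SELECT s.service_name,
       AVG(i.interaction_duration) AS avg_duration_mins
FROM interactions AS i
JOIN services AS s
  ON i.service_id = s.service_id
GROUP BY s.service_name
ORDER BY avg_duration_mins DESC;
```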
This query first joins the interactions table with the services table on the service_id field. The query then groups the data by service_name. For each service grouping, it calculates the average duration of all interactions related to that service, providing useful insight into which services are more engaging to users.
The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.
For example, say you had stored some data on Facebook ad campaigns that Kakao ran:
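A sketch of what such a table might look like; the ad_campaigns table and its columns here are purely illustrative:

```sql
CREATE TABLE ad_campaigns (
    campaign_id   INTEGER PRIMARY KEY,   -- unique identifier for each campaign
    campaign_name VARCHAR(255),
    start_date    DATE,
    end_date      DATE,
    budget        NUMERIC(12, 2)
);
```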
The campaign_id column uniquely identifies each row in the table, and the PRIMARY KEY constraint ensures that no two rows have the same campaign_id. This helps maintain the integrity of the data in the table by preventing duplicate rows.
The primary key is also an important part of the table because it allows you to easily identify and reference specific campaigns in your Facebook Ad data. You can use it to join to other tables in the database, such as a table containing data on the results of the campaigns.
Kakao is interested in finding out which of their users are the most active on the platform. They would like a list of users who have sent more than 500 messages in the last 3 months and are part of at least 10 different chat groups.
Please use the following tables for this analysis:
message_id | user_id | send_date | group_id | content |
---|---|---|---|---|
6181 | 541 | 12/15/2022 | 7032 | Hello everyone |
9842 | 681 | 12/16/2022 | 4321 | What's the update on the project? |
5393 | 541 | 12/18/2022 | 7032 | We have a meeting at 5pm today |
8666 | 982 | 12/20/2022 | 8901 | Done with the task |
5758 | 541 | 12/22/2022 | 4321 | Here is the presentation for tomorrow |
user_id | group_id |
---|---|
541 | 7032 |
681 | 4321 |
982 | 8901 |
541 | 4321 |
541 | 8796 |
Write an SQL query to find all users who have sent more than 500 messages in the last 3 months and are part of at least 10 different chat groups.
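One way to write this in PostgreSQL is sketched below. The table names messages and group_members are assumptions, and "the last 3 months" is measured relative to the current date:

```sql
SELECT m.user_id
FROM messages AS m
JOIN group_members AS g
  ON m.user_id = g.user_id
WHERE m.send_date >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY m.user_id
HAVING COUNT(DISTINCT m.message_id) > 500
   AND COUNT(DISTINCT g.group_id) >= 10;
```

Because the join repeats each message row once per group membership, both conditions use COUNT(DISTINCT ...) so the counts aren't inflated by the join.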
This query joins the messages table with the group membership table on user_id. It then filters to only the messages sent in the last 3 months. The GROUP BY clause groups the data by user_id, and the HAVING clause keeps only the users who have sent more than 500 messages and are part of at least 10 different groups.
Database denormalization is when you add redundancy to a database and break typical normalization rules (specified by the 1st, 2nd, and 3rd normal forms). There are a few reasons to denormalize a database:
Improved performance: Denormalization often reduces the number of costly join operations needed to retrieve data. This is helpful when the database is used for OLAP (Online Analytical Processing) use cases, since joins can be expensive and slow.
Simplification: Denormalization can also be used to simplify the design of a database by reducing the number of tables and relationships that need to be managed. This can make it easier to understand and maintain the database.
Ease of use: Denormalization can also make it easier for users to work with a database by providing them with a more intuitive and straightforward data model.
While denormalization can be a useful tool for improving performance and scalability, it's important to keep in mind that it can make update and delete operations more complex. This is because denormalization can create data duplicates, which can make it harder to maintain data integrity. So, to avoid any headaches, it's a good idea to start with a well-normalized database design and then consider denormalization only if it's absolutely necessary for your specific performance and scalability needs. In other words, denormalization can be a bit of a wild card, so it's best to handle it with care!
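For instance, here's a contrived sketch (illustrative table names, not Kakao's actual schema) of denormalizing a customer's city onto a purchases table so that a city-level revenue report can skip the join entirely:

```sql
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    city        VARCHAR(100)
);

CREATE TABLE purchases (
    purchase_id   INTEGER PRIMARY KEY,
    customer_id   INTEGER REFERENCES customers (customer_id),
    price         NUMERIC(12, 2),
    customer_city VARCHAR(100)  -- denormalized copy of customers.city
);

-- The reporting query no longer needs a join:
SELECT customer_city, SUM(price) AS total_revenue
FROM purchases
GROUP BY customer_city;
```

The trade-off: whenever a customer moves, customer_city has to be updated in both tables, which is exactly the kind of integrity risk described above.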
Kakao sells various digital products, and each product has multiple transactions every day. Can you write a SQL query to find the average monthly revenue of each product, based on the data in the "transactions" table?
Here is the sample data:
transaction_id | product_id | transaction_date | revenue |
---|---|---|---|
6171 | 50001 | 06/08/2022 00:00:00 | 500 |
7802 | 69852 | 06/10/2022 00:00:00 | 700 |
5293 | 50001 | 06/18/2022 00:00:00 | 550 |
6352 | 69852 | 07/26/2022 00:00:00 | 710 |
4517 | 69852 | 07/05/2022 00:00:00 | 450 |
product | month | avg_revenue |
---|---|---|
50001 | 6 | 525.00 |
69852 | 6 | 700.00 |
69852 | 7 | 580.00 |
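One way to write this in PostgreSQL is sketched below, using the transactions table shown above (ROUND is applied just to match the two-decimal formatting of the expected output):

```sql
SELECT product_id AS product,
       EXTRACT(MONTH FROM transaction_date) AS month,
       ROUND(AVG(revenue)::numeric, 2) AS avg_revenue
FROM transactions
GROUP BY product_id, EXTRACT(MONTH FROM transaction_date)
ORDER BY product, month;
```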
In this query, we use GROUP BY along with the AVG function, as the question asks. AVG computes the average 'revenue' for each 'product_id' in each month, and the EXTRACT function pulls the month out of 'transaction_date'. The result is ordered by 'product' and 'month'.
Database normalization is helpful because it improves the performance, flexibility, and scalability of a database. Some specific benefits of normalization include:
Reducing Redundancy: Normalization helps to minimize redundancy by breaking down a larger general table into smaller, more specific tables. This can often reduce the amount of data that needs to be accessed for particular queries.
Improving Data Integrity: Normalization helps to ensure the integrity of the data by minimizing the risk of data inconsistencies and anomalies. It does this by establishing clear relationships between the tables via primary and foreign keys.
Increasing Flexibility: Normalization makes it easier to modify the structure of the database, as there is less redundancy, so it allows you to make changes to one table without affecting others. This makes it easier to adapt the database to changing business needs (and change is basically constant at Kakao!)
Improving Performance: Normalization can improve the performance of the database by reducing the amount of data that needs to be stored and accessed in a single table. This can result in faster query times and better overall performance.
Easier Data Modeling: Normalization can make it easier to model the data in a logical and intuitive way, since less random data is commingled together, which improves the overall database design.
You're working with Kakao's customer records and you're asked to find all the users who live in Seoul and have an '@kakao.com' email address. For this problem, consider the following table schema for the 'users' table.
user_id | email | location |
---|---|---|
123 | john@kakao.com | Seoul |
265 | mary@gmail.com | Busan |
362 | paul@kakao.com | Seoul |
192 | linda@kakao.com | Daegu |
981 | kim@kakao.com | Seoul |
Write a SQL query that will filter down the 'users' table and return all records with a location of 'Seoul' and an email address that ends with '@kakao.com'.
Hint: Use the SQL keyword LIKE in your where clause to match the specific string pattern for the email address.
Here is a PostgreSQL query that can be used to solve this problem (sketched assuming the email column is simply named email):
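```sql
SELECT *
FROM users
WHERE location = 'Seoul'
  AND email LIKE '%@kakao.com';
```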
This query will return all user records where the location is 'Seoul' and the email address ends with '@kakao.com'. The '%' symbol in the LIKE clause acts as a wildcard that matches any sequence of characters. Thus, '%@kakao.com' will match any string that ends with '@kakao.com'.
Kakao wants to analyze customer purchases and the total revenue generated per city to create a growth strategy. Currently, all user data and purchase activity is stored in two separate tables: users and orders. The users table includes a unique user_id and the residing city of each user. The orders table stores all purchase data, with each order having a unique order_id, the user_id of the customer who made the purchase, the purchase_date, and the price of each purchase.
Write a SQL query to find the total revenue per city in June 2022.
user_id | city |
---|---|
123 | Seoul |
265 | Busan |
362 | Daegu |
192 | Incheon |
981 | Seoul |
order_id | user_id | purchase_date | price |
---|---|---|---|
5041 | 123 | 06/08/2022 00:00:00 | 50000 |
6712 | 265 | 06/10/2022 00:00:00 | 69800 |
3923 | 362 | 06/18/2022 00:00:00 | 85000 |
7352 | 192 | 07/26/2022 00:00:00 | 64800 |
8157 | 981 | 07/05/2022 00:00:00 | 68000 |
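One way to write this in PostgreSQL is sketched below; the table names users and orders are assumptions based on the description above:

```sql
SELECT u.city,
       SUM(o.price) AS total_revenue
FROM orders AS o
JOIN users AS u
  ON o.user_id = u.user_id
WHERE EXTRACT(YEAR FROM o.purchase_date) = 2022
  AND EXTRACT(MONTH FROM o.purchase_date) = 6
GROUP BY u.city
ORDER BY total_revenue DESC;
```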
This query first joins the users and orders tables using user_id. It then filters the data to only include orders made in June 2022 using the EXTRACT function on purchase_date. Finally, it groups by city and calculates the total revenue for each city using the SUM function.
Because joins come up so often during SQL interviews, try this interactive Snapchat JOIN SQL interview question:
While both types of indexes improve the performance of SQL queries by providing a faster way to lookup rows of data, a unique index enforces the uniqueness of the indexed columns while a non-unique index allows duplicate values in the indexed columns.
Suppose you had a table of Kakao employees. Here's an example of a unique index on the employee ID column (the table and column names below are illustrative):
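```sql
-- Illustrative names: a kakao_employees table with an employee_id column
CREATE UNIQUE INDEX idx_employee_id
ON kakao_employees (employee_id);
```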
This index would ensure that no two Kakao employees have the same employee_id, which could be used as a unique identifier for each employee.
Here's an example of a non-unique index on the job title column:
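```sql
-- Again with illustrative names: a non-unique index on job_title
CREATE INDEX idx_job_title
ON kakao_employees (job_title);
```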
This index would not enforce uniqueness, but it could be used to improve the performance of queries that filter or sort the data based on the job_title column. For example, if you want to quickly retrieve all Data Scientists, the database can use the index to efficiently locate the desired records without having to do a full table scan of all Kakao employees.
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Besides solving the above Kakao SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Microsoft, Google, Amazon, and tech startups.
Each interview question has hints to guide you, fully explained answers, and a discussion board to see how others solved it. Best of all, there's an interactive SQL code editor so you can write your SQL query answer right in the browser and have it checked.
To prep for the Kakao SQL interview, it would also be wise to solve SQL problems from other tech companies like:
But if your SQL skills are weak, don't worry about going right into solving questions – improve your SQL foundations with this SQL interview tutorial.
This tutorial covers SQL concepts such as WHERE with AND/OR/NOT and INTERSECT/EXCEPT – both of these topics pop up routinely in SQL job interviews at Kakao.
In addition to SQL interview questions, the other question categories covered in the Kakao Data Science Interview are:
The best way to prepare for Kakao Data Science interviews is by reading Ace the Data Science Interview. The book's got: