logo

11 CyberAgent SQL Interview Questions (Updated 2024)

Updated on

March 3, 2024

At CyberAgent, SQL is used day-to-day for extracting and analyzing ad performance data, and managing user information for personalized ad targeting, essential in the digital advertising industry. So, it shouldn't surprise you that CyberAgent typically asks SQL coding questions during interviews for Data Science, Data Engineering and Data Analytics jobs.

To help you study for the CyberAgent SQL interview, we've curated 11 CyberAgent SQL interview questions – can you solve them?

11 CyberAgent SQL Interview Questions

SQL Question 1: Finding Top Active Users for CyberAgent

CyberAgent is a leading global online advertising agency, and activity analysis is critical. Suppose your role here is a data analyst. CyberAgent has a database table "user_activity", which logs all user activities. However, you are specifically interested in a special group of users - those who have logged in more than 25 times in the past month. Your task is to write a SQL query to identify these "power users".

Example Input:
activity_iduser_idactivity_typeactivity_date
1101"login"2022-09-01
2102"login"2022-09-01
3101"login"2022-09-02
4103"view_ad"2022-09-02
5102"login"2022-09-03
6104"login"2022-09-04
7101"login"2022-09-04
............

Answer:


This SQL block selects the user_id and counts the number of login activities for each user in the past month. It then filters out only those users who have logged in more than 25 times. These are the 'power users' for the CyberAgent network.

To solve a similar VIP customer analysis question on DataLemur's free interactive coding environment, try this Microsoft SQL Interview problem: Microsoft SQL Interview Question: Super Cloud Customer

SQL Question 2: Calculate Monthly Average Rating count for each product

CyberAgent would like to better understand their product performance by analyzing the monthly average ratings for each product.

For each product, Could you please write a SQL query to calculate the average rating for each product on a monthly basis. The month should be derived from the , it will be scored in the format of YYYYMM.

The table is formatted as below,

Example Input:
review_iduser_idsubmit_dateproduct_idstars
617112306/08/2022 00:00:00500014
780226506/10/2022 00:00:00698524
529336206/18/2022 00:00:00500013
635219207/26/2022 00:00:00698523
451798107/05/2022 00:00:00698522

We hope the output report shows up as below,

Example Output:
monthproductavg_stars
202206500013.50
202206698524.00
202207698522.50

Answer:


In the above SQL query, we first convert into the YYYYMM format using PostgreSQL function. Then, we group the data by and , and we calculate the average for each group. Finally, we order the result by and to get the required output.

Pro Tip: Window functions are a popular SQL interview topic, so practice every window function problem on DataLemur

DataLemur SQL Questions

SQL Question 3: What is database denormalization?

Denormalization is the practice of altering a database schema in a way that breaks the normalization rules (1NF, 2NF, 3NF, etc.).

This is typically done to improve the performance of the database when it is being used for online analytics processing (OLAP), rather than online transaction processing (OLTP) use cases.

Denormalization can improve query performance by reducing the number of expensive joins required to retrieve data, but it comes with it's own drawbacks too. By adding redundant columns, you incur more data storage costs, and there's need for more complex update and delete operations in order to maintain data integrity across all the duplicated data. Thus, it's important to carefully consider the trade-offs involved before implementing denormalization.

CyberAgent SQL Interview Questions

SQL Question 4: Filter Client Data

Given the and SQL tables, you're required to write a query to filter down the client records that reflect a transaction-sum of over 10000 in the year 2022, AND indicate if the client primarily communicates in English.

Table:

client_idnameprimary_language
101John SmithEnglish
102Marie GomezSpanish
103Tom RichardsonEnglish
104Ana PerezEnglish
105Hiroshi TanakaJapanese

Table:

trans_idclient_idtransaction_dateamount
200110101/15/20225000
200210202/17/20222000
200310103/30/20226000
200410404/25/20224000
200510305/01/20227000

Answer:


In this query, we are joining clients and transactions tables on . We only include the records where lies in the year 2022 and of the client is English. Using group by, we sum up the transaction amounts client-wise. The Having clause then filters those clients for whom this sum is more than 10000.

SQL Question 5: Describe the difference between UNION and UNION ALL.

The operator merges the output of two or more statements into a single result set. It ignores duplicated rows, and makes sure each row in the result set is unique.

For a concrete example, say you were a Data Analyst at CyberAgent working on a Marketing Analytics project. If you needed to get the combined result set of both CyberAgent's Google and Facebook ads you could execute this SQL query:


The operator works in a similar way to combine data from multiple statements, but it differs from the operator when it comes to handling duplicate rows. Whereas filters out duplicates (so if the same ad_name is run on both Facebook and Google, it only shows up once), outputs duplicate rows.

SQL Question 6: Find the Average Duration of Ad Campaigns per Client

Suppose in CyberAgent, your task is to find out the average duration in days of ad campaigns for each client. For this purpose, make sure the and are both included in the duration calculation. The company maintains a table called with the following structure and some example data:

Example Input:
campaign_idclient_idstart_dateend_date
101200107/01/202207/07/2022
102200207/02/202207/08/2022
103200107/10/202207/20/2022
104200107/22/202207/29/2022
105200207/15/202207/18/2022

We want output in this form:

Example Output:
client_idavg_duration
20018.667
20024.5

The average duration should be rounded up to three decimal places.

Answer:


In this SQL query, we used the GROUP BY clause to group our data by clients (client_id). For each group, we calculated the difference between end_date and start_date for every campaign and added 1 to include the end date. The average of those durations was then calculated using the AVG aggregate function. To make it more readable, we rounded the result to three decimal places using the ROUND function.

SQL Question 7: What's a cross-join, and why are they used?

A cross-join, also known as a cartesian join, is a type of JOIN operation in SQL that creates a new table by combining each row from the first table with every row from the second table. This results in a table with a row for every possible combination of rows from the two input tables.

An example of when this might be useful is if you wanted to first make a dataset containing all possible pairs of customers and products data, in order to later build a Machine Learning model to predict the probability of a customer purchasing a particular product.


However, it is important to note that cross-joins can create very large tables, especially if the input tables have a large number of rows. For example, if you had 10,000 customers and 5,000 different product SKUs, the resulting cross-join would have 50 million rows.

SQL Question 8: Filter Customer Records with LIKE Keyword

As a data analyst at CyberAgent, your task is to create a SQL query that would allow us to filter all of the customers whose names begin with the string 'Cyb'. Given the table 'customers', how would you structure your SQL query?

Example Input:

customer_idnamesign_up_dateproduct_purchasedamount_spent
1001CyberGuru06/08/2022GameX500
1002DigitalAgent06/10/2022AppY200
1003CyberNinja06/18/2022GameX700
1004WebMaster07/26/2022AppY300
1005CodeWarrior07/05/2022GameX600

Example Output:

customer_idnamesign_up_dateproduct_purchasedamount_spent
1001CyberGuru06/08/2022GameX500
1003CyberNinja06/18/2022GameX700

Answer:


The character is used as a wildcard in SQL and matches any sequence of characters. Therefore this query would return any customers whose names start with the string 'Cyb'.

SQL Question 9: Retrieve Customer Behaviour based on Purchase History and Login Activity

We have two tables - and . The table records the purchase history of customers with fields , , , and . The table captures login activity of users with fields , , , and .

Write a SQL query to analyze these two databases such that it fetches customer ID (), total number of products purchased (), total amount spent (), the most recent purchase date (), and the number of login sessions () for each customer.

Given the following sample data:

Example Input:
user_idproduct_idpurchase_datepurchase_amount
10030001/05/2022150
10020003/07/2022250
10130004/09/2022100
10140002/06/2022200
10250001/07/2022400
Example Input:
user_idsession_idlogin_datelogout_time
100600128/03/202230/03/2022
100600207/04/202210/04/2022
101600313/02/202215/02/2022
102600409/01/202211/01/2022
102600506/02/202208/02/2022

Answer:


This PostgreSQL query joins and tables on and aggregates the data per to get the desired output. By using we get and , gives us and provides for each user. The ensures that we don't miss any customers from the table even if they do not have matching entries in the table.

Because joins come up so often during SQL interviews, practice this interactive Snapchat Join SQL question: Snapchat SQL Interview question using JOINS

SQL Question 10: What's the SQL command do, and when would you use it?

If you have two tables and want to retrieve only the rows that are present in both tables, just use the operator!

For example, let's use to find all of CyberAgent's Facebook video ads that are also being run on YouTube:


SQL Question 11: Computing the weighted rating for users' game scores

A game developing department at CyberAgent regularly collects scores/ratings from users who play their games. The team wants to calculate the weighted score to make the scoring more accurate, knowing that the weight of each score is the square root of its frequency.

Use the table which contains users' game scores as shown below:

Example Input
user_idgame_idscore
110070
110180
210090
210090
2101100
310080
310170
410060
410150

Assume that each game can be played multiple times by different users and users can play different games. The task is to write a SQL query that calculates the weighted average score for each game.

The weighted score is calculated as follows: , where the frequency is the square root of the number of times the score appears for a game.

Example Output:
game_idweighted_score
10074.37
10175.00

Answer:

Here is a PostgreSQL query that would solve this task:


This query starts with a Common Table Expression (CTE) that groups the table by and and computes the frequency of each score for each game.

The next CTE, , calculates the weighted score which is the product of and square root of its , and also calculates the weight which is the square root of the for each and .

Finally, we group the by and calculate the weighted score using the formula and round it to two decimal places. The clause is used to sort the result by .

To practice a very similar question try this interactive Alibaba Compressed Mean Question which is similar for computation based on frequency or this Alibaba Compressed Mode Question which is similar for analysis based on frequency.

CyberAgent SQL Interview Tips

The key to acing a CyberAgent SQL interview is to practice, practice, and then practice some more! In addition to solving the earlier CyberAgent SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like Netflix, Google, and Amazon. DataLemur SQL Interview Questions

Each problem on DataLemur has multiple hints, step-by-step solutions and best of all, there's an online SQL coding environment so you can easily right in the browser your SQL query answer and have it graded.

To prep for the CyberAgent SQL interview it is also useful to practice SQL questions from other tech companies like:

But if your SQL query skills are weak, don't worry about jumping right into solving questions – strengthen your SQL foundations with this SQL interview tutorial.

SQL interview tutorial

This tutorial covers SQL concepts such as filtering strings based on patterns and Subqueries – both of which pop up often in CyberAgent SQL assessments.

CyberAgent Data Science Interview Tips

What Do CyberAgent Data Science Interviews Cover?

In addition to SQL query questions, the other types of problems tested in the CyberAgent Data Science Interview are:

CyberAgent Data Scientist

How To Prepare for CyberAgent Data Science Interviews?

The best way to prepare for CyberAgent Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from tech companies like Google & Microsoft
  • A Refresher on Stats, SQL & ML
  • Great Reviews (900+ 5-star reviews on Amazon)

Ace the Data Science Interview by Nick Singh Kevin Huo