At Perion Network, SQL does the heavy lifting for managing and analyzing advertising data, and for extracting insights from user behavior patterns across digital media platforms. That's why Perion Network often tests SQL query questions in interviews for Data Science, Data Engineering and Data Analytics jobs.
Thus, to help you practice, here are 11 Perion Network SQL interview questions – can you answer each one?
At Perion Network, power users are defined as customers who have made advertising purchases of at least $1,000 per month for three consecutive months. We have two tables with relevant data: the users table holds the unique identification for each user, and the purchases table keeps track of the product purchases, their cost, and the timestamps of the purchases made by each user.
Write a PostgreSQL query that selects the power users from the database and provides a summary list of power users, including the total spending per user per month for the qualifying three months.
user_id | user_name |
---|---|
101 | JohnDoe |
102 | JaneSmith |
103 | AliceJohnson |
purchase_id | user_id | timestamp | product_id | purchase_amount |
---|---|---|---|---|
2345 | 101 | 2022-01-12 10:00:00 | 777 | 540 |
2346 | 102 | 2022-02-14 14:00:00 | 888 | 450 |
2347 | 103 | 2022-02-20 16:00:00 | 777 | 400 |
2348 | 101 | 2022-02-28 17:00:00 | 888 | 700 |
2349 | 101 | 2022-03-15 11:00:00 | 777 | 600 |
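One way to write this, assuming the purchases table matches the sample data above (and following the rolling-window approach described in the explanation that follows), is:

```sql
-- Sketch: flags a user as a power user when a window of three
-- consecutive months of purchases totals at least $3,000.
WITH monthly_spend AS (
  SELECT
    user_id,
    DATE_TRUNC('month', "timestamp") AS month,
    SUM(purchase_amount) AS monthly_total
  FROM purchases
  GROUP BY user_id, DATE_TRUNC('month', "timestamp")
),
windowed AS (
  SELECT
    user_id,
    month,
    monthly_total,
    SUM(monthly_total) OVER w AS three_month_total,
    COUNT(*)           OVER w AS months_in_window,
    MIN(month)         OVER w AS window_start
  FROM monthly_spend
  WINDOW w AS (PARTITION BY user_id
               ORDER BY month
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
)
SELECT user_id, month, monthly_total
FROM windowed
WHERE months_in_window = 3
  AND window_start = month - INTERVAL '2 months'  -- ensures the months are consecutive
  AND three_month_total >= 3000
ORDER BY user_id, month;
```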
In the above query, we first calculate a monthly total for each user. Then, for each user and month, we sum the purchases made in that month and the preceding two months. If that window covers three consecutive months and the total is at least $3,000, we identify the user as a power user.
To work on a similar customer analytics SQL question where you can solve it interactively and have your SQL query automatically checked, try this Walmart SQL Interview Question:
Assume you had a table of Perion Network employee salary data. Write a SQL query to find the employees who earn more than their direct manager.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | NULL |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager William Davis who earns $7,000.
You can solve this question interactively on DataLemur:
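A solution along the lines of the self-join explained below might look like this (assuming the table is named employees, with the columns shown in the sample data):

```sql
-- Self-join the employees table: one copy (emp) for employees,
-- one copy (mgr) for their managers.
SELECT
  emp.employee_id,
  emp.name AS employee_name
FROM employees AS emp
INNER JOIN employees AS mgr
  ON emp.manager_id = mgr.employee_id
WHERE emp.salary > mgr.salary;
```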
First, we perform a SELF-JOIN, treating one copy of the employees table as the managers' table and the other copy as the employees' table. Then we use a WHERE clause to filter the results, ensuring we only get employees whose salaries are higher than their manager's salary.
If the code above is confusing, you can find a step-by-step solution here: Employee Salaries Higher Than Their Manager.
A database index is a data structure that provides a quick lookup of data in one or more columns of a table.
There are several types of indexes that can be used in a database:
- Clustered index: determines the physical order of rows in the table, so a table can have only one.
- Non-clustered (secondary) index: a separate structure that stores the indexed values along with pointers back to the underlying rows; a table can have many.
- Unique index: enforces that no two rows share the same value in the indexed column(s).
- Composite index: spans multiple columns, which helps queries that filter or sort on that combination of columns.
- Full-text index: supports efficient searching of words and phrases within text columns.
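As a concrete illustration, here's how you might create indexes in PostgreSQL. The table and column names are hypothetical, not from Perion's actual schema:

```sql
-- Single-column index: speeds up lookups of ad clicks by campaign.
CREATE INDEX idx_ad_clicks_campaign
  ON ad_clicks (campaign_id);

-- Composite index: helps queries that filter on both campaign and date.
CREATE INDEX idx_ad_clicks_campaign_date
  ON ad_clicks (campaign_id, click_date);
```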
Perion Network wants to understand the cumulative behavior of their users in terms of product usage. They are interested in a month by month breakdown of cumulative active users (users who submit at least one review in the month) for each product in the company portfolio.
Assume that you have been provided a 'reviews' table containing the following columns: review_id (integer), user_id (integer), submit_date (timestamp), product_id (integer), and stars (integer - the rating given to the product).
Calculate the cumulative active users for each product on a monthly basis.
Sample input data from the 'reviews' table:
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
1 | 1 | 2022-01-01 00:00:00 | 1 | 5 |
2 | 2 | 2022-01-01 00:00:00 | 2 | 4 |
3 | 1 | 2022-02-01 00:00:00 | 1 | 3 |
4 | 2 | 2022-02-01 00:00:00 | 2 | 4 |
5 | 3 | 2022-03-01 00:00:00 | 1 | 5 |
6 | 1 | 2022-03-01 00:00:00 | 2 | 5 |
Our expected output should show, for each product, the cumulative count of distinct users who have submitted reviews, month by month.
mth | product_id | cumulative_active_users |
---|---|---|
2022-01-01 | 1 | 1 |
2022-01-01 | 2 | 1 |
2022-02-01 | 1 | 1 |
2022-02-01 | 2 | 1 |
2022-03-01 | 1 | 2 |
2022-03-01 | 2 | 2 |
This problem calls for PostgreSQL window functions. Your solution might look like this:
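A sketch, assuming the reviews table described above. Note that PostgreSQL does not allow COUNT(DISTINCT ...) as a window function, so we first find each user's first review month per product, then take a running sum:

```sql
WITH first_review AS (
  -- The month in which each user first reviewed each product.
  SELECT
    product_id,
    user_id,
    MIN(DATE_TRUNC('month', submit_date)) AS first_month
  FROM reviews
  GROUP BY product_id, user_id
),
monthly_new_users AS (
  -- How many users became reviewers of each product in each month.
  SELECT product_id, first_month AS mth, COUNT(*) AS new_users
  FROM first_review
  GROUP BY product_id, first_month
)
SELECT
  mth,
  product_id,
  SUM(new_users) OVER (PARTITION BY product_id ORDER BY mth)
    AS cumulative_active_users
FROM monthly_new_users
ORDER BY mth, product_id;
```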
This solution partitions data by product_id and, for each partition, orders data by the month and calculates the cumulative count of distinct users. The result is a list of each product's cumulative count of distinct users month by month.
p.s. Window functions show up super frequently during SQL interviews, so practice the 27+ window function questions on DataLemur
The EXCEPT operator (called MINUS in some databases) returns all rows from the first SELECT statement that are not returned by the second SELECT statement.
Note that EXCEPT is available in PostgreSQL and SQL Server, while Oracle uses the equivalent MINUS operator (but don't stress about knowing which DBMS supports which exact commands, since the interviewers at Perion Network should be lenient!).
For a tangible example in PostgreSQL, suppose you were doing an HR Analytics project for Perion Network, and had access to Perion Network's contractors and employees data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables. You could use the EXCEPT operator to find all contractors who were never an employee using this query:
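A minimal sketch, with illustrative table and column names:

```sql
-- Contractors who never appear in the employees table.
SELECT first_name, last_name
FROM contractors
EXCEPT
SELECT first_name, last_name
FROM employees;
```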
Perion Network is a global technology company specialized in delivering advertising solutions to brands and publishers. You have been asked to design a database to track the impact of online advertisements on product sales.
The ads table has columns for ad_id, product_id, display_date (the date the ad was displayed), and clicks (clicks on the ad).
The sales table has columns for sale_id, product_id, sale_date, and amount (the revenue from the sale).
You must write a SQL query that provides the total sales revenue for each product in the month following the advertisement display.
ad_id | product_id | display_date | clicks |
---|---|---|---|
1 | 101 | 2022-06-01 | 500 |
2 | 102 | 2022-06-02 | 300 |
3 | 103 | 2022-07-01 | 700 |
4 | 104 | 2022-07-02 | 200 |
sale_id | product_id | sale_date | amount |
---|---|---|---|
1 | 101 | 2022-07-15 | 1000 |
2 | 101 | 2022-07-20 | 1500 |
3 | 102 | 2022-07-05 | 800 |
4 | 103 | 2022-08-15 | 300 |
5 | 103 | 2022-08-18 | 200 |
6 | 104 | 2022-08-05 | 400 |
product_id | total_sales |
---|---|
101 | 2500 |
102 | 800 |
103 | 500 |
104 | 400 |
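One way to write this, assuming tables named ads and sales matching the columns above:

```sql
-- Total revenue per product in the calendar month after the ad ran.
SELECT
  a.product_id,
  SUM(s.amount) AS total_sales
FROM ads AS a
JOIN sales AS s
  ON s.product_id = a.product_id
 AND DATE_TRUNC('month', s.sale_date) =
     DATE_TRUNC('month', a.display_date) + INTERVAL '1 month'
GROUP BY a.product_id
ORDER BY a.product_id;
```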
In this query, we first join the ads and sales tables on product_id. We then keep only the sales whose sale_date falls in the calendar month following the advertisement display (obtained by adding one month to the truncated display_date). Finally, we group by product_id and sum the amount to get the total sales for each product.
To discover records in one table that are not present in another, you can utilize a LEFT JOIN and filter for rows where the right-side table's columns are NULL.
For example, say you had a table of Perion Network customers and a 2nd table of all purchases made with Perion Network. To find all customers who did not make a purchase, you'd use the following query:
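A sketch with illustrative column names (customers.id and purchases.customer_id):

```sql
-- Anti-join: customers with no matching purchase row.
SELECT c.*
FROM customers AS c
LEFT JOIN purchases AS p
  ON p.customer_id = c.id
WHERE p.id IS NULL;
```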
This query fetches all rows from the customers table, along with any matching rows from the purchases table. If there is no matching row in the purchases table, NULL values are returned for all of the right table's columns. The WHERE clause then keeps only the rows where purchases.id is NULL, leaving just the customers who have not made a purchase.
Perion Network, a global technology company, needs to find information about the average revenue generated by each product in the past 6 months. However, the data should include only those users who are active, i.e., their last login was within the previous month. Using the two tables below, write a SQL query to achieve this.
product_id | product_name |
---|---|
50001 | Product A |
69852 | Product B |
35012 | Product C |
98022 | Product D |
transaction_id | product_id | customer_id | transaction_date | revenue | last_login |
---|---|---|---|---|---|
101 | 50001 | 10 | 2022-04-10 | 20 | 2022-08-10 |
102 | 69852 | 14 | 2022-04-15 | 25 | 2022-02-01 |
103 | 50001 | 14 | 2022-08-05 | 30 | 2022-08-10 |
104 | 98022 | 20 | 2021-12-26 | 15 | 2022-07-20 |
105 | 35012 | 10 | 2022-06-10 | 10 | 2022-08-12 |
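One way to write this, assuming tables named products and transactions matching the columns above:

```sql
-- Average revenue per product over the last 6 months,
-- restricted to users who logged in within the last month.
SELECT
  p.product_name,
  AVG(t.revenue) AS avg_revenue
FROM transactions AS t
JOIN products AS p
  ON p.product_id = t.product_id
WHERE t.transaction_date >= CURRENT_DATE - INTERVAL '6 months'
  AND t.last_login >= CURRENT_DATE - INTERVAL '1 month'
GROUP BY p.product_name;
```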
This query joins the transactions table with the products table on product_id. It then applies the conditions that the transactions must have occurred within the last 6 months and that the users must be active, i.e., their last login is within the last month. The output presents the average revenue for each product, including only transactions from active users over the last 6 months.
Considering that Perion Network is a technology company involved in advertising solutions, a relevant question might be to find out the average number of ads clicked by each user for each campaign.
Let's imagine that we have a table called ad_clicks that keeps track of each time a user clicks on an ad, broken down by campaign:
click_id | user_id | campaign_id | click_date |
---|---|---|---|
101 | 25 | 1 | 01/01/2023 00:00:00 |
102 | 26 | 1 | 01/02/2023 00:00:00 |
103 | 25 | 2 | 01/03/2023 00:00:00 |
104 | 25 | 2 | 01/04/2023 00:00:00 |
105 | 26 | 1 | 01/05/2023 00:00:00 |
106 | 27 | 2 | 01/06/2023 00:00:00 |
107 | 26 | 2 | 01/07/2023 00:00:00 |
The question would then be: "What is the average number of ads clicked per user for each campaign?"
You can answer it using the AVG function in combination with the GROUP BY clause in SQL like this:
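A sketch, assuming the clicks table is named ad_clicks as above:

```sql
-- Average clicks per user, computed separately for each campaign.
SELECT
  campaign_id,
  AVG(click_count) AS avg_click_count
FROM (
  -- Clicks per user per campaign.
  SELECT campaign_id, user_id, COUNT(*) AS click_count
  FROM ad_clicks
  GROUP BY campaign_id, user_id
) AS per_user_clicks
GROUP BY campaign_id
ORDER BY campaign_id;
```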
This query first uses a subquery to count the number of clicks per user for each campaign (producing the click_count column), then takes the average of those counts for each campaign. The result is the average number of clicks per user for each campaign.
campaign_id | avg_click_count |
---|---|
1 | 1.5 |
2 | 1.33 |
Campaign 1 has an average of 1.5 clicks per user while campaign 2 has an average of 1.33 clicks per user.
To practice a very similar question try this interactive Facebook App Click-through Rate (CTR) Question which is similar for measuring click activity or this Twitter Histogram of Tweets Question which is similar for user click tracking.
The COALESCE() function can be used to replace NULL values with a specified default. For instance, if you were a Data Analyst at a company working on a customer analytics project to identify the most engaged customers, you might have access to the following customer engagement data:
customer_id | email_engagement | sms_engagement |
---|---|---|
101 | moderately_active | not_opted_in |
201 | un-subscribed | NULL |
301 | NULL | not_opted_in |
401 | not_active | very_active |
501 | very_active | very_active |
601 | NULL | NULL |
Before you could proceed with your analysis, you would need to replace the NULLs with the default email_engagement value (not_active) and the default sms_engagement value (not_opted_in).
To do this, you'd run the following query:
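A sketch, assuming the table is named customer_engagement:

```sql
-- Substitute per-column defaults wherever engagement is NULL.
SELECT
  customer_id,
  COALESCE(email_engagement, 'not_active') AS email_engagement,
  COALESCE(sms_engagement, 'not_opted_in') AS sms_engagement
FROM customer_engagement;
```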
This would get you the following output:
customer_id | email_engagement | sms_engagement |
---|---|---|
101 | moderately_active | not_opted_in |
201 | un-subscribed | not_opted_in |
301 | not_active | not_opted_in |
401 | not_active | very_active |
501 | very_active | very_active |
601 | not_active | not_opted_in |
For a company like Perion Network that is focused on delivering advertising and monetizing solutions, understanding click through conversion rates can be critical. The click through conversion rate measures how many viewers of a product actually go ahead to add the product to their cart. This exercise will require you to calculate the click-through conversion rates for products listed on a particular day.
You are given two tables – product_views and cart_adds – which capture the data for when users viewed a product and when they added a product to their cart, respectively.
product_id | user_id | view_date |
---|---|---|
101 | 12 | 06/08/2022 00:00:00 |
101 | 37 | 06/08/2022 00:00:00 |
152 | 12 | 06/10/2022 00:00:00 |
201 | 46 | 06/18/2022 00:00:00 |
101 | 12 | 06/20/2022 00:00:00 |
product_id | user_id | add_date |
---|---|---|
101 | 12 | 06/08/2022 00:00:00 |
201 | 46 | 06/18/2022 00:00:00 |
152 | 12 | 06/11/2022 00:00:00 |
201 | 46 | 06/18/2022 00:00:00 |
101 | 12 | 06/20/2022 00:00:00 |
Use these two tables to calculate the click-through conversion rate - the ratio of the number of times a product was added to the cart to the number of times it was viewed, for each product on each day.
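One way to write this, assuming tables named product_views and cart_adds matching the columns above:

```sql
WITH views AS (
  -- Daily view counts per product.
  SELECT product_id, view_date::date AS dt, COUNT(*) AS view_count
  FROM product_views
  GROUP BY product_id, view_date::date
),
adds AS (
  -- Daily add-to-cart counts per product.
  SELECT product_id, add_date::date AS dt, COUNT(*) AS add_count
  FROM cart_adds
  GROUP BY product_id, add_date::date
)
SELECT
  v.product_id,
  v.dt AS activity_date,
  ROUND(COALESCE(a.add_count, 0)::numeric / v.view_count, 2)
    AS conversion_rate
FROM views AS v
LEFT JOIN adds AS a
  ON a.product_id = v.product_id
 AND a.dt = v.dt
ORDER BY v.dt, v.product_id;
```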
This query first counts the number of views and adds for each product and date separately. It then joins the two derived tables on product_id and date and calculates the conversion rate as the ratio of adds to views. If a product was viewed on a certain day but never added to a cart, the LEFT JOIN ensures the product still appears in the results with a conversion rate of 0.
To solve a similar SQL problem on DataLemur's free interactive SQL code editor, solve this Meta SQL interview question:
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Perion Network SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the earlier Perion Network SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Amazon, Microsoft, Meta, and smaller tech companies.
Each exercise has multiple hints, step-by-step solutions and, best of all, an interactive coding environment so you can write your query online and have it graded.
To prep for the Perion Network SQL interview, it can also be helpful to solve SQL questions from other tech companies like:
If your SQL query skills are weak, don't dive straight into solving questions – first learn SQL with this interactive SQL tutorial.
This tutorial covers SQL topics like LEAD/LAG window functions and different types of joins – both of which pop up routinely in Perion Network interviews.
Besides SQL interview questions, the other types of questions to prepare for the Perion Network Data Science Interview are:
First, understand how Perion uses AI/Data Science. Then, you should read Ace the Data Science Interview.
The book covers 201 data interview questions taken from Facebook, Google & startups, plus a refresher covering Stats, SQL & ML. It's also vouched for by the data community, which is how it became an Amazon best-seller with over 1,000 5-star reviews.