At Adevinta, SQL does the heavy lifting: analyzing user interaction data to optimize the user experience, and managing large databases to keep targeted advertising efficient. That's why Adevinta frequently asks SQL problems during interviews for Data Analytics, Data Science, and Data Engineering jobs.
To help you prepare for the Adevinta SQL interview, this blog covers 11 Adevinta SQL interview questions. Can you answer them all?
Adevinta is a company that values its power users, classified as the customers who made the highest number of purchases within a specific time frame. As an SQL analyst, your task is to write a query to identify these power users. The key information needed for this analysis is user_id, product_id, purchase_id, and purchase_date.
Below is a snapshot of the table for Adevinta.
purchase_id | user_id | product_id | purchase_date |
---|---|---|---|
1011 | 105 | 10001 | 2022-06-18 |
1032 | 202 | 10002 | 2022-07-10 |
1078 | 202 | 10001 | 2022-07-15 |
1183 | 306 | 10003 | 2022-07-22 |
1105 | 202 | 10004 | 2022-08-05 |
You are expected to write a query that reads this table, computes each user's total number of purchases, and classifies the heaviest purchasers as power users. The output should contain the user_id, the total number of purchases, and the classification.
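One possible solution is sketched below in PostgreSQL. Since the prompt doesn't name the table or fix a cutoff, the table name purchases, the CTE name user_purchases, and the 3-purchase threshold are assumptions for illustration:

```sql
-- Assumed: table name purchases, CTE name user_purchases, threshold of 3
WITH user_purchases AS (
    SELECT
        user_id,
        COUNT(purchase_id) AS total_purchases
    FROM purchases
    GROUP BY user_id
)
SELECT
    user_id,
    total_purchases,
    CASE
        WHEN total_purchases >= 3 THEN 'Power User'
        ELSE 'Regular User'
    END AS user_classification
FROM user_purchases
ORDER BY total_purchases DESC;
```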
This PostgreSQL query first uses a common table expression (CTE) to group the purchases by user_id and calculate the total number of purchases for each user. It then selects from this CTE the user_id, the purchase total, and a derived column that labels each user as a 'Power User' if they have 3 or more total purchases, and as a 'Regular User' otherwise. Lastly, the query orders the results by the total number of purchases in descending order.
To solve a super-customer analysis question on DataLemur's free interactive SQL code editor, try this Microsoft SQL Interview problem.
Adevinta is a digital marketplace specialist, offering platforms to connect sellers and buyers, presenting a variety of goods and services. A potential dataset, therefore, could relate to user activity, in particular, ad postings on the site.
Let's assume we have a table that records every ad posting a user makes. Company management wants to understand user behavior and see if there are any patterns in the frequency of ad postings. More specifically, they want to know, for a given user, which month had the most posting activity between January 2019 and December 2019.
activity_id | user_id | post_date | ad_id |
---|---|---|---|
01 | 123 | 02/08/2019 00:00:00 | 9836 |
02 | 456 | 02/10/2019 00:00:00 | 7486 |
03 | 123 | 06/18/2019 00:00:00 | 2698 |
04 | 789 | 07/26/2019 00:00:00 | 5623 |
05 | 456 | 07/05/2019 00:00:00 | 1249 |
06 | 123 | 12/01/2019 00:00:00 | 7486 |
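A query consistent with the explanation below might look like this sketch; the table name ad_postings is an assumption, since the prompt leaves it unnamed:

```sql
-- Assumed table name: ad_postings
SELECT DISTINCT
    user_id,
    EXTRACT(MONTH FROM post_date) AS month,
    COUNT(ad_id) OVER (
        PARTITION BY user_id, EXTRACT(MONTH FROM post_date)
    ) AS ads_posted
FROM ad_postings
WHERE post_date >= '2019-01-01'
  AND post_date < '2020-01-01'
ORDER BY user_id, month;
```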
The query above uses PARTITION BY to create a window grouped by user_id and posting month. It counts the rows inside each window, effectively counting the number of ads each user posted in each month. The WHERE clause ensures we only consider data from 2019, and the result is sorted by user_id and month for ease of reading.
To solve another window function question on DataLemur's free interactive coding environment, try this Google SQL Interview Question.
Denormalization is the process of modifying a database schema in a way that deviates from the typical rules of normalization (1NF, 2NF, 3NF, etc.).
Denormalization is often used to improve the performance of a database, particularly when it is used for reporting and analytical purposes rather than for Online Transaction Processing (OLTP) workloads.
By duplicating data, denormalization can reduce the number of expensive joins required to retrieve data, which can improve query performance. However, denormalization can also cause problems such as increased data redundancy and the need for more complex update and delete operations.
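As a hypothetical illustration (the table and column names here are invented for the example), suppose reports constantly need each listing alongside its seller's location. A normalized schema forces a join on every report; a denormalized one copies the location onto the listings table so the report reads a single table:

```sql
-- Normalized: user_location lives only in users, so the report must join
SELECT l.listing_id, u.user_location
FROM listings l
JOIN users u ON u.user_id = l.user_id;

-- Denormalized: user_location is copied onto each listing row at write
-- time, so the same report becomes a single-table scan
SELECT listing_id, user_location
FROM listings_denormalized;
```

The trade-off described above is visible here: whenever a user moves, every one of their listing rows has to be updated to keep the duplicated column consistent.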
Adevinta is a global online classifieds specialist. For this hypothetical scenario, assume we have three tables - 'Users', 'Listings', and 'Transactions'. The 'Users' table includes information about the users listing their items, 'Listings' includes information about the items listed and 'Transactions' includes information about the transactions (buy/sell) happening against each listing.
Our goal is to determine the total revenue generated by product categories in a specific year (e.g., 2022).
Below are the example tables:
Users table:

Column Name | Description |
---|---|
user_id | Unique ID of the listing's creator |
user_location | Location of the user |
Sample Records:
user_id | user_location |
---|---|
123 | France |
265 | Spain |
362 | Norway |
Listings table:

Column Name | Description |
---|---|
listing_id | Unique ID for the listing |
user_id | The ID of the user who created the listing |
product_category | Category of the product |
listing_price | Price at which the item is listed |
Sample Records:
listing_id | user_id | product_category | listing_price |
---|---|---|---|
50001 | 123 | Electronics | 150 |
69852 | 265 | Automobiles | 2000 |
50002 | 362 | Furniture | 100 |
Transactions table:

Column Name | Description |
---|---|
transaction_id | Unique ID for the transaction |
listing_id | The ID of the listing the transaction is related to |
transaction_date | The date the transaction was executed |
Sample Records:
transaction_id | listing_id | transaction_date |
---|---|---|
6171 | 50001 | 2022-06-08 |
7802 | 69852 | 2022-06-10 |
5293 | 50002 | 2022-06-18 |
6352 | 69852 | 2022-07-26 |
4517 | 69852 | 2022-07-05 |
Here is one PostgreSQL query that solves the problem; since the Transactions table carries no amount column, this sketch treats the listing price as the revenue of each transaction:
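```sql
SELECT
    l.product_category,
    SUM(l.listing_price) AS total_revenue
FROM Listings l
JOIN Transactions t
    ON t.listing_id = l.listing_id
WHERE EXTRACT(YEAR FROM t.transaction_date) = 2022
GROUP BY l.product_category;
```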
This query first joins 'Listings' and 'Transactions' on 'listing_id'. Then, it only considers the transactions made in the year 2022 and finally groups the data by 'product_category' to calculate the total revenue.
A NULL value represents a missing or unknown value. It is different from a zero or a blank space, which are actual values. It is important to handle NULL values properly in SQL because they can cause unexpected results if not treated correctly.
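For instance, any comparison with NULL evaluates to unknown rather than true, so = NULL never matches anything; you test with IS NULL / IS NOT NULL, or substitute a default via COALESCE. A minimal sketch against a hypothetical listings table:

```sql
-- Wrong: '= NULL' matches nothing, because NULL is unknown, not a value
SELECT * FROM listings WHERE listing_price = NULL;

-- Right: IS NULL finds the listings with no price set
SELECT * FROM listings WHERE listing_price IS NULL;

-- COALESCE replaces NULL with a default (0 here) in the output
SELECT listing_id, COALESCE(listing_price, 0) AS price
FROM listings;
```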
Assume Adevinta, a digital marketplace that offers ads and product listings, wants to track and analyze the click-through conversion rates for their product ads. Specifically, they are interested in finding out the conversion rate of users who viewed a product ad and subsequently added that product to their cart in the same session.
The following tables show a simplified structure of two tables Adevinta might have: the first records each time a user clicks on a product ad, and the second tracks each time a user adds a product to their cart.
click_id | user_id | session_id | product_id | ad_click_timestamp |
---|---|---|---|---|
1 | 200 | 105 | 3001 | 06/08/2022 10:00:00 |
2 | 200 | 105 | 3002 | 06/08/2022 10:10:00 |
3 | 150 | 203 | 3003 | 06/10/2022 12:00:00 |
4 | 150 | 989 | 3001 | 06/11/2022 15:00:00 |
5 | 400 | 473 | 3002 | 06/18/2022 20:00:00 |
add_id | user_id | session_id | product_id | cart_add_timestamp |
---|---|---|---|---|
1 | 200 | 105 | 3001 | 06/08/2022 10:05:00 |
2 | 150 | 203 | 3003 | 06/10/2022 12:05:00 |
3 | 200 | 105 | 3002 | 06/08/2022 10:15:00 |
4 | 400 | 473 | 3002 | 06/18/2022 20:05:00 |
5 | 150 | 203 | 3001 | 06/10/2022 14:00:00 |
The objective is to write a SQL query that will provide the overall click-through conversion rate.
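One possible PostgreSQL solution is sketched below; the table names ad_clicks and cart_adds are assumptions, since the original table names were not given:

```sql
-- Assumed table names: ad_clicks, cart_adds
WITH total_clicks AS (
    SELECT COUNT(*) AS num_clicks
    FROM ad_clicks
),
successful_clicks AS (
    -- a click converts if the same user added the same product to the
    -- cart in the same session, at or after the click
    SELECT COUNT(DISTINCT c.click_id) AS num_conversions
    FROM ad_clicks c
    JOIN cart_adds a
      ON c.user_id = a.user_id
     AND c.session_id = a.session_id
     AND c.product_id = a.product_id
     AND a.cart_add_timestamp >= c.ad_click_timestamp
)
SELECT ROUND(s.num_conversions::DECIMAL / t.num_clicks, 2) AS conversion_rate
FROM total_clicks t
CROSS JOIN successful_clicks s;
```

On the sample data this yields 0.80: four of the five clicks are followed by a matching cart add in the same session.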
The above PostgreSQL query first calculates the total number of ad clicks. Then, it calculates the successful clicks - i.e., the instances where a click on an ad was followed by adding the same product to the cart during the same session. Finally, it calculates the conversion rate by dividing the number of successful clicks by the total number of clicks.
To solve a similar SQL problem on DataLemur's free interactive coding environment, try this Facebook SQL Interview question.
The EXCEPT operator returns all rows from the first SELECT statement that are not returned by the second SELECT statement. EXCEPT is available in PostgreSQL and SQL Server; Oracle's equivalent operator is called MINUS.
For a tangible example, suppose you were doing an HR Analytics project for Adevinta and had access to Adevinta's contractor and employee data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables. You could use the EXCEPT operator to find all employees who were never contractors with a query along these lines (the table and column names are assumptions for illustration):
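```sql
-- Table and column names are assumptions for this example
SELECT first_name, last_name
FROM employees
EXCEPT
SELECT first_name, last_name
FROM contractors;
```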
Given a listings table and a categories table, find the average listing price per category for all the listings made in the year 2022.
The listings table has the columns listing_id, user_id, category_id, listing_price, and listing_date; the categories table has category_id and category_name. Sample records for each are shown below.
listing_id | user_id | category_id | listing_price | listing_date |
---|---|---|---|---|
1 | 101 | 1 | 100.00 | 2022-01-15 |
2 | 102 | 2 | 500.00 | 2022-05-10 |
3 | 103 | 1 | 150.00 | 2022-03-20 |
4 | 104 | 3 | 300.00 | 2022-07-15 |
5 | 105 | 2 | 600.00 | 2022-09-25 |
category_id | category_name |
---|---|
1 | Electronics |
2 | Automotive |
3 | Clothing |
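A PostgreSQL query for this could look like the following, assuming the two tables are named listings and categories:

```sql
SELECT
    c.category_name,
    AVG(l.listing_price) AS avg_listing_price
FROM listings l
JOIN categories c
    ON c.category_id = l.category_id
WHERE EXTRACT(YEAR FROM l.listing_date) = 2022
GROUP BY c.category_name;
```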
This query joins the listings and categories tables on category_id, filters for listings from the year 2022 using the EXTRACT function, and then uses the GROUP BY clause and the AVG function to get the average listing price per category.
At Adevinta, the team needs to understand customer order patterns to improve their sales strategy. Your task is to write a SQL query that combines the customer table and the orders table to return each customer's ID, name, and the total number of orders they have placed.

In our database, each customer is identified by a unique customer_id, and each order has a corresponding order_id and customer_id.
Sample tables:
customer_id | first_name | last_name |
---|---|---|
1 | John | Doe |
2 | Jane | Smith |
3 | Alice | Johnson |
4 | Bob | Davis |
order_id | customer_id | order_date | item |
---|---|---|---|
100 | 1 | 06/08/2022 | Laptop |
101 | 2 | 06/10/2022 | Mobile phone |
102 | 1 | 06/18/2022 | Headphones |
103 | 3 | 07/26/2022 | Tablet |
104 | 2 | 07/05/2022 | TV |
Expected output:

customer_id | first_name | last_name | total_orders |
---|---|---|---|
1 | John | Doe | 2 |
2 | Jane | Smith | 2 |
3 | Alice | Johnson | 1 |
4 | Bob | Davis | 0 |
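A query matching this output could look like the sketch below; the table names customers and orders are assumptions:

```sql
-- Assumed table names: customers, orders
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    COUNT(o.order_id) AS total_orders
FROM customers c
LEFT JOIN orders o
    ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY c.customer_id;
```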
In this query, we use the LEFT JOIN clause to combine the customers and orders tables. The GROUP BY clause groups the results by customer ID, first name, and last name, and the COUNT function tallies the orders per customer. If a customer hasn't placed any orders, total_orders will be 0, because COUNT(order_id) skips the NULLs produced by the LEFT JOIN.
Because joins come up so often during SQL interviews, try this interactive Snapchat JOIN SQL interview question.
There are several advantages to normalizing a database, including less redundancy, more flexibility, and better performance.
Less Redundancy: Normalization reduces redundancy by breaking down a larger, more general table into smaller, more specific tables. This reduces the amount of data that needs to be accessed for queries.
More Flexibility: Normalization makes it easier to modify the structure of the database, since with less redundancy you can change one table without affecting others. This makes it easier to adapt the database to changing business needs (a very real need at Adevinta!).
Better Performance: Normalization can improve the performance of the database by reducing the amount of data that needs to be stored and accessed in a single table. This can result in faster query times and better overall performance.
Given a table that holds the x and y coordinates for points A and B, calculate the distance between these two points for each record using the Euclidean distance formula: sqrt((x2 - x1)² + (y2 - y1)²). Round the result to 2 decimal places. Sample data:
id | point_a_x | point_a_y | point_b_x | point_b_y |
---|---|---|---|---|
1 | 7.6 | 3.8 | 4.2 | 9.1 |
2 | 3.5 | 7.3 | 8.9 | 1.6 |
3 | 6.2 | 8.4 | 2.6 | 3.1 |
Expected output:

id | distance |
---|---|
1 | 6.30 |
2 | 7.85 |
3 | 6.41 |
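A PostgreSQL query for this could look like the following sketch; the table name points is an assumption:

```sql
-- Assumed table name: points
SELECT
    id,
    ROUND(
        SQRT(
            POWER(point_b_x - point_a_x, 2)
            + POWER(point_b_y - point_a_y, 2)
        )::NUMERIC,
        2
    ) AS distance
FROM points;
```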
This query calculates the Euclidean distance for each record using the provided coordinates for points A and B. The POWER function squares the difference between the two points' x and y coordinates, the SQRT function takes the square root of their sum, and the ROUND function rounds the result to 2 decimal places.
To practice a very similar question, try this interactive Alibaba Compressed Mean Question, which also exercises mathematical functions and rounding, or this Google Median Google Search Frequency Question, which likewise applies numerical operations to given data.
The key to acing an Adevinta SQL interview is to practice, practice, and then practice some more! Beyond solving the above Adevinta SQL interview questions, you should also work through the 200+ tricky SQL questions on DataLemur, which come from companies like Google, Facebook, Microsoft, and Amazon.
Each DataLemur SQL question has multiple hints and full answers, and best of all, there is an online SQL code editor so you can instantly run your query and have it checked.
To prep for the Adevinta SQL interview, it is also a great idea to practice SQL problems from other tech companies.
However, if your SQL coding skills are weak, forget about diving straight into solving questions – go learn SQL with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers SQL topics like inner vs. outer JOINs and the LAG window function, both of which show up often in SQL job interviews at Adevinta.
Beyond writing SQL queries, you should also prepare for the other types of questions asked in the Adevinta Data Science Interview.
To prepare for Adevinta Data Science interviews, it's also worth reading the book Ace the Data Science Interview.