At ODDITY Tech, SQL is typically used for analyzing customer behavioral data to find trends, which informs the companies development of beauty & wellness products. That's why ODDITY Tech often tests SQL questions during interviews for Data Science and Data Engineering positions.
As such, to help you ace the ODDITY Tech SQL interview, here’s 9 ODDITY Tech SQL interview questions – can you solve them?
Suppose you are given a table called that records all the sales transactions at ODDITY Tech. Your task is to write a PostgreSQL query that returns a new table that shows the cumulative sum of products sold each day.
The table is defined as follows:
sale_id | product_id | sale_date | number_sold |
---|---|---|---|
1001 | 600 | 2022-09-2 | 5 |
1002 | 500 | 2022-09-2 | 3 |
1003 | 600 | 2022-09-3 | 2 |
1004 | 500 | 2022-09-3 | 1 |
1005 | 600 | 2022-09-4 | 6 |
In the output, each row contains the , and the cumulative sum () of up to that date (inclusive) for that product.
For example, the entry for 2022-09-2, product 600 should have a of 5, and the entry for the same product on 2022-09-3 should have a of 7 (5 from previous day + 2 from current day).
Your output should be ordered by and then by .
sale_date | product_id | cumulative_sum |
---|---|---|
2022-09-2 | 500 | 3 |
2022-09-3 | 500 | 4 |
2022-09-2 | 600 | 5 |
2022-09-3 | 600 | 7 |
2022-09-4 | 600 | 13 |
You can solve this problem using a window function. Here is the PostgreSQL query:
In this query, the window function computes a cumulative sum of for each product () over the days in ascending order. The resulting table will display for each day, for each product, the cumulative sum of the product sold up to that date.
Pro Tip: Window functions are a popular SQL interview topic, so practice all the window function problems on DataLemur
ODDITY Tech is a growing tech company that has a range of different products available. They sell their products globally and are interested in understanding which of their products are performing the best in terms of sales.
As their database specialist, your task is to provide a query which should return the product with the most units sold. Assume you are given two different tables and .
order_id | product_id | order_date | units_sold |
---|---|---|---|
1023 | 123 | 06/08/2022 | 15 |
1594 | 567 | 06/10/2022 | 10 |
2015 | 123 | 06/18/2022 | 20 |
3030 | 890 | 06/28/2022 | 5 |
4098 | 123 | 07/02/2022 | 30 |
product_id | product_name |
---|---|
123 | Armageddon Mouse |
567 | Phantom Keyboard |
890 | Specter Monitor |
Your query should return the product Id, product name, and the total units sold.
This query joins the and tables on the field. It then sums up the for each product using the aggregate function. The result is grouped by and to give the total for each distinct product. The results are then ordered in descending order of and the top 1 result, which represents the product with the most units sold, is returned.
Database normalization is the process of breaking down a table into smaller and more specific tables and defining relationships between them via foreign keys. This minimizes redundancy, and creates a database that's more flexible, scalable, and easier to maintain. It also helps to ensure the integrity of the data by minimizing the risk of data inconsistencies and anomalies.
ODDITY Tech is an E-commerce company that sells various types of digital products. The company places digital ads and monitors click-through-rates. Recently, they launched a new product and built a new landing page. They want to determine the effectiveness of this page. The question here is to write a SQL query that calculates the click-through-rate (CTR) -- tracked from the ads clicked to reaching the landing page, and then the conversion rate which is from reaching the landing page to actually adding the product to the cart.
Here are the two tables with fictional data that might be useful for this question:
click_id | user_id | click_date | ad_id | landing_page_id |
---|---|---|---|---|
1213 | 452 | 08/11/2022 00:00:00 | 5001 | 3001 |
2435 | 978 | 08/12/2022 00:00:00 | 5002 | 3002 |
3982 | 635 | 08/16/2022 00:00:00 | 5001 | 3001 |
5101 | 457 | 08/19/2022 00:00:00 | 5002 | 3002 |
6598 | 252 | 08/20/2022 00:00:00 | 5001 | 3001 |
conversion_id | user_id | conversion_date | product_id | cart_id |
---|---|---|---|---|
1310 | 452 | 08/11/2022 00:15:00 | 4001 | 2001 |
2537 | 978 | 08/12/2022 00:20:00 | 4002 | 2002 |
3932 | 635 | 08/16/2022 05:05:00 | 4001 | 2001 |
5230 | 457 | 08/19/2022 08:00:00 | 4002 | 2002 |
6691 | 252 | 08/20/2022 10:35:00 | 4001 | 2001 |
The above SQL query first joins the two tables on the "user_id" column, with a condition of the conversion event happening after the click event. It then calculates both click count and conversion count, as well as the conversion rate (conversion count / click count * 100%).
Results may vary from time to time due to the new queries and data added. As per the given data, the query calculates the click-through rate from the clicks to the new product's landing page, and the conversion rate from the page to adding the product to the cart. The CTR and conversion rates are useful metrics for understanding the effectiveness of digital advertising and landing page design.
To solve a related SQL problem on DataLemur's free interactive coding environment, solve this Meta SQL interview question:
The function can take in multiple paramaters, and returns the first input paramater that is not null. If all arguments are null, the COALESCE function will return null too.
Suppose you have a table of ODDITY Tech salespeople and the number of deals they closed. If a salesperson didn't close any deals, the data from the 3rd-party CRM system exports a NULL value.
sales_person | closed_deals |
---|---|
Jason Wright | NULL |
Drew Jackson | 3 |
Chris Ho | NULL |
Adam Cohen | 2 |
Samantha Perez | 4 |
To change these NULLs to zeros, you can use the function in the following way:
This would result in the following data:
sales_person | closed_deals |
---|---|
Jason Wright | 0 |
Drew Jackson | 3 |
Chris Ho | 0 |
Adam Cohen | 2 |
Samantha Perez | 4 |
ODDITY Tech wants to conduct a marketing campaign targeting customers who belong to tech companies. In order to identify the customers to target, we need you to compile a list of all customers with the string 'tech' in their email addresses.
We standardize our email address as 'firstname.lastname@company.com'. Thus, if the 'tech' string is in the company's name, it means that customer is associated with a tech company.
We provide you with a table containing customer records for this task, structured as follows:
customer_id | firstname | lastname | email_address |
---|---|---|---|
101 | Jane | Doe | jane.doe@techstars.com |
102 | John | Smith | john.smith@odditytech.com |
103 | Mary | Johnson | mary.johnson@apple.com |
104 | James | Brown | james.brown@microsoft.com |
105 | Patricia | Williams | patricia.williams@google.com |
This query uses the operator to filter out rows where the column contains the string 'tech'. The '%' character is a wildcard in SQL that matches any sequence of characters. Therefore, '%tech%' matches any that contains 'tech' anywhere within it.
It should provide an output similar to this:
customer_id | firstname | lastname | email_address |
---|---|---|---|
101 | Jane | Doe | jane.doe@techstars.com |
102 | John | Smith | john.smith@odditytech.com |
The list will include only customers tied to tech companies, making it easier for ODDITY Tech to conduct their focused marketing campaign.
A UNIQUE constraint ensures that all values in a column are different. This provides uniqueness for the column(s) and helps identify each row uniquely. Unlike primary key, there can be multiple unique constraints defined per table.
As an analyst at ODDITY Tech, you are tasked with analyzing the company's database. You need to understand how customers interact with the products by joining two tables, and .
The table contains records of all customers with columns as , , , and . The table, on the other hand, stores information related to all the products that ODDITY Tech provides with columns like , , , and . You are also given an table that records which customers bought what product. The table has the following columns: , , , and .
Write a SQL query to list all customers along with the details of all products they purchased, including the product category and price.
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@gmail.com |
2 | Jane | Smith | jane.smith@gmail.com |
product_id | product_name | product_category | price |
---|---|---|---|
101 | iPad | Electronics | 699 |
201 | T-shirt | Apparel | 25 |
order_id | customer_id | product_id | order_date |
---|---|---|---|
1001 | 1 | 101 | 2022-10-01 |
1002 | 2 | 201 | 2022-10-02 |
This SQL query uses the JOIN clause to combine rows from three tables based on related columns. Here the related columns are in the and tables, and in the and tables. It lists all the customers along with the details of the products they purchased, showing product category and price for each product.
Because joins come up frequently during SQL interviews, take a stab at this interactive Snapchat SQL Interview question using JOINS:
ODDITY Tech is planning a sale for their store, they will be discounting the price of their products based on the product category. The company provided two tables, which contains the product's original price and which contains the category and corresponding discount percentage.
The task is to come up with a query that determines the new price after discount for each product. As part of the exercise, calculate also the savings for each item i.e., the difference between the original price and the discounted price. Round the output prices to 2 decimal places.
item_id | category | price |
---|---|---|
1 | Electronics | 109.99 |
2 | Books | 24.99 |
3 | Fashion | 49.99 |
4 | Home | 79.99 |
5 | Electronics | 199.99 |
category | discount% |
---|---|
Electronics | 10 |
Books | 5 |
Fashion | 20 |
Home | 15 |
item_id | category | original_price | discounted_price | savings |
---|---|---|---|---|
1 | Electronics | 109.99 | 98.99 | 11.00 |
2 | Books | 24.99 | 23.74 | 1.25 |
3 | Fashion | 49.99 | 39.99 | 10.00 |
4 | Home | 79.99 | 67.99 | 12.00 |
5 | Electronics | 199.99 | 179.99 | 20.00 |
In the context of PostgreSQL, the above query begins by selecting the required columns from the and tables. It then calculates the discounted price by subtracting the product of item price and discount (divided by 100) from the original price. This is to establish the discounted price. The ROUND function is incorporated to render the discounted price and savings up to 2 decimal points. This query will output the item's category, original price, discounted price, and savings. An inner join is employed to merge the and table based upon a common column, 'category'.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for product category relation or this CVS Health Pharmacy Analytics (Part 1) Question which is similar for calculating profit.
The key to acing a ODDITY Tech SQL interview is to practice, practice, and then practice some more! Beyond just solving the earlier ODDITY Tech SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Google, Microsoft and Silicon Valley startups.
Each problem on DataLemur has hints to guide you, step-by-step solutions and best of all, there is an online SQL coding environment so you can instantly run your SQL query and have it graded.
To prep for the ODDITY Tech SQL interview it is also wise to practice SQL questions from other tech companies like:
However, if your SQL coding skills are weak, don't worry about jumping right into solving questions – improve your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers SQL topics like advantages of CTEs vs. subqueries and aggregate functions – both of these come up routinely in SQL interviews at ODDITY Tech.
In addition to SQL interview questions, the other question categories to practice for the ODDITY Tech Data Science Interview are:
The best way to prepare for ODDITY Tech Data Science interviews is by reading Ace the Data Science Interview. The book's got: