At Adobe, SQL is used all the damn time for analyzing customer usage patterns for it's design products, and for managing large-scale customer data as part of the Adobe Experience Platform.
Because of this, Adobe almost always evaluates jobseekers on SQL query questions during interviews for Data Analytics, Data Science, and Data Engineering jobs. That's why we've collected 9 Adobe SQL interview questions to practice, which are similar to recently asked questions at Adobe – can you solve them?
For every customer that bought Photoshop, return a list of the customers, and the total spent on all the products except for Photoshop products.
Sort your answer by customer ids in ascending order.
Column Name | Type |
---|---|
customer_id | integer |
product | string |
revenue | integer |
customer_id | product | revenue |
---|---|---|
123 | Photoshop | 50 |
123 | Premier Pro | 100 |
123 | After Effects | 50 |
234 | Illustrator | 200 |
234 | Premier Pro | 100 |
customer_id | revenue |
---|---|
123 | 150 |
Explanation: User 123 bought Photoshop, Premier Pro + After Effects, spending $150 for those products. We don't output user 234 because they didn't buy Photoshop.
The dataset you are querying against may have different input & output - this is just an example!
Practice this question on DataLemur's free interactive coding environment, solve this Adobe SQL Interview Question:
Adobe has an extensive database of its customers. They often want to focus on active users who are highly engaged with their products. For this question, suppose that Adobe considers an active user as someone who has used any one of their products more than 4 times in a month. Additionally, Adobe is interested in users who provide reviews with at least 4-star ratings.
Given the following sample data tables for users, usage, and reviews, write a query to find the users who meet the above criteria.
user_id | name | sign_up_date |
---|---|---|
123 | John Doe | 05/01/2022 |
265 | Jane Smith | 05/10/2022 |
362 | Alice Johnson | 06/15/2022 |
192 | Bob Brown | 06/20/2022 |
981 | Charlie Davis | 07/01/2022 |
user_id | product | usage_date |
---|---|---|
123 | Photoshop | 06/05/2022 |
123 | Photoshop | 06/06/2022 |
123 | Photoshop | 06/07/2022 |
123 | Photoshop | 06/08/2022 |
123 | Photoshop | 06/09/2022 |
265 | Lightroom | 07/10/2022 |
265 | Lightroom | 07/11/2022 |
265 | Lightroom | 07/12/2022 |
362 | Illustrator | 07/17/2022 |
362 | Illustrator | 07/18/2022 |
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 06/08/2022 | Photoshop | 4 |
5293 | 362 | 07/18/2022 | Illustrator | 5 |
7802 | 265 | 07/12/2022 | Lightroom | 4 |
This query first creates a subquery that groups the usage data by user_id for the current month and counts the distinct dates on which each user logged usage. It then joins this with the users table to extract the names of the users. The outer query then checks that there exists at least one review by each user with a rating of 4 or more.
Cross join and natural join are like two sides of a coin in the world of SQL.
Cross joins is like the wild and reckless cousin who creates a giant new table by combining every row from table A with every row from table B, no questions asked, no common key needed.
Natural joins are like the more refined and selective cousin who only combines rows from multiple tables if they have something in common (i.e., common columns/keys).
While cross join doesn't discriminate and will create a massive table if given the chance, natural join is more selective and only returns a table with the number of rows equal to the number of matching rows in the input tables. So, choose your JOIN wisely!
For a company like Adobe, which is known for its various software subscriptions such as Adobe Acrobat, Photoshop, Illustrator and so on, a good question would be to find the average monthly duration of user subscriptions. This would be especially interesting if Adobe wants to know how long users keep using their software on average.
Let's first create some sample tables:
subscription_id | user_id | product_id | start_date | end_date |
---|---|---|---|---|
1001 | 50 | 3001 | 01/01/2022 | 04/01/2022 |
1002 | 70 | 3002 | 01/02/2022 | 03/01/2022 |
1003 | 90 | 3001 | 01/03/2022 | 01/04/2022 |
1004 | 120 | 3002 | 01/04/2022 | 01/10/2022 |
1005 | 200 | 3003 | 01/05/2022 | 01/08/2022 |
product_id | product_name |
---|---|
3001 | Adobe Photoshop |
3002 | Adobe Acrobat |
3003 | Adobe Illustrator |
Now let's write a SQL query to get the average subscription duration.
This query first joins the subscriptions and products tables based on the product_id. AVG is used to get the average duration of subscriptions for each product. The DATEDIFF function is used to calculate the length of each subscription in days. By grouping by the product name, this query will return the average subscription duration in days for each Adobe product.
To practice a very similar question try this interactive Facebook Active User Retention Question which is similar for focusing on users' activity over time or this Amazon Average Review Ratings Question which is similar for calculating averages over a certain timeframe.
Views are a lot like virtual tables, where you can take a base table and customize it (such as by hiding some data from non-admin users, or removing some random columns/rows based on business requirements).
Here's the PostgreSQL syntax for creating a view based on data in the table:
While both types of databases are used to store data (no duh!), relational databases and non-relational (also known as NoSQL databases) differ in a few important ways, most importantly on the way data is stored. Relational databases use a data model consisting of tables and rows, while NoSQL databases use a variety of data models, including document, key-value, columnar, and graph storage formats.
This added flexibilty makes NoSQL databases great for non-tabular data (like hierarchal data or JSON data), or data where the type/format is constantly evolving. With this added flexibility, comes one big weakness – you won't get ACID-compliance. That means, unlike relational databases which are typically adhere to the ACID properties (atomic, consistent, isolated, and durable), you don't get as strong guarentees with most non-relational databases.
Adobe is interested in analyzing their customer database to understand the buying behavior of their customers. Given a table and an table, write a SQL query to join these two tables together and return the average order value by customers.
customer_id | first_name | last_name | create_date | |
---|---|---|---|---|
101 | John | Doe | john.doe@example.com | 01/01/2020 |
102 | Jane | Doe | jane.doe@example.com | 02/02/2020 |
103 | Bob | Smith | bob.smith@example.com | 03/03/2020 |
104 | Alice | Johnson | alice.johnson@example.com | 04/04/2020 |
order_id | customer_id | order_date | total_amount |
---|---|---|---|
201 | 101 | 05/05/2020 | 200 |
202 | 101 | 06/06/2020 | 300 |
203 | 102 | 07/07/2020 | 400 |
204 | 103 | 08/08/2020 | 500 |
205 | 104 | 09/09/2020 | 600 |
This SQL query joins the table and the table using the field which is common to both tables. The function is used to compute the average total amount of the orders for each customer. The clause is used to group the results by the customer's id, first name and last name. Therefore, the output of the query will show the average spending of each customer.
Because joins come up so often during SQL interviews, try this interactive Snapchat Join SQL question:
Adobe wants to analyze how popular each product is and what its average rating is. The popularity is calculated by taking the total product reviews and applying the square root, then rounding to the nearest whole number.
They want to rank products based on popularity and average rating. An important condition is that products with less than 10 reviews should be excluded from the ranking, but their average rating still needs to be calculated.
Assuming you have data structured as shown, write a SQL query that returns a table with columns for , , and .
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 06/08/2022 00:00:00 | 50001 | 4 |
7802 | 265 | 06/10/2022 00:00:00 | 69852 | 4 |
5293 | 362 | 06/18/2022 00:00:00 | 50001 | 3 |
6352 | 192 | 07/26/2022 00:00:00 | 69852 | 3 |
5455 | 891 | 07/12/2022 00:00:00 | 50001 | 5 |
5831 | 654 | 08/23/2022 00:00:00 | 69852 | 2 |
6661 | 892 | 06/21/2022 00:00:00 | 50001 | 1 |
This SQL query first groups the reviews data by and calculates the and for each product. For the , it first counts the number of reviews for each product, takes the square root of that, and then rounds it to the nearest whole number. It applies a condition to only calculate the if there are at least 10 reviews. It calculates the by taking the average of and rounding it to 2 decimal places. Lastly, the results are ordered by (from highest to lowest) and then by (from highest to lowest).
To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for calculating average ratings or this Amazon Highest-Grossing Items Question which is similar for ranking products.
p.s. in case this question was tricky, go re-learn SQL with this SQL tutorial for Data Analytics which has 30+ lessons including one on SQL math functions.
The best way to prepare for a Adobe SQL interview is to practice, practice, practice. Besides solving the earlier Adobe SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Facebook, Google and unicorn tech startups.
Each interview question has hints to guide you, fully explained answers along with a discussion board to see how others solved it and best of all, there's an interactive coding environment so you can right in the browser run your SQL query and have it executed.
To prep for the Adobe SQL interview it is also helpful to practice SQL problems from other tech companies like:
However, if your SQL foundations are weak, don't worry about jumping right into solving questions – improve your SQL foundations with this DataLemur SQL Tutorial.
This tutorial covers SQL topics like RANK vs. DENSE RANK and handling timestamps – both of these come up often during Adobe SQL interviews.
For the Adobe Data Science Interview, beyond writing SQL queries, the other types of questions to prepare for are:
To prepare for Adobe Data Science interviews read the book Ace the Data Science Interview because it's got: