At Etsy, SQL is used day-to-day for analyzing customer shopping patterns, and managing inventory data to track popular products and forecast sales trends. Because of this, Etsy almost always asks SQL coding questions during interviews for Data Analyst, Data Science, and BI jobs.
As such, to help you prepare for the Etsy SQL interview, we've curated 8 Etsy SQL interview questions – able to answer them all?
You are given the two tables containing information on Etsy’s user signups and purchases. Write a query to obtain the percentage of users who signed up and made a purchase within 7 days of signing up. The result should be rounded to the nearest 2 decimal places.
signups Table:| Column Name | Type |
|---|---|
| user_id | integer |
| signup_date | datetime |
signups Example Input:| user_id | signup_date |
|---|---|
| 445 | 06/21/2022 12:00:00 |
| 742 | 06/19/2022 12:00:00 |
| 648 | 06/24/2022 12:00:00 |
| 789 | 06/27/2022 12:00:00 |
| 123 | 06/27/2022 12:00:00 |
user_purchases Table:| Column Name | Type |
|---|---|
| user_id | integer |
| product_id | integer |
| purchase_amount | decimal |
| purchase_date | datetime |
user_purchases Example Input:| user_id | product_id | purchase_amount | purchase_date |
|---|---|---|---|
| 244 | 7575 | 45.00 | 06/22/2022 12:00:00 |
| 742 | 1241 | 50.00 | 06/28/2022 12:00:00 |
| 648 | 3632 | 55.50 | 06/25/2022 12:00:00 |
| 123 | 8475 | 67.30 | 06/29/2022 12:00:00 |
| 244 | 2341 | 74.10 | 06/30/2022 12:00:00 |
| single_purchase_pct |
|---|
| 40.00 |
This is the same question as problem #29 in the SQL Chapter of Ace the Data Science Interview!
The PostgreSQL query to solve this problem is:
SELECT ROUND( 100.0 * COUNT(DISTINCT purchases.user_id) / COUNT(DISTINCT signups.user_id), 2) AS same_week_purchases_pct FROM signups LEFT JOIN user_purchases AS purchases ON signups.user_id = purchases.user_id WHERE purchases.purchase_date IS NULL OR (purchases.purchase_date BETWEEN signups.signup_date AND (signups.signup_date + '7 days'::INTERVAL));
To practice this question on DataLemur's free interactive SQL code editor, solve this Etsy SQL question asked in a BI Engineer interview:
This is the same question as problem #9 in the SQL Chapter of Ace the Data Science Interview!
Assume you're given a table containing Etsy user transactions. Write a query that retrieves the customers whose first transaction was valued at $50 or more. Output the total number of users who meet this criteria.
user_transactions Table:| Column Name | Type |
|---|---|
| transaction_id | integer |
| user_id | integer |
| spend | decimal |
| transaction_date | timestamp |
user_transactions Example Input:| transaction_id | user_id | spend | transaction_date |
|---|---|---|---|
| 759274 | 111 | 49.50 | 02/03/2022 00:00:00 |
| 850371 | 111 | 51.00 | 03/15/2022 00:00:00 |
| 615348 | 145 | 36.30 | 03/22/2022 00:00:00 |
| 137424 | 156 | 151.00 | 04/04/2022 00:00:00 |
| 248475 | 156 | 87.00 | 04/16/2022 00:00:00 |
| users |
|---|
| 1 |
WITH ranked_purcases_cte AS ( SELECT user_id, spend, RANK() OVER ( PARTITION BY user_id ORDER BY transaction_date ASC) AS ranking FROM user_transactions) SELECT COUNT(DISTINCT user_id) AS users FROM ranked_purcases_cte WHERE ranking = 1 AND spend >= 50;
Try this question on DataLemur's interactive coding environment, and get access to hints and more!
Constraints are used to specify the rules concerning data in the table. It can be applied for single or multiple fields in an SQL table during the creation of the table or after creating using the ALTER TABLE command. The constraints are:
Say you were storing sales analytyics data from Etsy's CRM inside a database. Here's some example constraints you could use:
PRIMARY KEY constraint: You might use a PRIMARY KEY constraint to ensure that each record in the database has a unique identifier. For example, you could use the "opportunity_id" field as the primary key in the "opportunities" table.
FOREIGN KEY constraint: You might use a FOREIGN KEY constraint to link the data in one table to the data in another table. For example, you could use a foreign key field in the "opportunities" table to reference the "account_id" field in the "accounts" table.
NOT NULL constraint: You might use a NOT NULL constraint to ensure that a field cannot contain a NULL value. For example, you could use a NOT NULL constraint on the "opportunity_name" field in the "opportunities" table to ensure that each opportunity has a name.
UNIQUE constraint: You might use a UNIQUE constraint to ensure that the data in a field is unique across the entire table. For example, you could use a UNIQUE constraint on the "email" field in the "contacts" table to ensure that each contact has a unique email address.
CHECK constraint: You might use a CHECK constraint to ensure that the data in a field meets certain conditions. For example, you could use a CHECK constraint to ensure that the "deal_probability" field in the "opportunities" table is a value between 0 and 100.
DEFAULT constraint: You might use a DEFAULT constraint to specify a default value for a field. For example, you could use a DEFAULT constraint on the "stage" field in the "opportunities" table to set the default value to "prospecting"
Consider you have access to two tables in Etsy's database, Users and Reviews. The Users table contains information about each unique customer, including first and last names, along with a unique user_id. The Reviews table includes customer reviews, with a review_id, user_id (matching that in the Users table), submission date (submit_date), product_id, and stars (rating)—they reflect the review left by a customer for a particular product.
Write a SQL query to calculate the average rating (stars) for each product on a monthly basis.
Users Example Input:| user_id | first_name | last_name |
|---|---|---|
| 123 | Sarah | Smith |
| 265 | John | Doe |
| 362 | Lisa | Nguyen |
| 192 | Mohammed | Ali |
| 981 | Samantha | Brown |
Reviews Example Input:| review_id | user_id | submit_date | product_id | stars |
|---|---|---|---|---|
| 6171 | 123 | 2022-06-08 | 50001 | 4 |
| 7802 | 265 | 2022-06-10 | 69852 | 4 |
| 5293 | 362 | 2022-06-18 | 50001 | 3 |
| 6352 | 192 | 2022-07-26 | 69852 | 3 |
| 4517 | 981 | 2022-07-05 | 69852 | 2 |
| mth | product | avg_stars |
|---|---|---|
| 6 | 50001 | 3.50 |
| 6 | 69852 | 4.00 |
| 7 | 69852 | 2.50 |
SELECT EXTRACT(MONTH FROM submit_date) AS mth, product_id AS product, AVG(stars) AS avg_stars FROM Reviews GROUP BY mth, product ORDER BY mth, avg_stars DESC;
The query groups Reviews by month and product, then calculates the average rating for each grouped category. The output is sorted by month and then by average rating in descending order.
Since joins come up so often during SQL interviews, try an interactive Spotify JOIN SQL question:
UNIQUE SQL constraint?A UNIQUE constraint ensures that all values in a column are different. It is often used in conjunction with other constraints, such as NOT NULL, to ensure that the data meets certain conditions.
For example, if you had Etsy employee data stored in a database, here's some constraints you'd use:
CREATE TABLE etsy_employees ( employee_id INTEGER PRIMARY KEY, first_name VARCHAR(255) NOT NULL, last_name VARCHAR(255) NOT NULL, email VARCHAR(255) NOT NULL UNIQUE, department VARCHAR(255) NOT NULL );
In the Etsy employee example, the UNIQUE constraint is applied to the "email" field to ensure that each employee has a unique email address. This helps to ensure the integrity of the data in the database and prevents errors that could occur if two employees had the same email address.
Given a table of product sales data, write a SQL query that calculates the modified sales, rounded total sales, and square root of the monthly sales for each product, where modified sales are defined as the absolute difference between the total sales and 25% of the sales. The query should use the ABS(), ROUND(), and SQRT() functions.
sales Example Input:| id | product_id | sale_date | sales |
|---|---|---|---|
| 1 | 100 | 2022-06-01 | 25 |
| 2 | 101 | 2022-06-02 | 30 |
| 3 | 100 | 2022-06-03 | 45 |
| 4 | 101 | 2022-07-10 | 55 |
| 5 | 100 | 2022-07-15 | 60 |
SELECT product_id, TO_CHAR(sale_date, 'YYYY-MM') as month_year, ABS(SUM(sales) - 0.25 * SUM(sales)) as modified_sales, ROUND(SUM(sales)) as total_sales, SQRT(SUM(sales)) as sqrt_sales FROM sales GROUP BY product_id, month_year;
In this query, we first group the data by product_id and month_year, then for each group, we use the SQL functions as the following:
We use ABS() to calculate the absolute difference between the total sales (SUM(sales)) and 25% of the total sales (0.25 * SUM(sales)), defining the modified sales.
We use ROUND() to round the total sales (SUM(sales)).
We use SQRT() to calculate the square root of the total sales (SUM(sales)).
This gives us per-product and per-month the modified, rounded, and square-rooted sales figures.
In SQL, a join is used to combine rows from different tables based on a shared key or set of keys, resulting in a single merged table of data.
There are four distinct types of JOINs: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
(INNER) JOIN: Retrieves records that have matching values in both tables involved in the join.
SELECT * FROM Table_A JOIN Table_B; SELECT * FROM Table_A INNER JOIN Table_B;
LEFT (OUTER) JOIN: Retrieves all the records/rows from the left and the matched records/rows from the right table.
SELECT * FROM Table_A A LEFT JOIN Table_B B ON A.col = B.col;
RIGHT (OUTER) JOIN: Retrieves all the records/rows from the right and the matched records/rows from the left table.
SELECT * FROM Table_A A RIGHT JOIN Table_B B ON A.col = B.col;
FULL (OUTER) JOIN: Retrieves all the records where there is a match in either the left or right table.
SELECT * FROM Table_A A FULL JOIN Table_B B ON A.col = B.col;
Etsy, an e-commerce website focused on handmade or vintage items and craft supplies, maintains a rating system where users review and rate the products they purchase. In the reviews table, each row signifies a product review. This table includes columns like review_id, user_id, submit_date, product_id, and stars (out of 5).
The submit_date column is of TIMESTAMP type, which means it not only contains the date, but also the exact time a review was submitted.
You are asked to write a SQL query to calculate the average star rating for each product per month. Output should contain the month, product_id, and average rating.
reviews Example Input:| review_id | user_id | submit_date | product_id | stars |
|---|---|---|---|---|
| 6171 | 123 | 2022-06-08 00:00:00 | 50001 | 4 |
| 7802 | 265 | 2022-06-10 00:00:00 | 69852 | 4 |
| 5293 | 362 | 2022-06-18 00:00:00 | 50001 | 3 |
| 6352 | 192 | 2022-07-26 00:00:00 | 69852 | 3 |
| 4517 | 981 | 2022-07-05 00:00:00 | 69852 | 2 |
You can extract the month from the submit_date timestamp using the EXTRACT function in PostgreSQL. Then you can group the records by month and product to calculate the average rating. Here is the query:
SELECT EXTRACT(MONTH FROM submit_date) AS mth, product_id AS product, ROUND(AVG(stars), 2) AS avg_stars FROM reviews GROUP BY mth, product ORDER BY mth, product;
This SQL query will solve the problem by first segregating the data both by product ID and the month the review was submitted. It averages the star ratings of each group and rounds it to two decimal places. The result list is then ordered first by month (mth) and then by product to ensure a tidy output.
| mth | product | avg_stars |
|---|---|---|
| 6 | 50001 | 3.50 |
| 6 | 69852 | 4.00 |
| 7 | 69852 | 2.50 |
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Etsy SQL interview is to solve as many practice SQL interview questions as you can!
In addition to solving the earlier Etsy SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Google, Uber, and Microsoft.
Each interview question has hints to guide you, full answers and most importantly, there's an online SQL code editor so you can right in the browser run your SQL query and have it executed.
To prep for the Etsy SQL interview you can also be wise to practice SQL questions from other tech companies like:
But if your SQL skills are weak, forget about diving straight into solving questions – improve your SQL foundations with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers SQL concepts such as WHERE vs. HAVING and aggregate functions – both of these show up routinely in SQL interviews at Etsy.
In addition to SQL interview questions, the other types of problems to practice for the Etsy Data Science Interview are:
The best way to prepare for Etsy Data Science interviews is by reading Ace the Data Science Interview. The book's got: