At ZOZO, SQL is often used for analyzing customer shopping habits and trends, and managing vast databases of product and user data for targeted marketing efforts. For this reason ZOZO almost always evaluates jobseekers on SQL problems during interviews for Data Analytics, Data Science, and Data Engineering jobs.
To help you prepare for the ZOZO SQL interview, we've curated 10 ZOZO SQL interview questions – able to answer them all?
At ZOZO, our fashion business views high-value or "VIP" customers as those who spend a large amount on our produced goods and do so consistently over time. As a data analyst, your task is to write a SQL query to identify these VIP customers from the past year.
Your information sources are two tables 'orders' and 'users', defined as follows:
Example Input:
order_id | user_id | purchase_date | product_id | order_value |
---|---|---|---|---|
1035 | 45 | 01/25/2021 | 7500 | 300 |
2072 | 112 | 02/14/2021 | 8900 | 120 |
3028 | 45 | 02/15/2021 | 7650 | 450 |
4556 | 78 | 03/19/2021 | 7800 | 100 |
5078 | 23 | 04/12/2021 | 7980 | 280 |
Example Input:
user_id | signup_date | location | age |
---|---|---|---|
45 | 01/08/2018 | Tokyo | 37 |
112 | 03/22/2019 | Osaka | 24 |
78 | 02/05/2020 | Nagoya | 30 |
23 | 04/11/2021 | Kyoto | 45 |
Based on the order value, a VIP customer is defined as a user who has spent more than 2000 in total over the past year (2021).
This query joins the 'orders' and 'users' tables on the common column 'user_id', filters out orders beyond the last year, then groups results by 'user_id'. The HAVING clause is used to only include users whose total order value exceeds 2000. The resulting table will include user_id, location, age, and the total amount spent by each VIP user.
To work on another SQL customer analytics question where you can solve it interactively and have your SQL query instantly graded, try this Walmart Labs SQL Interview Question:
For ZOZO, a popular online fashion retailer, we have a table that catalogues user reviews for its various products. Our employer wants to analyze how each of product's rating has changed over time on a month-to-month basis.
To achieve this, you are asked to write a SQL query to calculate the average rating for each product per month.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-06-08 | 50001 | 4 |
7802 | 265 | 2022-06-10 | 69852 | 4 |
5293 | 362 | 2022-06-18 | 50001 | 3 |
6352 | 192 | 2022-07-26 | 69852 | 3 |
4517 | 981 | 2022-07-05 | 69852 | 2 |
Expected Output:
mth | product_id | avg_stars |
---|---|---|
6 | 50001 | 3.50 |
6 | 69852 | 4.00 |
7 | 69852 | 2.50 |
Use PostgreSQL's to extract the month part from the date, and then the function to get the average stars.
This SQL query first splits the data from the table into groupings based on the month of the and the . It then calculates the average () number of in each of these groupings, rounding the results to two decimal places. The clause then orders the results first by month (), and then by .
To solve a related window function SQL problem on DataLemur's free interactive SQL code editor, solve this Amazon SQL Interview Question:
A is a field in a table that references the of another table. It creates a link between the two tables and ensures that the data in the field is valid.
Say for example you had sales analytics data from ZOZO's CRM (customer-relationship management) tool.
The FOREIGN KEY constraint ensures that the data in the field of the "opportunities" table is valid, and prevents the insertion of rows in the table that do not have corresponding entries in the table. It also helps to enforce the relationship between the two tables and can be used to ensure that data is not deleted from the accounts table if there are still references to it in the opportunities` table.
ZOZO is an international online fashion retailer. They sell a variety of items like clothing, footwear, accessories, and more. The company would like to understand customer purchasing behavior to improve their product offerings and customer experience. As such, they need to track information about customers, products, and orders.
Design a database schema for ZOZO's e-commerce platform consisting of three tables:
Finally, write a SQL query that provides the total revenue generated from each category for the year 2022.
customer_id | name | |
---|---|---|
1 | John Doe | john.doe@email.com |
2 | Jane Smith | jane.smith@email.com |
3 | Emily Johnson | emily.johnson@email.com |
product_id | name | price | category |
---|---|---|---|
1 | Blue Jeans | 50 | Clothing |
2 | Black Boots | 100 | Footwear |
3 | Leather Jacket | 200 | Clothing |
order_id | customer_id | product_id | quantity | order_date |
---|---|---|---|---|
1 | 1 | 1 | 2 | 2022-01-10 |
2 | 2 | 2 | 1 | 2022-02-20 |
3 | 3 | 3 | 1 | 2022-03-30 |
4 | 1 | 3 | 1 | 2022-04-10 |
5 | 2 | 1 | 3 | 2022-05-20 |
This query first joins the table with the table on to bring all relevant product and order details into one table. Then, it filters out the orders made in the year 2022 using the condition. Finally, it groups the results by product category and calculates the total revenue generated from each category by multiplying the product price with the quantity ordered and summing up the results.
In SQL, zero's are numerical values which can be used in calculations and comparisons just like any other number. A blank space, also known as an empty string, is a character value and can be used in character manipulation functions and comparisons.
NULLs aren't the same as zero's or blank spaces. NULLs represent unkonwn, missing, or not applicable values. They are not included in calculations and comparisons involving NULL values always result in NULL.
The ZOZO company is an international clothing retailer with several distinct categories of products. Your task is to determine the average sales per each product category for the last year.
===== Example Input:
sale_id | date | category | product_id | price |
---|---|---|---|---|
9871 | 05/01/2022 | Dress | 1001 | 200 |
8762 | 06/10/2022 | Dress | 1002 | 150 |
6543 | 06/15/2022 | Pants | 2001 | 80 |
7212 | 07/09/2022 | Pants | 2002 | 95 |
4837 | 08/11/2022 | Shirt | 3001 | 50 |
This query will divide the total sales (prices of products sold) per category by the total number of sales, giving us the average sale per category for each month. The function is used to get the month from the date, and is a typical SQL function to get the average of a number of values. We then use to category to get separate results for each category.
A cross-join, also known as a cartesian join, is a type of join that returns the cross-product of two tables. In a cross-join, each row from the first table is paired with every row from the second table, resulting in a new table that has a row for each possible combination of rows from the two input tables.
For example, say you worked on the Marketing Analytics team at ZOZO, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).
Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:
You could this query to generate all possible combinations of ad copy and ad creative to help you create more effective ads for ZOZO. Just be careful: if you had 1,000 rows of ad copy and 1,000 different image creatives, the resulting cross-join would have 1 million rows!
As an IT analyst at ZOZO, one of your tasks is to manage the customer records database. You are particularly interested in customers who have used their 'zozo.com' email when registering with the company. Write a SQL query to find all customers whose email ends with 'zozo.com'.
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@zozo.com |
2 | Jane | Smith | jane.smith@gmail.com |
3 | Robert | Johnson | r.johnson@zozo.com |
4 | Sarah | Davis | s.davis@yahoo.com |
5 | James | Brown | jbrown@zozo.com |
In PostgreSQL, you would use the keyword to filter rows that match a specific pattern. In this case, your task is to find all records where the email ends with 'zozo.com'. Here's how you would write this query:
This query returns all columns for those rows in the 'customers' table where the 'email' ends with 'zozo.com'. The percent sign (%) is a wildcard character in SQL that matches any sequence of characters.
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@zozo.com |
3 | Robert | Johnson | r.johnson@zozo.com |
5 | James | Brown | jbrown@zozo.com |
The result shows three customers that used their 'zozo.com' email for registration. It can be used for further analysis or customer segmentation.
As a data analyst for the ZOZO company, we would like to analyze customer purchasing behavior to guide our future business decisions. Certain products are more popular among specific customer groups. Therefore, a crucial analysis is to see the number of each product brought by females and males.
The "customers" table stores basic personal information, including gender. The "orders" table stores purchasing information, including product id.
Use SQL queries in PostgreSQL to calculate the number of each product purchased by males and females. The output needs to show product_id, the number of males who bought it, and the number of females who bought it.
For the purposes of this question, consider we have the following example dataset:
customer_id | gender |
---|---|
123 | Male |
265 | Female |
362 | Male |
192 | Male |
981 | Female |
order_id | customer_id | product_id |
---|---|---|
501 | 123 | 36 |
502 | 265 | 24 |
503 | 362 | 36 |
504 | 192 | 24 |
505 | 981 | 36 |
This SQL query first performs a JOIN operation on the "orders" and "customers" tables using the "customer_id" field as a common key. It then uses two COUNT DISTINCT functions combined with CASE WHEN clauses to evaluate the gender of each customer. The result will be the count of different products purchased by males and females.
Because join questions come up routinely during SQL interviews, try an interactive SQL join question from Spotify:
A non-relational (NoSQL) database is any database that does not use the typical tabular format of rows and columns like in relational databases.
While knowing the four different types of NoSQL databases is beyond the scope of most Data Analyst & Data Science interviews, Data Engineers at ZOZO should vaguely refresh these concepts:
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the ZOZO SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the above ZOZO SQL interview questions, you should also solve the 200+ SQL Interview Questions on DataLemur which come from companies like Google, Facebook, Microsoft and Amazon.
Each SQL question has multiple hints, fully explained answers along with a discussion board to see how others solved it and most importantly, there's an interactive SQL code editor so you can instantly run your SQL query answer and have it graded.
To prep for the ZOZO SQL interview it is also helpful to solve SQL questions from other tech companies like:
In case your SQL foundations are weak, don't worry about going right into solving questions – improve your SQL foundations with this free SQL for Data Analytics course.
This tutorial covers topics including LEAD window function and sorting data with ORDER BY – both of which come up frequently in ZOZO SQL assessments.
Besides SQL interview questions, the other types of problems to prepare for the ZOZO Data Science Interview are:
To prepare for ZOZO Data Science interviews read the book Ace the Data Science Interview because it's got: