PVH Corp employees use SQL daily for trend prediction, and managing customer databases for targeted marketing campaigns. For this reason PVH Corp often tests SQL coding questions in interviews for Data Analyst, Data Science, and BI jobs.
Thus, to help you practice, here’s 9 PVH Corp SQL interview questions – able to answer them all?
PVH Corp. is a global apparel company, owning brands like Tommy Hilfiger, Calvin Klein, and many others. A common query might involve analyzing the monthly sales of their different products over a period of time. You are given a dataset containing the sales transactions. Write a SQL query to calculate the total sales per month for each product using window functions. Assume that every transaction in the dataset is completed.
Suppose you have the following transaction
table:
transaction
Example Input:transaction_id | product | transaction_date | amount |
---|---|---|---|
101 | Calvin Klein Shirt | 05/01/2022 | 60 |
102 | Calvin Klein Jeans | 05/01/2022 | 100 |
103 | Tommy Hilfiger Shirt | 05/02/2022 | 50 |
104 | Tommy Hilfiger Jeans | 05/03/2022 | 100 |
105 | Calvin Klein Shirt | 06/04/2022 | 60 |
106 | Tommy Hilfiger Shirt | 06/11/2022 | 50 |
107 | Tommy Hilfiger Jeans | 06/12/2022 | 100 |
108 | Calvin Klein Jeans | 07/01/2022 | 100 |
The output should return the total monthly sales for each product.
SELECT product, EXTRACT(MONTH FROM transaction_date) AS month, SUM(amount) OVER ( PARTITION BY product, EXTRACT(MONTH FROM transaction_date) ORDER BY transaction_date ) as total_sales_monthly FROM transaction
This query partitions the data by product and month, then sums up the sales amount from the beginning of each partition. This provides a rolling sum of sales for each product in each month. The window specification ORDER BY transaction_date
is necessary to perform the cumulative sum in the correct order. EXTRACT is used to get the month part of the transaction_date.
product | month | total_sales_monthly |
---|---|---|
Calvin Klein Shirt | 5 | 60 |
Calvin Klein Shirt | 6 | 60 |
Calvin Klein Jeans | 5 | 100 |
Calvin Klein Jeans | 7 | 100 |
Tommy Hilfiger Shirt | 5 | 50 |
Tommy Hilfiger Shirt | 6 | 50 |
Tommy Hilfiger Jeans | 5 | 100 |
Tommy Hilfiger Jeans | 6 | 100 |
Pro Tip: Window functions are a popular SQL interview topic, so practice all the window function problems on DataLemur
Assume there was a table of PVH Corp employee salary data. Write a SQL query to find all employees who earn more than their direct manager.
employees
Example Input:employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager, William Davis who earns 7,800.
Test your SQL query for this problem interactively on DataLemur:
First, we perform a SELF-JOIN where we treat the first employee
table (mgr
) as the managers' table and the second employee
table (emp
) as the employees' table. Then we use a WHERE
clause to filter the results, ensuring we only get employees whose salaries are higher than their manager's salary.
SELECT emp.employee_id AS employee_id, emp.name AS employee_name FROM employee AS mgr INNER JOIN employee AS emp ON mgr.employee_id = emp.manager_id WHERE emp.salary > mgr.salary;
If the solution above is confusing, you can find a step-by-step solution here: Employees Earning More Than Managers.
A foreign key is a column or group of columns in a table that refers to the primary key in another table. The foreign key constraint helps maintain referential integrity between the two tables.
For example, let's look at the PVH Corp sales database:
pvh_corp_sales:
+------------+------------+------------+------------+ | order_id | product_id | customer_id| quantity | +------------+------------+------------+------------+ | 1 | 222 | 1 | 2 | | 2 | 333 | 1 | 1 | | 3 | 444 | 2 | 3 | | 4 | 555 | 3 | 1 | +------------+------------+------------+------------+
In this table, product_id
and customer_id
could both be foreign keys. They reference the primary keys of other tables, such as a Products table and a Customers table, respectively. This establishes a relationship between the pvh_corp_sales
table and the other tables, such that each row in the sales database corresponds to a specific product and a specific customer.
"
PVH Corp is a global company that owns brands like Tommy Hilfiger, Calvin Klein, Van Heusen, IZOD, and Arrow. As an SQL developer at PVH Corp, you are given a task to write a query that returns the average price of all sold items per brand during the year 2021. Your result should consist of each brand and its corresponding average price in descending order.
Depending on the structure of the database, we'll assume there are two main tables sales
and brands
.
sales
Example Inputsale_id | item_id | brand_id | sale_date | sale_price |
---|---|---|---|---|
1001 | 10 | 1 | 01/04/2021 | 50.00 |
1002 | 20 | 2 | 02/05/2021 | 150.00 |
1003 | 30 | 1 | 03/14/2021 | 70.00 |
1004 | 40 | 2 | 04/18/2021 | 180.00 |
1005 | 50 | 1 | 05/25/2021 | 60.00 |
brands
Tablebrand_id | brand_name |
---|---|
1 | Tommy Hilfiger |
2 | Calvin Klein |
Your output should be:
brand_name | avg_sale_price |
---|---|
Calvin Klein | 165.00 |
Tommy Hilfiger | 60.00 |
SELECT b.brand_name, AVG(s.sale_price) AS avg_sale_price FROM sales s JOIN brands b ON s.brand_id = b.brand_id WHERE EXTRACT(YEAR FROM s.sale_date) = 2021 GROUP BY b.brand_name ORDER BY avg_sale_price DESC;
This query joins the sales
and brands
tables on the brand_id
column to allow for brand names to be included in the output. It filters the sales data to include only the sales from the year 2021 using the WHERE clause with the EXTRACT(YEAR FROM s.sale_date) = 2021
condition. Then, for each brand it uses the AVG function to calculate the average sale price. Finally, results are ordered in descending order by the average sale price.
To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for calculating metrics for specific categories in descending order or this Wayfair Y-on-Y Growth Rate Question which is similar for calculating yearly aggregation values.
A DBMS (database management system), in order to ensure transactions are relaible and don't ruin the integrity of the data, tries to mantain the following ACID properties: Atomicity, Consistency, Isolation, and Durability.
To make this concept more concrete, here is what each of the ACID properties would mean in the context of banking transactions:
PVH Corp., as a global leader in various segments of the apparel industry, makes extensive use of digital marketing. They are analysing the effectiveness of these digital marketing advertisements. In particular, they want to understand the click-through rate that is brought by these ads.
For a given week, the marketing team gathers data on the unique views of an ad (how many unique users saw the ad), and how many of those views resulted in a click (the user clicked on the ad, leading them to the PVH Corp. product website). They want a SQL query to easily determine the click-through rate, calculated as (number of unique clicks / number of unique views) * 100% for each ad.
ad_events
Sample dataad_id | event | user_id | event_time |
---|---|---|---|
100 | view | 123 | 07/25/2022 14:00:00 |
100 | click | 123 | 07/25/2022 14:01:00 |
100 | view | 456 | 07/25/2022 14:05:00 |
200 | view | 789 | 07/25/2022 15:00:00 |
200 | click | 789 | 07/25/2022 15:01:00 |
200 | view | 321 | 07/25/2022 15:05:00 |
200 | view | 321 | 07/25/2022 15:06:00 |
200 | click | 321 | 07/25/2022 15:07:00 |
SELECT ad_id, ROUND(((COUNT(DISTINCT CASE WHEN event='click' THEN user_id END) * 1.0) / COUNT(DISTINCT CASE WHEN event='view' THEN user_id END)) * 100, 2) AS click_through_rate FROM ad_events GROUP BY ad_id;
This query first counts the unique user IDs for each ad where the event was 'click' and 'view'. Then, it calculates the click-through rate by dividing the count of unique 'click' user IDs by the unique 'view' user IDs and multiplying it by 100 to turn it into a percentage. The ROUND function is used to limit the result to two decimal places. The results are grouped by ad_id to provide click-through rates for each ad.
To practice a similar SQL problem on DataLemur's free interactive coding environment, try this SQL interview question asked by Facebook:
A cross join is a type of JOIN operation in SQL that creates a new table by combining each row from the first table with every row from the second table. It is also known as a cartesian join.
For example, say you worked on the Marketing Analytics team at PVH Corp, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).
Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:
SELECT ad_copy.copy AS ad_copy, ad_creative.image_url AS ad_creative FROM ad_copy CROSS JOIN ad_creative;
A natural join, on the other hand, is a type of JOIN that combines rows from two or more tables based on their common columns. It is called a "natural" join because it is based on the natural relationship that exists between the common columns in the tables being joined.
For an example of each one, say you had sales data exported from PVH Corp's Salesforce CRM stored in a datawarehouse which had two tables: sales
and pvh_corp_customers
.
An INNER JOIN
(which is a type of natural join) combines the two tables on the common customer_id field
SELECT * FROM sales INNER JOIN pvh_corp_customers ON sales.customer_id = pvh_corp_customers.id
This query will return rows from the sales and pvh_corp_customers
tables that have matching customer id values. Only rows with matching customer_id
values will be included in the results.
One main difference between cross joins and natural joins is that cross joins do not require any common columns between the tables being joined, while natural joins do. Another difference is that cross joins can create very large tables if the input tables have a large number of rows, while natural joins will only return a table with the number of rows equal to the number of matching rows in the input tables.
PVH Corp, a leading fashion and lifestyle company, uses a customers
database to store its customer information. They want to keep track of their customer base in New York City for a marketing campaign. Could you write down an SQL query to find all customers whose city
is 'New York City' and address
contains the word 'Street'?
Here's some sample customer data for the context:
customers
Example Input:customer_id | first_name | last_name | address | city | state | |
---|---|---|---|---|---|---|
101 | John | Doe | 1234 Broadway St | New York City | NY | john.doe@example.com |
102 | Jane | Smith | 5678 Main Street | Los Angeles | CA | jane.smith@example.com |
103 | Robert | Johnson | 91011 Wall Street | New York City | NY | robert.johnson@example.com |
104 | Emily | Miller | 1213 Court Street | Chicago | IL | emily.miller@example.com |
105 | Alexandra | Brown | 1415 Carew Street | Fort Wayne | IN | alexandra.brown@example.com |
Here's the SQL query to solve the problem:
SELECT * FROM customers WHERE city = 'New York City' AND address LIKE '%Street%';
I used a combination of exact match condition with WHERE city = 'New York City'
clause and a LIKE clause WHERE address LIKE '%Street%'
. LIKE clause combined with '%Street%' will match any customers whose address contains 'Street'. These customers will be from 'New York City' and whose addresses contain the word 'Street'.
As a data analyst at PVH Corp, you are tasked with analyzing the company's customer purchase history. Your objective is to create a SQL query that retrieves the list of all the customers along with their respective total spending and the most expensive product they have bought. To accomplish this, you need to join two tables: customers
table and products
table.
The customers
table has the following fields:
customer_id
: The unique ID assigned to each customer.customer_name
: The name of the customer.purchase_id
: The IDs of the purchases made by the customer.product_id
: The IDs of the products bought in each purchase.customer_id | customer_name | purchase_id | product_id |
---|---|---|---|
1 | John Doe | 101 | 201 |
2 | Jane Smith | 102 | 202 |
1 | John Doe | 103 | 203 |
3 | Alice Johnson | 104 | 204 |
2 | Jane Smith | 105 | 205 |
The products
table has the following fields:
product_id
: The unique ID assigned to each product.product_name
: The name of the product.price
: The price of the product.product_id | product_name | price |
---|---|---|
201 | T-shirt | $20 |
202 | Jeans | $50 |
203 | Shirt | $30 |
204 | Skirt | $40 |
205 | Jacket | $100 |
You'd use a SQL query such as the below to solve this:
SELECT c.customer_name, SUM(p.price) AS total_spending, MAX(p.price) AS most_expensive_product FROM customers c JOIN products p ON c.product_id = p.product_id GROUP BY c.customer_id, c.customer_name ORDER BY total_spending DESC;
This query works by joining the customers
and products
tables based on the product_id
field. It then groups the records by customer_id
and customer_name
. For each group (which represents a single customer), it sums up the price of the products they bought to find the total spending. It also finds the maximum product price within the group, which gives us the price of the most expensive product they bought. Finally, it orders the results by total spending in descending order.
Since joins come up routinely during SQL interviews, take a stab at this Snapchat SQL Interview question using JOINS:
Also read their "We are PVH" Statement to help you understand the brand even better!
The key to acing a PVH Corp SQL interview is to practice, practice, and then practice some more!
In addition to solving the above PVH Corp SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like Microsoft, Google, Amazon, and tech startups.
Each SQL question has multiple hints, step-by-step solutions and crucially, there's an interactive SQL code editor so you can right in the browser run your SQL query answer and have it executed.
To prep for the PVH Corp SQL interview you can also be useful to practice SQL questions from other apparel companies like:
In case your SQL skills are weak, don't worry about going right into solving questions – go learn SQL with this free SQL tutorial.
This tutorial covers things like creating summary stats with GROUP BY and LAG window function – both of which come up routinely in PVH Corp SQL interviews.
In addition to SQL interview questions, the other topics to practice for the PVH Corp Data Science Interview include:
To prepare for PVH Corp Data Science interviews read the book Ace the Data Science Interview because it's got: