At Manhattan Associates, SQL is crucial for analyzing large supply chain data sets for optimization and for creating performance-scaled data models to support strategic decision-making in logistics. That's why Manhattan Associates asks SQL interview problems for Data Science, Data Engineering, and Data Analytics jobs.
As such, to help you ace the Manhattan Associates SQL interview, we've curated 11 Manhattan Associates SQL interview questions – can you solve them?
Manhattan Associates is keen on identifying their high-value customers in order to optimize their customer relationship efforts. "High-Value" customers, for the scope of this problem, are defined as customers who made purchases of total value over $100,000 in the last 12 months.
As an interviewee, write an SQL query that will return a list of all high-value customers, sorted by the total value of their purchases in descending order.
Consider the following schema for Manhattan Associates' database:
"customer" table:

customer_id | first_name | last_name |
---|---|---|
101 | John | Doe |
102 | Jane | Smith |
103 | Bob | Ross |
104 | Alice | Liu |
105 | Charlie | Brown |
"orders" table:

order_id | customer_id | purchase_date | total_value |
---|---|---|---|
1001 | 101 | 2021-06-15 00:00:00 | 15000 |
1002 | 101 | 2021-10-28 00:00:00 | 30000 |
1003 | 102 | 2022-01-02 00:00:00 | 45000 |
1004 | 103 | 2022-05-23 00:00:00 | 12000 |
1005 | 104 | 2022-08-01 00:00:00 | 80000 |
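One possible solution (a sketch, assuming the tables are named "customer" and "orders", as referenced in the explanation below):

```sql
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    SUM(o.total_value) AS total_spent
FROM customer AS c
JOIN orders AS o
    ON c.customer_id = o.customer_id
-- only purchases made in the last 12 months
WHERE o.purchase_date >= NOW() - INTERVAL '12 months'
GROUP BY c.customer_id, c.first_name, c.last_name
-- keep only "high-value" customers: over $100,000 spent
HAVING SUM(o.total_value) > 100000
ORDER BY total_spent DESC;
```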
This query first joins the "customer" and "orders" tables on the "customer_id" column. It then filters the result set to only include records where the purchase was made less than a year ago. Following this, it groups the result set by customer and calculates the total amount spent by each customer. The "HAVING" clause is used to filter the grouped result set to only show customers who spent more than $100,000. Finally, the result set is ordered by the "total_spent" column in descending order to show the highest spending customers first.
To practice a similar power-user data analysis problem on DataLemur's free interactive SQL code editor, try this Microsoft Teams Power User SQL Interview Question:
Manhattan Associates often requires analysis of its sales data. In the context of this question, we would like to analyze the average quantity sold per product on a monthly basis in the year 2021.

Consider the following dataset, housed in a hypothetical table, which contains individual sales records for Manhattan Associates. Each record details the "sale_id", the "product_id", the "quantity" sold in that transaction, and the "sale_date".
sale_id | product_id | sale_date | quantity |
---|---|---|---|
101 | P1001 | 2021-01-10 | 14 |
102 | P1002 | 2021-01-15 | 50 |
103 | P1001 | 2021-02-18 | 21 |
104 | P1003 | 2021-03-22 | 35 |
105 | P1002 | 2021-02-28 | 45 |
106 | P1001 | 2021-03-20 | 19 |
107 | P1003 | 2021-01-05 | 25 |
108 | P1001 | 2021-02-15 | 17 |
109 | P1002 | 2021-01-19 | 33 |
The SQL interview question would be: "Write a SQL query that returns the average monthly quantity sold for each product in the year 2021. The output should have columns for the month, product ID, and the average quantity sold for that product during that month."
Your output should look something like this:
month | product_id | avg_quantity |
---|---|---|
1 | P1001 | 14.00 |
1 | P1002 | 41.50 |
1 | P1003 | 25.00 |
2 | P1001 | 19.00 |
2 | P1002 | 45.00 |
3 | P1001 | 19.00 |
3 | P1003 | 35.00 |
Since we're using PostgreSQL, you can make use of the EXTRACT() function to get the month from the "sale_date". Here's a sample SQL query to get the solution:
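(This sketch assumes the sales records live in a table named "sales".)

```sql
SELECT
    EXTRACT(MONTH FROM sale_date) AS month,
    product_id,
    ROUND(AVG(quantity), 2) AS avg_quantity
FROM sales  -- assumed table name
WHERE EXTRACT(YEAR FROM sale_date) = 2021
GROUP BY EXTRACT(MONTH FROM sale_date), product_id
ORDER BY month, product_id;
```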
This query works by first filtering down to sales records just for 2021. It then groups the results by "product_id" and the month of the sale. Within each group, it calculates the average quantity sold. Finally, it orders the result by month and then "product_id", matching the expected output above.
Pro Tip: Window functions are a popular SQL interview topic, so practice all the window function problems on DataLemur!
An index in a database is a data structure that helps to quickly find and access specific records in a table.
For example, if you had a database of Manhattan Associates customers, you could create a primary index on the "customer_id" column.

Having a primary index on the "customer_id" column can speed up performance in several ways. For example, if you want to retrieve a specific customer record based on their "customer_id", the database can use the primary index to quickly locate and retrieve the desired record. The primary index acts like a map, allowing the database to quickly find the location of the desired record without having to search through the entire table.

Additionally, a primary index can also be used to enforce the uniqueness of the "customer_id" column, ensuring that no duplicate values are inserted into the table. This can help to prevent errors and maintain the integrity of the data in the table.
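For instance, here's a minimal sketch in PostgreSQL (the table definition is hypothetical): declaring a primary key automatically creates a unique index that point lookups can use.

```sql
-- Hypothetical customers table: the PRIMARY KEY constraint makes
-- PostgreSQL build a unique index on customer_id automatically.
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT
);

-- This lookup can use the index instead of scanning the whole table:
SELECT * FROM customers WHERE customer_id = 104;
```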
Manhattan Associates specializes in supply chain and omnichannel commerce technology. Let's say, they want to identify customers who have purchased items worth more than $1000 in total, and they made these purchases either in their Manhattan office or through their e-commerce channel. Write an SQL query to filter and return these customers.
We have a table in the database which stores info on each sale made by the company.
sale_id | customer_id | purchase_date | purchase_value | purchase_channel |
---|---|---|---|---|
501 | 001 | 2022-06-18 | 560.00 | E-commerce |
502 | 002 | 2022-04-25 | 250.00 | Manhattan Office |
503 | 001 | 2022-05-11 | 475.00 | E-commerce |
504 | 003 | 2022-07-22 | 800.00 | Third-party Vendor |
505 | 002 | 2022-03-05 | 800.00 | Manhattan Office |
Your task is to write a query that returns the customers whose total purchases exceed $1,000, made either from the Manhattan Office or via E-commerce.
Expected output:

customer_id | total_purchase | purchase_channel |
---|---|---|
001 | 1035.00 | E-commerce |
002 | 1050.00 | Manhattan Office |
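One way to write this (a sketch, assuming the table is named "sales"):

```sql
SELECT
    customer_id,
    SUM(purchase_value) AS total_purchase,
    purchase_channel
FROM sales  -- assumed table name
WHERE purchase_channel IN ('E-commerce', 'Manhattan Office')
GROUP BY customer_id, purchase_channel
HAVING SUM(purchase_value) > 1000;
```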
This query filters the table for purchases made via 'E-commerce' or 'Manhattan Office' using the WHERE condition. It sums up the purchase values for each customer-channel combination via the GROUP BY clause, then filters these aggregates to those where the total purchase is more than $1,000 using the HAVING condition. In the output, we get the customer_id along with their total purchase amount and the channel they made these purchases from.
While both RANK() and DENSE_RANK() are used to rank rows, the key difference is in how they deal with ties.

RANK(): When there's a tie, RANK() leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to all three rows, and then skip to a rank of 5 for the next row.

DENSE_RANK(): For ties, DENSE_RANK() does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.

Suppose we had data on how many deals different salespeople at Manhattan Associates closed:
To rank these salespeople, we could execute the following query:
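(A sketch, assuming the data lives in a "salespeople" table with "name" and "deals_closed" columns.)

```sql
SELECT
    name,
    deals_closed,
    RANK()       OVER (ORDER BY deals_closed DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY deals_closed DESC) AS dense_rank
FROM salespeople;  -- assumed table name
```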
The result of this query would be:
name | deals_closed | rank | dense_rank |
---|---|---|---|
Akash | 50 | 1 | 1 |
Brittany | 50 | 1 | 1 |
Carlos | 40 | 3 | 2 |
Dave | 40 | 3 | 2 |
Eve | 30 | 5 | 3 |
Farhad | 10 | 6 | 4 |
As you can see, the RANK() function gives both Akash and Brittany a rank of 1 since they are tied, then leaves a gap and jumps to rank 3 for Carlos and Dave (also tied). In contrast, DENSE_RANK() also ranks the tied pair 1st, but moves straight on to rank 2 for Carlos with no gap.
Manhattan Associates, a technology company providing supply chain management software, amongst other services, would be interested in evaluating their customer service effectiveness. Could you provide the SQL query to find out the average time taken to resolve tickets submitted by customers to their support team?
Assume that we have the following table and columns:
ticket_id | submit_date | close_date | customer_id |
---|---|---|---|
8741 | 01/04/2022 00:00:00 | 03/04/2022 00:00:00 | 234 |
9422 | 01/04/2022 00:00:00 | 02/04/2022 00:00:00 | 654 |
7543 | 02/04/2022 00:00:00 | 02/04/2022 00:00:00 | 543 |
8629 | 02/04/2022 00:00:00 | 03/04/2022 00:00:00 | 738 |
Based on the provided input, an SQL query to find out the average resolution time can be as follows:
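(A sketch, assuming the table is named "tickets".)

```sql
SELECT
    AVG(EXTRACT(EPOCH FROM (close_date - submit_date)) / (60 * 60 * 24))
        AS avg_resolution_time_days
FROM tickets;  -- assumed table name
```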
This SQL query calculates the average resolution time of tickets in days. The EXTRACT(EPOCH FROM ...) expression gets the difference between the "close_date" and "submit_date" in seconds, which is then converted to days by dividing by (60 * 60 * 24). The AVG() function then calculates the average of these day values. If you want the resolution time in other units like hours or minutes, you can modify the divisor accordingly.
To practice a very similar question try this interactive Facebook Average Post Hiatus (Part 1) Question which is similar for calculating duration between dates or this Amazon Server Utilization Time Question which is similar for estimating total time metrics.
A cross-join, also referred to as a cartesian join, is a type of JOIN that returns the cross-product of two tables. In a cross-join, each row from the first table is paired with every row from the second table, resulting in a new table with a row for each possible combination of rows from the two input tables.
For example, say you worked on the Marketing Analytics team at Manhattan Associates, and were tasked to understand what advertising copy (text) goes best with what advertising creative (the photo that gets used along with the text copy).
Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:
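(The column names below are assumptions for illustration.)

```sql
SELECT
    ad_copy.copy_text,       -- assumed column: the advertising text
    ad_creative.image_url    -- assumed column: the advertising image
FROM ad_copy
CROSS JOIN ad_creative;
```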
You could use this query to generate all possible combinations of ad copy and ad creative to help you create more effective ads for Manhattan Associates. Just be careful: if you had 1,000 rows of ad copy and 1,000 different image creatives, the resulting cross-join would have 1 million rows! As a result, it's important to use cross-joins judiciously, and make sure that you're not generating more data than you need to.
You are working as a data analyst for Manhattan Associates, a multinational technology company that provides supply chain management software. One day, you are tasked to analyse and determine the click-through rates of their products.
Given 2 tables, one recording every time a user clicks on an advertisement for a product, and one recording every time a user subsequently adds that product to their cart, write a query to find the click-through conversion rate for each product.
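A sketch of one possible solution, assuming the two tables are named "ad_clicks" and "add_to_carts" (the actual names weren't specified above):

```sql
SELECT
    clicks.product_id,
    ROUND(carts.cart_count::DECIMAL / clicks.click_count, 2) AS click_through_rate
FROM (
    -- count of ad clicks per product
    SELECT product_id, COUNT(*) AS click_count
    FROM ad_clicks
    GROUP BY product_id
) AS clicks
JOIN (
    -- count of add-to-cart actions per product
    SELECT product_id, COUNT(*) AS cart_count
    FROM add_to_carts
    GROUP BY product_id
) AS carts
    ON clicks.product_id = carts.product_id;
```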
The above PostgreSQL query returns the click-through conversion rate, which is the ratio of the number of times a product is added to the cart to the number of times an advertisement for the product is clicked. This is achieved by joining two subqueries on product_id: the first subquery calculates the count of ad clicks on each product, and the second subquery calculates the count of add-to-cart actions on each product.
To solve a similar problem on DataLemur's free online SQL coding environment, solve this Facebook SQL Interview question:
You are asked to analyze a sales database of Manhattan Associates. You have two tables, "orders" and "products". The "orders" table contains orders placed by customers, including the "product_id" of the ordered product and the quantity ordered. The "products" table lists all the available products: their "product_id", "product_name", and the "vendor_id" of the vendor that supplies the product.

Write a SQL query that joins these two tables and calculates the total quantities of products ordered from a given vendor ("vendor_id" = 123).

The expected output should have the "vendor_id", the "product_name", and the total quantity ordered for each product from that vendor.
"orders" table:

order_id | product_id | quantity |
---|---|---|
9001 | 10001 | 10 |
9002 | 10002 | 15 |
9003 | 10001 | 20 |
9004 | 10003 | 10 |
9005 | 10003 | 5 |
"products" table:

product_id | product_name | vendor_id |
---|---|---|
10001 | Apples | 123 |
10002 | Bananas | 123 |
10003 | Cherries | 456 |
Expected output:

vendor_id | product_name | total_quantity |
---|---|---|
123 | Apples | 30 |
123 | Bananas | 15 |
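One way to write it (a sketch using the "orders" and "products" tables above):

```sql
SELECT
    p.vendor_id,
    p.product_name,
    SUM(o.quantity) AS total_quantity
FROM orders AS o
JOIN products AS p
    ON o.product_id = p.product_id
WHERE p.vendor_id = 123          -- the given vendor
GROUP BY p.vendor_id, p.product_name;
```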
In this query we join the "orders" and "products" tables on the "product_id" column. We filter the results to only include products from the vendor with "vendor_id" 123. Then we use the GROUP BY clause to group the results by "vendor_id" and "product_name", while summing the "quantity" column to find the total quantity of products ordered from that vendor.
Because joins come up frequently during SQL interviews, try this interactive Snapchat Join SQL question:
When using INTERSECT, only rows that are identical in both result sets will be returned.

For a concrete example, say you were a Data Analyst supporting the Sales Analytics team at Manhattan Associates, and data on potential sales leads lived in both Salesforce and Hubspot CRMs. To write a query to analyze leads created before 2023 started, that show up in BOTH CRMs, you would use the INTERSECT command:
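(A sketch; the table and column names are assumptions for illustration.)

```sql
SELECT lead_name, email
FROM salesforce_leads            -- assumed table name
WHERE created_date < '2023-01-01'

INTERSECT

SELECT lead_name, email
FROM hubspot_leads               -- assumed table name
WHERE created_date < '2023-01-01';
```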
The sales department of Manhattan Associates tracks the sales of each product on a quarterly basis and stores them in a database. You are asked to analyze the variance relative to the average sales for each category during the last year. Variance is often analyzed by looking at the squared difference from the mean (average). Create a SQL query that calculates the squared distance of each product's quarterly sales from the average sales of its category. For this, extract the category, product_id, quarter, actual sales, average category sales for the year, and the squared difference between actual and average sales.
Be sure to use the following math functions in your SQL query: AVG(), POWER(), ROUND().
product_id | category | quarter | sales |
---|---|---|---|
1001 | Electronics | 1 | 1300 |
1002 | Electronics | 1 | 1500 |
1003 | Furniture | 1 | 3200 |
1001 | Electronics | 2 | 1200 |
1002 | Electronics | 2 | 1100 |
1003 | Furniture | 2 | 2900 |
1001 | Electronics | 3 | 1400 |
1002 | Electronics | 3 | 1300 |
1003 | Furniture | 3 | 3100 |
1001 | Electronics | 4 | 1500 |
1002 | Electronics | 4 | 1700 |
1003 | Furniture | 4 | 2800 |
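A sketch of a solution, assuming the data lives in a table named "product_sales":

```sql
SELECT
    s.category,
    s.product_id,
    s.quarter,
    s.sales,
    ROUND(avg_cat.avg_sales, 2) AS avg_category_sales,
    -- squared distance from the category's yearly average
    ROUND(POWER(s.sales - avg_cat.avg_sales, 2), 2) AS squared_difference
FROM product_sales AS s
JOIN (
    -- average sales per category across the year
    SELECT category, AVG(sales) AS avg_sales
    FROM product_sales
    GROUP BY category
) AS avg_cat
    ON s.category = avg_cat.category
ORDER BY s.category, s.product_id, s.quarter;
```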
This query first calculates the average sales of each category across the year in a subquery. It then joins this back to the original table so that each row has the product's quarterly sales alongside the average sales of its category. The squared distance is calculated by taking the difference between the product's sales and the average sales, squaring it using the POWER() function, and rounding to 2 decimal places with ROUND().
Note: In the real world, you would also consider the number of products in each category and the number of quarters to keep this comparison fair.
To practice a very similar question try this interactive Wayfair Y-on-Y Growth Rate Question which is similar for handling sales data or this Amazon Highest-Grossing Items Question which is similar for analysing sales per product.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Manhattan Associates SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the above Manhattan Associates SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Google, Microsoft and Silicon Valley startups.
Each exercise has hints to guide you, detailed solutions and, crucially, there is an online SQL code editor so you can write your SQL query right in the browser and have it graded.

To prep for the Manhattan Associates SQL interview it can also be wise to practice interview questions from other tech companies like:
However, if your SQL query skills are weak, forget about going right into solving questions – improve your SQL foundations with this free SQL for Data Analytics course.
This tutorial covers SQL concepts such as filtering groups with HAVING and manipulating date/time data – both of which come up frequently during Manhattan Associates SQL assessments.
In addition to SQL query questions, the other question categories to practice for the Manhattan Associates Data Science Interview are:
The best way to prepare for Manhattan Associates Data Science interviews is by reading Ace the Data Science Interview. The book's got: