At Descartes Systems, SQL is used day-to-day for analyzing massive datasets to improve logistics efficiency and manage supply chain operations. So, it shouldn't surprise you that Descartes typically asks SQL questions during interviews for Data Science, Data Engineering and Data Analytics jobs.
So, to help you ace the Descartes SQL interview, this blog covers 8 Descartes Systems SQL interview questions – can you answer each one?
In Descartes, we have a table which has the customer reviews containing the information of the product reviewed, the review's ID, user's ID, the number of stars given from 1-5, and the date when the review was submitted.
Write a SQL query which will calculate the average number of stars each product received each month. Ranking should be based on the year and month of submit_date.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-01-12 | 50001 | 4 |
7802 | 265 | 2022-02-15 | 50001 | 5 |
5293 | 362 | 2022-02-22 | 50001 | 3 |
6352 | 192 | 2022-03-26 | 69852 | 3 |
4517 | 981 | 2022-03-5 | 69852 | 2 |
month | product_id | avg_stars |
---|---|---|
2022-01-01 | 50001 | 4.00 |
2022-02-01 | 50001 | 4.00 |
2022-03-01 | 69852 | 2.50 |
This SQL query calculates monthly average rating for each product. It first truncates the date by month, giving the first day of the month for each date. Then it forms groups by these truncated dates and product ids to calculate average stars. The output is ordered by month and product id.
Pro Tip: Window functions are a popular SQL interview topic, so practice every window function problem on DataLemur
Descartes is well-known for its logistics and supply chain solutions. One main function of the software is to track orders throughout the supply chain system. Your task is to model a database structure for this order system.
To adhere to the business process, the following entities are crucial: Customers (who places an order), Products (which is ordered) and Suppliers (who supply the Products). An Order is placed by a Customer for one or several Product(s) which are supplied by a Supplier(s).
Consider following attributes for the entities:
Orders are placed on a specific date by the customers and each order consists of different products with their quantities.
Create a database table with attributes: order_id, customer_id, order_date, product_id, quantity.
order_id | customer_id | order_date | product_id | quantity |
---|---|---|---|---|
1 | 101 | 2022-05-23 | 5001 | 2 |
2 | 102 | 2022-05-24 | 5002 | 1 |
3 | 103 | 2022-05-25 | 5001 | 3 |
4 | 101 | 2022-05-26 | 5003 | 1 |
5 | 104 | 2022-05-27 | 5002 | 4 |
Write a PostgreSQL query to retrieve the total quantity of each product ordered by each customer.
This query groups orders by customer and product to get the total quantity of each product ordered by each customer. By using the aggregate function on the column and grouping by and , we can sum up the total quantity for each product for each customer. The clause is then used to sort the results by and .
The / operator is used to remove to return all rows from the first SELECT statement that are not returned by the second SELECT statement.
Note that is available in PostgreSQL and SQL Server, while MINUS is available in MySQL and Oracle (but don't stress about knowing which DBMS supports what exact commands since the interviewers at Descartes should be lenient!).
For a tangible example in PostgreSQL, suppose you were doing an HR Analytics project for Descartes, and had access to Descartes's contractors and employees data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables. You could use operator to find all contractors who never were a employee using this query:
As a data analyst at Descartes, a software technology company that provides on-demand, software-as-a-service solutions, you are asked to study users' software interaction times. To evaluate the efficiency of the recently updated interface, you've been tasked to find the average time spent per user session.
Consider the following data:
session_id | user_id | start_time | end_time |
---|---|---|---|
101 | 123 | 06/10/2022 10:00:00 | 06/10/2022 10:30:00 |
102 | 123 | 06/10/2022 11:00:00 | 06/10/2022 11:45:00 |
103 | 456 | 06/12/2022 13:00:00 | 06/12/2022 13:15:00 |
104 | 456 | 06/12/2022 15:00:00 | 06/12/2022 15:35:00 |
105 | 123 | 06/13/2022 09:00:00 | 06/13/2022 09:45:00 |
In this query, we calculate the time spent per session by subtracting the start_time from the end_time. We then use the AVG function to find the average time spent per session for each user. The time difference between end_time and start_time in PostgreSQL yields an INTERVAL type, hence we use the EXTRACT function to convert this interval to seconds (EPOCH), then divide by 60 to get minutes. We round this value to two decimal places for a more readable format.
The 2 most similar questions are:
"Sending vs. Opening Snaps": This problem is similar as it involves calculating time spent on activities, like the task of finding the average time spent per session.
"Server Utilization Time": This problem also involves calculating total time for a particular activity. This reflects similar operations to find the average time spent per session.
Here is the markdown content:
To practice a very similar question try this interactive Snapchat Sending vs. Opening Snaps Question which is similar for calculating time spent on activities or this Amazon Server Utilization Time Question which is similar for calculating total time for a particular activity.
In SQL, a value is NOT the same as a zero or a blank space. A NULL value represents the absence of a value or a missing value, whereas both zero and blank space are actually values.
To handle nulls, the function is often used to check for a null, and if there is a null, you can assign a default value.
Descartes, a SaaS (Software as a Service) company has recently launched a marketing campaign. They are primarily interested in understanding the click-through rates (CTR) of their ads leading users to add products to a cart. Your task is to determine the CTR which is calculated as the number of ‘add to cart’ actions divided by the number of ad views, multiplied by 100 to get a percentage. Provide the result for each ad campaign.
view_id | user_id | view_date | ad_id |
---|---|---|---|
105 | 3123 | 06/08/2023 | 501 |
219 | 365 | 06/10/2023 | 502 |
332 | 836 | 06/18/2023 | 501 |
417 | 1923 | 07/26/2023 | 502 |
541 | 9813 | 07/05/2023 | 501 |
action_id | user_id | cart_date | ad_id |
---|---|---|---|
1205 | 3123 | 06/08/2023 | 501 |
9032 | 5265 | 06/18/2023 | 502 |
6923 | 836 | 06/10/2023 | 501 |
8352 | 924 | 07/05/2023 | 502 |
4717 | 1081 | 07/12/2023 | 501 |
This query calculates the CTR by joining and tables on and . It then calculates the ratio of how many of ad views resulted in an ‘add to cart’ action. This ratio is multiplied by 100 to get a percentage, giving us the click-through rate for each ad campaign.
To practice a similar problem on DataLemur's free interactive coding environment, attempt this Facebook SQL Interview question:
The clause is used to filter the groups created by the clause. It's similar to the clause, but it is used to specify conditions on the groups created by the clause, rather than on the individual rows of the table.
For example, say you were analyzing salaries for analytics employees at Descartes:
This query retrieves the total salary for each Analytics department at Descartes and groups the rows by the specific department (i.e. "Marketing Analytics", "Business Analytics", "Sales Analytics" teams).
The clause then filters the groups to include only Descartes departments where the total salary is greater than $1 million
Descartes is a technology company that provides logistics and supply chain management solutions. For this interview question, imagine that you are provided with a table that records all shipments made by the company. The table contains a , , , and .
Your task is to write a SQL query that calculates the average delivery time (in days) for each month.
shipment_id | order_id | product_id | ship_date | delivery_date |
---|---|---|---|---|
1001 | 101 | 501 | 2022-06-01 | 2022-06-05 |
1002 | 102 | 502 | 2022-06-03 | 2022-06-07 |
1003 | 103 | 503 | 2022-06-05 | 2022-06-10 |
1004 | 104 | 502 | 2022-07-01 | 2022-07-04 |
1005 | 105 | 501 | 2022-07-03 | 2022-07-08 |
month | average_delivery_time |
---|---|
06 | 4.67 |
07 | 4.50 |
In this query, we first extract the month from the using the function. Then, we calculate the average delivery time by subtracting the from the and using the function to convert the result into days. Finally, we group by the month and order the result by month.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Descartes SQL interview is to solve as many practice SQL interview questions as you can! Besides solving the above Descartes SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like Google, Uber, and Microsoft.
Each DataLemur SQL question has multiple hints, fully explained answers along with a discussion board to see how others solved it and most importantly, there's an online SQL code editor so you can easily right in the browser your query and have it graded.
To prep for the Descartes SQL interview it is also useful to solve SQL questions from other tech companies like:
In case your SQL coding skills are weak, don't worry about jumping right into solving questions – refresh your SQL knowledge with this SQL interview tutorial.
This tutorial covers things like inner vs. outer JOIN and CASE/WHEN/ELSE statements – both of which pop up routinely during SQL job interviews at Descartes.
In addition to SQL query questions, the other types of problems to prepare for the Descartes Data Science Interview are:
The best way to prepare for Descartes Data Science interviews is by reading Ace the Data Science Interview. The book's got: