At Kinaxis, SQL is used constantly for managing supply chain databases for real-time analysis and for building data-driven predictive models for logistics optimization. Because of this, Kinaxis almost always evaluates jobseekers on SQL query questions in interviews for Data Science, Data Engineering, and Data Analytics jobs.
So, to help you practice for the Kinaxis SQL interview, this blog covers 10 Kinaxis SQL interview questions – can you answer each one?
Assume Kinaxis is a software company that manages various products and gathers user reviews monthly. The data science team would like to analyze the review data to find the monthly average rating for each product.
Write a SQL query to retrieve the monthly average rating for each product. Assume that the reviews are stored in a table with the following schema:
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2020-11-01 | 50001 | 4 |
7802 | 265 | 2020-11-05 | 69852 | 5 |
5293 | 362 | 2020-12-12 | 50001 | 3 |
6352 | 192 | 2021-01-26 | 69852 | 1 |
4517 | 981 | 2021-02-05 | 69852 | 4 |
Output the average rating, rounded to two decimal places, for each product, arranged by month and year. The output should have three columns: month_year, product_id, and average_rating.
month_year | product_id | average_rating |
---|---|---|
2020-11 | 50001 | 4.00 |
2020-11 | 69852 | 5.00 |
2020-12 | 50001 | 3.00 |
2021-01 | 69852 | 1.00 |
2021-02 | 69852 | 4.00 |
In this query, we first reduce the submit_date to month-level precision, for example with the DATE_TRUNC function, or with TO_CHAR(submit_date, 'YYYY-MM') to produce the month_year string directly. We then group by this monthly value and the product_id.
The monthly average rating per product is computed by the AVG aggregate function. We use ROUND to round the average rating to 2 decimal places.
Finally, we use ORDER BY to sort the result set first by month and then by product_id.
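A sketch of the query described above, assuming the reviews are stored in a table named reviews with the columns shown in the sample data:

```sql
SELECT
  TO_CHAR(submit_date, 'YYYY-MM') AS month_year,
  product_id,
  ROUND(AVG(stars), 2) AS average_rating
FROM reviews
GROUP BY month_year, product_id
ORDER BY month_year, product_id;
```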
To solve a related window function SQL problem on DataLemur's free online SQL code editor, try this Amazon BI Engineer interview question:
Kinaxis is a company that provides supply chain management and sales and operations planning software. Let's assume you are tasked with designing a database for monitoring the performance of its supply chain networks. This will require tracking millions of products throughout various stages of the supply chain journey from numerous suppliers to numerous customers. One key success factor for Kinaxis is ensuring a high OTIF (On Time in Full) delivery rate, which is a measure of efficiency in logistics and supply chain operations.
Given this, your task is to identify the suppliers that have been consistently providing late deliveries. You have 2 tables, shown below as orders and suppliers.
order_id | supplier_id | product_id | expected_delivery_date | actual_delivery_date |
---|---|---|---|---|
1 | 200 | 800 | 03/01/2022 | 03/05/2022 |
2 | 200 | 650 | 03/02/2022 | 03/02/2022 |
3 | 325 | 975 | 03/05/2022 | 03/07/2022 |
4 | 200 | 800 | 03/06/2022 | 03/10/2022 |
5 | 325 | 975 | 03/10/2022 | 03/12/2022 |
supplier_id | name |
---|---|
200 | Laggy Logistics |
325 | Family Freight |
Your task is to write an SQL query to identify the suppliers with an OTIF rate less than 90%.
This SQL statement begins by creating a Common Table Expression (CTE) that groups orders by supplier_id and calculates the total number of orders per supplier and the number of on-time orders. The on-time count uses a CASE expression that checks whether the actual delivery date is on or before the expected delivery date. The OTIF rate is then calculated as the ratio of on-time orders to total orders, multiplied by 100 to get a percentage, and the main query filters for suppliers with an OTIF rate below 90%.
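A sketch of the query the explanation describes, assuming the tables are named orders and suppliers as in the sample data:

```sql
WITH supplier_stats AS (
    SELECT
        supplier_id,
        COUNT(*) AS total_orders,
        -- an order counts as on time if it arrived on or before the expected date
        SUM(CASE WHEN actual_delivery_date <= expected_delivery_date
                 THEN 1 ELSE 0 END) AS on_time_orders
    FROM orders
    GROUP BY supplier_id
)
SELECT
    s.supplier_id,
    s.name,
    ROUND(100.0 * st.on_time_orders / st.total_orders, 2) AS otif_rate
FROM supplier_stats st
JOIN suppliers s ON s.supplier_id = st.supplier_id
WHERE 100.0 * st.on_time_orders / st.total_orders < 90;
```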
When designing a database schema, a one-to-one relationship between two entities is characterized by each entity being related to a single instance of the other. An example of this is the relationship between a car and a license plate - each car has one license plate, and each license plate belongs to one car.
On the other hand, a one-to-many relationship is when one entity can be associated with multiple instances of the other entity. For example, a person can have multiple email addresses, but each email address only relates back to one person.
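As a sketch (with hypothetical table and column names), the one-to-many person/email example could be modeled like this in PostgreSQL, and a UNIQUE constraint on the foreign key would turn it into a one-to-one relationship:

```sql
-- One-to-many: one person, many email addresses
CREATE TABLE person (
    person_id SERIAL PRIMARY KEY,
    name      TEXT NOT NULL
);

CREATE TABLE email_address (
    email_id  SERIAL PRIMARY KEY,
    -- many email_address rows may reference the same person
    person_id INT NOT NULL REFERENCES person (person_id),
    email     TEXT NOT NULL
);

-- One-to-one variant (like a car and its license plate): a UNIQUE
-- foreign key means each person can appear at most once.
-- ALTER TABLE email_address ADD CONSTRAINT one_per_person UNIQUE (person_id);
```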
You are provided with a database containing information about Kinaxis's customers and the subscription services they have subscribed to. Your task is to write a SQL query that would filter customers who have subscribed to both 'Service A' and 'Service B' and whose subscription is still active. Assume the 'subscription_status' column uses "active" to denote active subscriptions and "inactive" for inactive subscriptions.
customer_id | customer_name | subscription_service | subscription_status |
---|---|---|---|
101 | John Doe | Service A | active |
102 | Jane Smith | Service B | active |
101 | John Doe | Service B | active |
103 | Eric Johnson | Service A | active |
104 | Samantha Taylor | Service B | inactive |
customer_id | customer_name |
---|---|
101 | John Doe |
This SQL query filters the customers table using WHERE to target rows for either 'Service A' or 'Service B' whose subscriptions are currently 'active'. The GROUP BY clause collapses those rows per customer, and the HAVING clause keeps only customers with 2 distinct services, in this case both 'Service A' and 'Service B'. Please remember to replace 'Service A' and 'Service B' with the exact service names from your database.
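A sketch of the query the explanation describes, assuming the table is named customers:

```sql
SELECT customer_id, customer_name
FROM customers
WHERE subscription_service IN ('Service A', 'Service B')
  AND subscription_status = 'active'
GROUP BY customer_id, customer_name
HAVING COUNT(DISTINCT subscription_service) = 2;
```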
The DISTINCT clause in SQL allows you to select records that are unique, eliminating duplicates.
For a tangible example, say you had a table of Kinaxis employees:
first_name | job_title |
---|---|
Akash | Data Analyst |
Brittany | Data Scientist |
Carlos | Data Engineer |
Diego | Data Engineer |
Eva | Data Analyst |
If you were doing an HR Analytics project and you wanted to get all the unique job titles currently held at the company, you would write the following SQL query:
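Assuming the employee table is named employees, the query is:

```sql
SELECT DISTINCT job_title
FROM employees;
```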
The output would give you 3 distinct job titles at Kinaxis:
job_title |
---|
Data Analyst |
Data Scientist |
Data Engineer |
Kinaxis is interested in understanding how well their digital ads convert into add-to-cart actions. Given the two tables below (shown as ad_clicks and cart_adds), your task is to write a query to calculate the conversion rate from ad click to add-to-cart for each product.
ad_click_id | user_id | click_date | product_id |
---|---|---|---|
1001 | 123 | 06/08/2021 | 50001 |
1202 | 265 | 06/10/2021 | 69852 |
1301 | 362 | 06/18/2021 | 50001 |
1004 | 192 | 07/26/2021 | 69852 |
1212 | 981 | 07/05/2021 | 69852 |
cart_add_id | user_id | add_date | product_id |
---|---|---|---|
2001 | 123 | 06/08/2021 | 50001 |
2202 | 362 | 06/18/2021 | 50001 |
2303 | 192 | 07/26/2021 | 69852 |
2020 | 981 | 07/05/2021 | 69852 |
2222 | 362 | 07/10/2021 | 50001 |
Here is a possible PostgreSQL query to solve this:
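One possible query, assuming the tables are named ad_clicks and cart_adds as in the sample data:

```sql
WITH clicks AS (
    SELECT product_id, COUNT(*) AS num_clicks
    FROM ad_clicks
    GROUP BY product_id
),
adds AS (
    SELECT product_id, COUNT(*) AS num_cart_adds
    FROM cart_adds
    GROUP BY product_id
)
SELECT
    c.product_id,
    -- cast to numeric so the division isn't integer division
    ROUND(100.0 * a.num_cart_adds::numeric / c.num_clicks, 2) AS click_to_cart_rate
FROM clicks c
JOIN adds a ON a.product_id = c.product_id;
```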
This query first computes the number of ad clicks and the number of add-to-cart actions per product. It then joins these two derived tables on product_id to calculate the conversion rate, dividing the number of cart adds by the number of ad clicks and multiplying by 100. The counts are cast to a non-integer type before division to avoid integer truncation. The result shows how effective each product's ads are at converting clicks into add-to-cart actions.
To solve another question about calculating rates, try this TikTok SQL question on DataLemur's interactive SQL code editor:
Joins in SQL allow you to combine data from different tables based on a shared key or set of keys.
Four JOIN types are available in SQL: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. For an example of each one, say you had sales data exported from Kinaxis's Salesforce CRM stored in a PostgreSQL database, split across two related tables.
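As a sketch, with hypothetical tables accounts(account_id, name) and sales(sale_id, account_id, amount):

```sql
-- INNER JOIN: only rows with a match in both tables
SELECT a.name, s.amount
FROM accounts a
INNER JOIN sales s ON s.account_id = a.account_id;

-- LEFT JOIN: every account, plus matching sales (NULLs where none exist)
SELECT a.name, s.amount
FROM accounts a
LEFT JOIN sales s ON s.account_id = a.account_id;

-- RIGHT JOIN: every sale, plus matching account details
SELECT a.name, s.amount
FROM accounts a
RIGHT JOIN sales s ON s.account_id = a.account_id;

-- FULL OUTER JOIN: all rows from both sides, matched where possible
SELECT a.name, s.amount
FROM accounts a
FULL OUTER JOIN sales s ON s.account_id = a.account_id;
```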
As part of Kinaxis' expansion project, management wants to identify potential customers from specific regions to tailor marketing efforts effectively. They want you to provide the customer details for those with email IDs ending in '.uk' and '.ca', signifying likely residence in the United Kingdom and Canada, respectively.
customer_id | name | email |
---|---|---|
5432 | John Doe | jdoe@example.uk |
3452 | Jane Doe | jane@example.com |
8723 | Elon Tusk | etusk@example.ca |
6542 | Bruce Wayne | bwayne@example.com |
2314 | Scott Harris | sharris@example.ca |
customer_id | name | email |
---|---|---|
5432 | John Doe | jdoe@example.uk |
8723 | Elon Tusk | etusk@example.ca |
2314 | Scott Harris | sharris@example.ca |
The following SQL query should return the required data.
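Assuming the table is named customers:

```sql
SELECT customer_id, name, email
FROM customers
WHERE email LIKE '%.uk'
   OR email LIKE '%.ca';
```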
This query searches the customers table and retrieves the customer_id, name, and email where the email field matches the patterns '.uk' or '.ca'. In the LIKE syntax, the '%' symbol is a wildcard matching any sequence of characters before the '.uk' or '.ca' suffix, so only customers whose emails end with these extensions are selected.
At Kinaxis, order data contains unit weight and quantity of each product ordered. We’d like to understand the variability of total weight per order. Please calculate the standard deviation of the total weight of all orders.
Please use the "orders" table as sample data:
order_id | product_id | unit_weight_kg | quantity |
---|---|---|---|
1101 | 201 | 2.3 | 5 |
1102 | 202 | 3.5 | 10 |
1103 | 203 | 2.1 | 7 |
1104 | 204 | 4.7 | 6 |
1105 | 201 | 2.3 | 8 |
In PostgreSQL, we can use the STDDEV function to calculate the standard deviation.
In this answer, each order's total weight (unit_weight_kg * quantity) is calculated in a subquery. The main query then applies STDDEV to those per-order totals. This gives us the variability of total weights across all orders.
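A sketch of that approach against the orders table (SUM per order_id also handles orders containing multiple products):

```sql
SELECT ROUND(STDDEV(total_weight)::numeric, 2) AS total_weight_stddev
FROM (
    -- total weight of each order
    SELECT order_id, SUM(unit_weight_kg * quantity) AS total_weight
    FROM orders
    GROUP BY order_id
) AS order_weights;
```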
To practice a very similar question try this interactive Alibaba Compressed Mean Question which is similar for calculating aggregate measures or this Amazon Highest-Grossing Items Question which is similar for handling items and quantities.
In SQL, the UNION operator combines the results of two or more SELECT statements into a single result set. Note that each statement within the UNION must have the same number of columns, and the columns must have compatible data types. UNION removes duplicates from the final result set; if you want to include duplicates, you can use the UNION ALL operator instead.
Here's an example of using the UNION operator to combine the results of two SELECT statements that retrieve data from tables of Kinaxis's Facebook ads and their Google ads:
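A sketch, assuming hypothetical tables facebook_ads and google_ads that both have ad_id, campaign_name, and spend columns:

```sql
SELECT ad_id, campaign_name, spend
FROM facebook_ads
WHERE spend > 500
UNION
SELECT ad_id, campaign_name, spend
FROM google_ads
WHERE spend > 500;
```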
This statement would return a combined result set of both Facebook and Google ads that have a spend greater than $500.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Kinaxis SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the earlier Kinaxis SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Google, Facebook, Microsoft and Amazon.
Each exercise has multiple hints, detailed solutions, and best of all, an online SQL code editor, so you can run your query right in the browser and have it graded.
To prep for the Kinaxis SQL interview it is also useful to practice interview questions from other tech companies like:
However, if your SQL foundations are weak, don't dive straight into solving questions; first refresh your SQL knowledge with this SQL interview tutorial.
This tutorial covers things like filtering data with WHERE and CTE vs. Subquery – both of these come up frequently in Kinaxis SQL interviews.
In addition to SQL interview questions, the other types of questions to prepare for the Kinaxis Data Science Interview are:
The best way to prepare for Kinaxis Data Science interviews is by reading Ace the Data Science Interview. The book's got: