At KPIT Technologies, SQL is used all the damn time for analyzing automotive data patterns and for real-time vehicle system diagnostic analysis. Unsurprisingly this is why KPIT Technologies typically asks SQL problems in interviews for Data Science, Data Engineering and Data Analytics jobs.
Thus, to help you study for the KPIT Technologies SQL interview, we'll cover 11 KPIT Technologies SQL interview questions – able to answer them all?
KPIT Technologies is a global technology company providing IT Consulting and Product Engineering solutions and services. Assume, KPIT wants to track its top-paying customers who have made the highest number of purchases in the last year. The company uses the following two databases:
customer_id | first_name | last_name |
---|---|---|
1 | John | Doe |
2 | Jane | Smith |
3 | Mark | Zuckerberg |
4 | Elon | Musk |
5 | Bill | Gates |
order_id | customer_id | order_date | product_id | order_value |
---|---|---|---|---|
101 | 1 | 06/08/2021 00:00:00 | 5001 | 1000 |
202 | 2 | 06/10/2021 00:00:00 | 6985 | 200 |
303 | 1 | 06/18/2021 00:00:00 | 5001 | 3000 |
404 | 2 | 07/26/2021 00:00:00 | 6985 | 4500 |
505 | 1 | 07/05/2021 00:00:00 | 6985 | 1200 |
In this scenario, the question might be:
Write a SQL query to find the top 3 customers who have made the most number of purchases in the last year. Your query should return the customer's complete name and total order value.
Here is how you might solve this:
This query joins the 'orders' and 'customers' tables on the 'customer_id', filters for orders in the last year, groups by 'customer_id', and then orders by the total order value in descending order. The 'LIMIT 3' at the end ensures that only the top 3 customers are returned.
To solve a related customer analytics question on DataLemur's free interactive SQL code editor, try this recently asked Microsoft SQL interview question:
KPIT Technologies is a global technology company specializing in providing IT Consulting and Product Engineering solutions and services to key focus industries - Automotive & Transportation, Consumer & Industrial Goods, Energy & Resources, High Tech, Life Sciences, and Utilities.
Let's consider a scenario where KPIT wants to analyze their monthly revenue per product over the past year. As a Data Analyst, you are tasked to write a SQL query to calculate the average monthly revenue per product using the table. The table has the following data,
sale_id | product_id | sale_date | unit_price | quantity |
---|---|---|---|---|
1 | 1001 | 2021-12-21 | 1000.0 | 2 |
2 | 1002 | 2021-11-13 | 2000.0 | 5 |
3 | 1003 | 2021-11-29 | 3000.0 | 3 |
4 | 1001 | 2022-01-15 | 1000.0 | 4 |
5 | 1002 | 2022-02-12 | 2000.0 | 6 |
Using PostgreSQL, the SQL query will be:
This SQL query first extracts the year and month from the in the table. It then calculates the total revenue generated from each sale by multiplying with . The function is used along with to partition the data by product, year and month and then the average revenue is calculated. The clause is then used to sort the result set by Year, Month and product_id.
To solve a related window function SQL problem on DataLemur's free interactive coding environment, solve this Amazon BI Engineer interview question:
To discover records in one table that are not present in another, you can utilize a and filter out any values in the right-side table.
For example, say you had a table of KPIT Technologies customers and a 2nd table of all purchases made with KPIT Technologies. To find all customers who did not make a purchase, you'd use the following
This query fetches all rows from the customers table, along with any rows that match in the purchases table. If there is no matching row in the purchases table, values will be returned for all of the right table's columns. The clause then filters out any rows where the purchases.id column is NULL, leaving only customers who have not made a purchase.
KPIT Technologies is handling multiple software projects. Each software project is managed by a project manager and has multiple software developers working on it. Each software developer can only work on one project. Each software project is built up of many tasks, which are assigned to developers by the project manager. The task is defined by its description, estimated finish time (in hours) and its actual finish time.
Design a PostgreSQL database to represent this scenario and answer the following question using the database:
Calculate the total estimated and actual time (in hours) taken for each project.
project_id | project_name | project_manager |
---|---|---|
1001 | Project A | Manager 1 |
1002 | Project B | Manager 2 |
1003 | Project C | Manager 3 |
developer_id | project_id | developer_name |
---|---|---|
2001 | 1001 | Dev 1 |
2002 | 1002 | Dev 2 |
2003 | 1003 | Dev 3 |
task_id | developer_id | description | estimated_time | actual_time |
---|---|---|---|---|
3001 | 2001 | Task 1 | 20 | 30 |
3002 | 2001 | Task 2 | 40 | 45 |
3003 | 2002 | Task 3 | 50 | 60 |
3004 | 2003 | Task 4 | 60 | 55 |
The query first joins the , and tables by their respective ids. It then calculates the total estimated and actual time for each project using the function, grouped by project id. This will result in a table listing each project along with its total estimated and actual time.
A primary key is a column (or set of columns) in a table that uniquely identifies each row in the table. It cannot contain null values and must be unique across all rows in the table.
A foreign key is a column (or set of columns) in a table that references the primary key of another table. It is used to establish a relationship between the two tables. A foreign key can contain null values, and multiple rows in the referencing table can reference the same row in the referenced table.
For example, consider a database with two tables: and . The KPIT Technologies customers table might have a primary key column called , while the KPIT Technologies orders table might have a foreign key column called that references the column in the table. This establishes a relationship between the two tables, such that each row in the orders table corresponds to a specific KPIT Technologies customer.
Given the customer data in a relational database at KPIT Technologies, write a SQL query to filter out the records of customers that meet the following conditions:
Sample customer data:
customer_id | name | city | preferred_contact | last_transaction_date |
---|---|---|---|---|
1452 | John Doe | Pune | 09/12/2022 | |
6543 | Jane Smith | Mumbai | Phone | 08/25/2022 |
2563 | Rahul Sharma | Pune | 09/14/2022 | |
8124 | Sneha Patel | Delhi | 09/01/2022 | |
4987 | Aman Verma | Pune | Phone | 09/10/2022 |
customer_id | name |
---|---|
1452 | John Doe |
2563 | Rahul Sharma |
Here's what the above SQL query does:
First, it selects the and columns from the table.
Then, it filters the results to include only rows where the is Pune, the method is Email, and the is within the last 7 days.
This results in a list of customers who live in Pune, prefer to be contacted by email, and have engaged in a transaction within the last week. This filtered list can now be used for various targeted business strategies such as email campaigns or special offers.
A primary key is a column or group of columns that uniquely identifies a row in a table. For example, say you had a database of KPIT Technologies marketing campaigns data:
In this KPIT Technologies example, the CampaignID column is the primary key of the MarketingCampaigns table. The constraint ensures that no two rows have the same CampaignID. This helps to maintain the integrity of the data in the table by preventing duplicate rows.
KPIT Technologies is advertising its new training programs on various platforms. They are tracking the number of views each ad receives, the number of clicks each ad receives, and the number of enrollments coming from each ad. Your task is to write a SQL query that calculates the click-through rate (CTR) and the conversion rate for each ad. The CTR is calculated as the total number of clicks divided by the total number of views, while the conversion rate is calculated as the total number of enrollments divided by the total number of clicks.
The given datasets are: :
ad_id | views | clicks |
---|---|---|
1 | 2500 | 200 |
2 | 1800 | 150 |
3 | 2200 | 180 |
4 | 3000 | 250 |
:
ad_id | enrollments |
---|---|
1 | 50 |
2 | 45 |
3 | 55 |
4 | 60 |
Please note that all enrollments are coming from clicks, i.e., people who have viewed the ad, clicked it, and then enrolled in the training program.
This query joins the and tables using the column, and then calculates the click-through and conversion rates. The cast to FLOAT is used to perform floating point division (to avoid integer division) and the multiplication by 100 is used to represent the rates as percentages.
To solve a related SQL problem on DataLemur's free interactive SQL code editor, attempt this SQL interview question asked by Facebook:
KPIT Technologies is a global technology company specializing in IT consulting and product engineering solutions. Suppose we have a service offering where KPIT resources (for example, software, servers, or hardware) are rented out to their clients per hour. We want to know the average hours utilized per day for each resource for the past month.
Assume there is a table 'resource_usage' that keeps track of the resource usage. Each record in the table representing the usage of a specific resource by a specific client at a specific hour.
The 'resource_usage' table has the following structure:
usage_id | client_id | usage_date | resource_id | usage_hours |
---|---|---|---|---|
101 | A23 | 07/10/2022 00:00:00 | R102 | 3 |
102 | B39 | 07/10/2022 00:00:00 | R105 | 4 |
103 | C84 | 07/11/2022 00:00:00 | R102 | 2 |
104 | D29 | 07/11/2022 00:00:00 | R105 | 3 |
105 | E19 | 07/12/2022 00:00:00 | R102 | 1 |
The question is to find the average daily usage for each resource for the past month.
From the given data, the expected output table would look like this:
day | resource | avg_usage_hours |
---|---|---|
10 | R102 | 3.00 |
10 | R105 | 4.00 |
11 | R102 | 2.00 |
11 | R105 | 3.00 |
12 | R102 | 1.00 |
In this query, the function is used to get the day part from the usage_date, is used to get the average usage_hours, and is used to group the results by day and resource. All this is filtered only for the data of the past month.
In a database, an index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and the use of more storage space to maintain the index data structure.
There are several types of indexes that can be used in a database:
As a data analyst at KPIT Technologies, your task is to filter the company's customer database to find all clients whose names begin with the letter 'A' and are located in 'California'. Use the SQL keyword LIKE for this task.
Considering KPIT's customers database:
customer_id | name | location | phone | |
---|---|---|---|---|
1001 | Alex Peterson | California | alex.p@example.com | (123) 456-7890 |
1002 | Bruce Watson | New York | bruce.w@example.com | (234) 567-8901 |
1003 | Charlie Johnson | Florida | charlie.j@example.com | (345) 678-9012 |
1004 | Anna Snow | California | anna.s@example.com | (456) 789-0123 |
1005 | Samuel Adams | Texas | samuel.a@example.com | (567) 890-1234 |
Now, write a PostgreSQL query to filter this data.
This query makes use of the LIKE operator in SQL to filter for clients whose names start with the letter 'A'. The '%' in the LIKE clause is a wildcard character that matches any sequence of characters. The AND operator is used to satisfy both conditions: the name starting with 'A' and the location being 'California'.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the KPIT Technologies SQL interview is to solve as many practice SQL interview questions as you can! In addition to solving the above KPIT Technologies SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like FAANG (FB, Apple, Amazon, Netflix, Google).
Each exercise has hints to guide you, full answers and most importantly, there is an online SQL coding environment so you can easily right in the browser your SQL query answer and have it graded.
To prep for the KPIT Technologies SQL interview it is also useful to solve interview questions from other tech companies like:
However, if your SQL skills are weak, don't worry about jumping right into solving questions – refresh your SQL knowledge with this SQL tutorial for Data Scientists & Analysts.
This tutorial covers things like INTERCEPT/EXCEPT and working with string/text data – both of which come up frequently in KPIT Technologies SQL assessments.
In addition to SQL query questions, the other types of problems to practice for the KPIT Technologies Data Science Interview are:
The best way to prepare for KPIT Technologies Data Science interviews is by reading Ace the Data Science Interview. The book's got: