At Endava, SQL is crucial for extracting and manipulating data from databases across the multitude of digital transformation and software consulting clients Endava has. Unsurprisingly this is why Endava almost always evaluates jobseekers on SQL problems during interviews for Data Analytics, Data Science, and Data Engineering jobs.
So, to help you prepare for the Endava SQL interview, we'll cover 11 Endava SQL interview questions – able to answer them all?
Endava has an IT solutions shop where customers can purchase software licenses and other digital solutions. You are provided with two tables: 'purchase_records' and 'customers'. The 'purchase_records' table contains the record of all the purchases made by customers and the 'customers' table contains details about each customer.
The 'purchase_records' table has the following columns:
The 'customers' table has the following columns:
A 'whale' customer is defined as a customer whose total purchases exceed a certain threshold. For this case, the threshold is fixed at $5000.
Write a SQL query to find the 'whale' customers i.e., customers whose total purchases are more than $5000.
purchase_id | customer_id | purchase_date | product_id | amount_paid |
---|---|---|---|---|
1 | 23 | 04/10/2021 | 799 | 745 |
2 | 42 | 05/12/2021 | 657 | 5065 |
3 | 23 | 06/13/2021 | 219 | 750 |
4 | 67 | 07/20/2021 | 422 | 3256 |
5 | 23 | 07/26/2021 | 905 | 1507 |
customer_id | customer_name | sign_up_date |
---|---|---|
23 | John Doe | 01/25/2021 |
42 | Jane Smith | 11/11/2020 |
67 | James Bond | 05/01/2021 |
This query joins the 'customers' and 'purchase_records' tables on the common 'customer_id' field. It then groups the records by 'customer_name' and calculates the total amount paid by each customer. The HAVING clause then filters out only the customers who have paid more than $5000, thus identifying the 'whale' customers.
To solve a similar VIP customer analysis question on DataLemur's free interactive SQL code editor, try this Microsoft Azure Cloud SQL Interview Question:
Endava maintains a database for product usage by its customers. You are asked to analyze the number of active users for each product on a monthly basis.
You have the below dataset :
event_id | user_id | event_date | product_id |
---|---|---|---|
1 | 110 | 2022-06-01 | 101 |
2 | 120 | 2022-06-01 | 102 |
3 | 110 | 2022-06-01 | 102 |
4 | 130 | 2022-06-02 | 103 |
5 | 110 | 2022-07-01 | 101 |
6 | 120 | 2022-07-02 | 101 |
7 | 130 | 2022-07-01 | 103 |
8 | 140 | 2022-07-03 | 102 |
9 | 120 | 2022-07-02 | 101 |
10 | 120 | 2022-07-03 | 102 |
mth | product | active_users |
---|---|---|
6 | 101 | 1 |
6 | 102 | 2 |
6 | 103 | 1 |
7 | 101 | 2 |
7 | 102 | 2 |
7 | 103 | 1 |
The "active users" means unique users who have interacted with a product within a specific month.
You can solve this problem by using SQL window function to split the dataset by month and product, and then count distinct users:
This query will return the active users in each month for each product. The COUNT(DISTINCT user_id) OVER(PARTITION BY DATE_PART('month', event_date), product_id) will give us the unique count of users who interacted with a product per month.
Pro Tip: Window functions are a frequent SQL interview topic, so practice every window function problem on DataLemur
The clause works similarly to the clause, but it is used to filter the groups of rows created by the clause rather than the rows of the table themselves.
For example, say you were analyzing Endava sales data:
This query retrieves the total sales for all products in each region, and uses the clause to only sales made after January 1, 2023. The rows are then grouped by region and the clause filters the groups to include only those with total sales greater than $500k.
As a global IT consultancy, Endava manages numerous projects at a time. Each employee logs their hours spent per project in a timesheet system. The management wants to analyze this data to understand project resource usage. They are interested in two specific queries:
Design the necessary tables for this scenario (You can assume relevant employee and project details) and write the SQL queries for the above-mentioned requests.
Assume the following tables:
project_id | project_name |
---|---|
1001 | "Mobile App Development" |
1002 | "Web Portal Revamp" |
1003 | "Backend Infrastructure Update" |
employee_id | employee_name |
---|---|
10 | "John Doe" |
20 | "Jane Smith" |
30 | "Larry Davis" |
timesheet_id | employee_id | project_id | hours_logged | log_date |
---|---|---|---|---|
9001 | 10 | 1001 | 6 | "07/01/2022" |
9002 | 10 | 1002 | 3 | "07/02/2022" |
9003 | 20 | 1001 | 5 | "07/01/2022" |
9004 | 10 | 1001 | 7 | "07/02/2022" |
9005 | 30 | 1003 | 4 | "07/03/2022" |
These queries help in analyzing the project resource usage by giving insights into the total hours logged per project and individuals contributing the most hours to a specific project.
The EXCEPT operator is used to return all rows from the first SELECT statement that are not returned by the second SELECT statement. Note that is available in PostgreSQL and SQL Server, and it's equivalent operator is called and is available in MySQL and Oracle.
For a tangible example, suppose you were doing an HR Analytics project for Endava, and had access to Endava's contractors and employees data. Assume that some employees were previously contractors, and vice versa, and thus would show up in both tables. You could use operator to find all employees who never were a contractor using this query:
Endava is an IT services company specializing in application development, independent testing, cloud services, and IT infrastructure services. They want to understand better their service delivery performance in different countries and sectors.
Assume that you have a table named with the following structure:
customer_id | country | sector | service | billing_cycle | payment_due_date | amount_due |
---|---|---|---|---|---|---|
101 | USA | Banking | Application Development | Monthly | 2022-09-30 | 1200.00 |
102 | UK | Telecom | Independent Testing | Quarterly | 2022-08-31 | 3600.00 |
103 | UK | Banking | Cloud Services | Annually | 2022-03-31 | 5000.00 |
104 | USA | Telecom | IT Infrastructure Services | Quarterly | 2022-11-30 | 3000.00 |
105 | Canada | Retail | Application Development | Annually | 2023-02-28 | 10000.00 |
In the table, write an SQL query that filters the data based on the following conditions:
The result should be ordered by amount_due in descending order.
This query uses the WHERE clause combined with IN and AND operations to filter the records based on multiple conditions. The EXTRACT function is used to filter records based on the year of the payment due date. The ORDER BY clause is used to order the result in descending order by amount due.
While both types of indexes improve the performance of SQL queries by providing a faster way to lookup rows of data, a unique index enforces the uniqueness of the indexed columns, meaning that no duplicate values are allowed in the indexed columns. On the other hand, a non-unique index allows duplicate values in the indexed columns.
Here is an example of a unique index on the column of a table of Endava employees:
This index would ensure that no two Endava employees have the same , which could be used as a unique identifier for each employee.
Here is an example of a non-unique index on the column of the same table:
This index would not enforce uniqueness, but it could be used to improve the performance of queries that filter or sort the data based on the column. For example, if you want to retrieve all employees who were Data Analysts, the database can use the index to quickly locate and retrieve the desired records without having to search through all the employees.
As an SQL developer at Endava, often you will be tasked with fetching specific information from the company's database. Assume that we have a table where we store information about all of our customers. Your task is to write a SQL query that fetches all records of customers from London. You should make use of the SQL command.
customer_id | first_name | last_name | location | |
---|---|---|---|---|
201 | James | Smith | james.smith@email.com | London |
202 | Jack | Taylor | jack.taylor@email.com | Manchester |
203 | Emma | Johnson | emma.johnson@email.com | London |
204 | Sophia | Brown | sophia.brown@email.com | Bristol |
205 | Oliver | Wilson | oliver.wilson@email.com | London |
Here is how you would structure your SQL query in PostgreSQL:
This query will return all the columns for the customers located in London. The operator is used in the clause to search for a specified pattern in a column.
customer_id | first_name | last_name | location | |
---|---|---|---|---|
201 | James | Smith | james.smith@email.com | London |
203 | Emma | Johnson | emma.johnson@email.com | London |
205 | Oliver | Wilson | oliver.wilson@email.com | London |
Endava has a system where employees log their tasks on various projects running in the company. As a data analyst in Endava, you are tasked with writing a SQL query that will identify the employees with the most number of tasks in each project. You have access to two tables: table which holds employee details and table which logs the tasks.
id | first_name | last_name | department | |
---|---|---|---|---|
1001 | Jane | Doe | jane.doe@endava.com | Sales |
1002 | John | Smith | john.smith@endava.com | Human Resources |
1003 | Alice | Johnson | alice.johnson@endava.com | Engineering |
1004 | Robert | Brown | robert.brown@endava.com | Finance |
task_id | employee_id | project | task_completion_date |
---|---|---|---|
2001 | 1002 | Alpha | 2022-06-15 |
2002 | 1003 | Beta | 2022-07-02 |
2003 | 1001 | Alpha | 2022-08-01 |
2004 | 1004 | Gamma | 2022-08-10 |
2005 | 1002 | Alpha | 2022-09-20 |
You should return a table with the project and the name of the employee who has the most tasks in that project.
In case of tie, return the employee that appears first alphabetically (by first name).
This query will join the and tables on the and fields respectively. It then groups the results by project and uses the clause to filter out the groups that have the maximum count of tasks (which indicates that these are the employees with the most tasks). The results are then ordered alphabetically by employee name. Note that this query is simplified and assumes every task is unique and assigned to only one employee, for more complex scenarios the query may need adjustments.
Since joins come up frequently during SQL interviews, try this Spotify JOIN SQL question:
The clause in SQL allows you to select records that are unique, eliminating duplicates.
For example, if you had a table of Endava employees, and wanted to see what are all the unique job titles that currently worked at the company, you could write the following query:
Endava is a software development company, and at the end of each project, clients assess the performance of each team member on different aspects, each aspect being scored out of 10. Endava now wants to calculate the average performance score of each employee, their highest and lowest score, and the range of these scores (highest - lowest). The final result should be rounded off to 2 decimal places.
review_id | employee_id | client_id | project_id | aspect_1 | aspect_2 | aspect_3 |
---|---|---|---|---|---|---|
1 | 101 | 201 | 5001 | 7 | 8 | 9 |
2 | 102 | 202 | 5002 | 8 | 6 | 7 |
3 | 101 | 203 | 5001 | 8 | 9 | 7 |
4 | 103 | 204 | 5003 | 6 | 7 | 6 |
5 | 102 | 205 | 5004 | 7 | 7 | 7 |
employee_id | avg_score | highest_score | lowest_score | score_range |
---|---|---|---|---|
101 | 8.00 | 9 | 7 | 2 |
102 | 7.00 | 8 | 6 | 2 |
103 | 6.33 | 7 | 6 | 1 |
This query calculates the average, highest, lowest and range of the employees' scores across all reviews. Each row in the output represents a single employee and their respective performance information. The ROUND function is used to reduce the decimal places in the average to 2. The GREATEST and LEAST functions are used to find the highest and lowest scores respectively, across all aspects. The range is calculated as the difference between the highest and lowest scores.
To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for calculating averages and rounding results, or this JPMorgan Chase Cards Issued Difference Question which is similar for finding highest and lowest values and calculating their difference.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Endava SQL interview is to solve as many practice SQL interview questions as you can! In addition to solving the earlier Endava SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like FAANG and tech startups.
Each DataLemur SQL question has multiple hints, full answers and most importantly, there's an online SQL coding environment so you can right online code up your query and have it executed.
To prep for the Endava SQL interview it is also a great idea to solve interview questions from other tech companies like:
However, if your SQL query skills are weak, forget about going right into solving questions – improve your SQL foundations with this free SQL tutorial.
This tutorial covers topics including window functions and using LIKE – both of these pop up often in Endava SQL assessments.
Besides SQL interview questions, the other topics covered in the Endava Data Science Interview are:
To prepare for Endava Data Science interviews read the book Ace the Data Science Interview because it's got: