At Nagarro, SQL is used to analyze large datasets for actionable insights and to manage data schemas in its tech consulting work. That's why Nagarro almost always evaluates jobseekers on SQL query questions in interviews for Data Science, Data Engineering and Data Analytics jobs.
So, to help you study, we've curated 11 Nagarro SQL interview questions – can you solve them?
Assume that Nagarro is an e-commerce company. Their product team wants to analyze user engagement on a rolling 3-month basis. Specifically, they want to know how many unique active users each product has in every rolling 3-month window, sorted by date and product.
Use the schema and sample data from the reviews table provided below for this question.
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
6171 | 123 | 2022-01-08 | 50001 | 4 |
7802 | 265 | 2022-02-10 | 69852 | 4 |
5293 | 362 | 2022-03-18 | 50001 | 3 |
6352 | 192 | 2022-04-26 | 69852 | 3 |
4517 | 981 | 2022-05-05 | 69852 | 2 |
We define an active user as a user that has submitted at least one review in the timeframe. Note: use the to calculate the rolling 3 month window.
end_of_period | product_id | active_users |
---|---|---|
2022-03-31 | 50001 | 2 |
2022-03-31 | 69852 | 1 |
2022-04-30 | 50001 | 1 |
2022-04-30 | 69852 | 2 |
2022-05-31 | 50001 | 1 |
2022-05-31 | 69852 | 2 |
This query calculates the number of unique active users per product on a rolling 3-month basis. It uses a subquery to select the relevant reviews for each rolling 3-month window and then counts the distinct users. The results are ordered by the end of the rolling 3-month period and then by product ID, both ascending. Keep in mind that if no user has submitted a review for a product in a given 3-month period, that product will not appear in the results for that period.
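The rolling-window logic can be sketched as follows. This is a minimal, runnable illustration rather than the original solution: it uses Python's built-in sqlite3 module with SQLite date functions (an assumption made so the SQL can be executed end to end), and the three period end dates are listed by hand. The window is assumed to cover the calendar month of each end date plus the two months before it. Under that assumption, user 362's review on 2022-03-18 falls inside the March to May window, so the sketch counts one active user for product 50001 at 2022-05-31.

```python
import sqlite3

# In-memory SQLite database loaded with the sample reviews rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (
    review_id   INTEGER,
    user_id     INTEGER,
    submit_date TEXT,     -- ISO dates, so string comparison follows date order
    product_id  INTEGER,
    stars       INTEGER
);
INSERT INTO reviews VALUES
    (6171, 123, '2022-01-08', 50001, 4),
    (7802, 265, '2022-02-10', 69852, 4),
    (5293, 362, '2022-03-18', 50001, 3),
    (6352, 192, '2022-04-26', 69852, 3),
    (4517, 981, '2022-05-05', 69852, 2);
""")

ROLLING_ACTIVE_USERS = """
WITH periods(end_of_period) AS (
    -- Assumption: the month ends to report on are listed by hand.
    VALUES ('2022-03-31'), ('2022-04-30'), ('2022-05-31')
)
SELECT p.end_of_period,
       r.product_id,
       COUNT(DISTINCT r.user_id) AS active_users
FROM periods AS p
JOIN reviews AS r
  -- Window = first day of the month two months earlier .. end_of_period.
  ON r.submit_date >= date(p.end_of_period, 'start of month', '-2 months')
 AND r.submit_date <= p.end_of_period
GROUP BY p.end_of_period, r.product_id
ORDER BY p.end_of_period, r.product_id;
"""

rolling_rows = conn.execute(ROLLING_ACTIVE_USERS).fetchall()
for row in rolling_rows:
    print(row)
```

Because the join is an inner join, a product with no reviews in a window produces no row for that window. In PostgreSQL the same window bounds would typically be written with DATE_TRUNC and INTERVAL arithmetic instead of SQLite's date modifiers.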
Pro Tip: Window functions are a frequent SQL interview topic, so practice all the window function problems on DataLemur.
Assume you had a table of Nagarro employee salary data. Write a SQL query to find the 2nd highest salary at the company.
employee_id | salary |
---|---|
1 | 2500 |
2 | 800 |
3 | 1000 |
4 | 1200 |
second_highest_salary |
---|
1200 |
You can find a detailed solution here: 2nd Highest Salary.
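One common approach is to take the maximum salary among the rows that are strictly below the overall maximum. The sketch below assumes a table named employee (the name is hypothetical, as the source does not give one) and runs the query through Python's sqlite3 module so it can be checked against the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table name; columns follow the sample data.
CREATE TABLE employee (employee_id INTEGER, salary INTEGER);
INSERT INTO employee VALUES (1, 2500), (2, 800), (3, 1000), (4, 1200);
""")

SECOND_HIGHEST = """
SELECT MAX(salary) AS second_highest_salary
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);
"""

second_highest_salary = conn.execute(SECOND_HIGHEST).fetchone()[0]
print(second_highest_salary)  # 1200 for the sample rows
```

If every employee earns the same salary, the inner filter leaves no rows and MAX returns NULL, which is usually the desired answer when no second-highest value exists.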
A NULL value represents a missing or unknown value. It is different from a zero or a blank space, which are actual values.
It's important to handle NULL values properly in SQL because they can cause unexpected results if not treated correctly. For example, comparisons involving NULL values always result in NULL, so a filter such as column = NULL never matches and IS NULL must be used instead. NULL values are also left out of calculations: an aggregate such as AVG ignores the NULL values in a column.
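A small sketch can make these rules concrete. The readings table and its value column are hypothetical, and SQLite (through Python's sqlite3 module) is used only so the statements can be run; the NULL behavior shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table with one missing value.
CREATE TABLE readings (value INTEGER);
INSERT INTO readings VALUES (10), (NULL), (20);
""")

# A comparison with NULL yields NULL (None in Python), not true or false.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# "= NULL" never matches a row; "IS NULL" is the correct test.
matched_with_equals = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value = NULL").fetchone()[0]
matched_with_is_null = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value IS NULL").fetchone()[0]

# AVG and COUNT(column) skip NULLs, while COUNT(*) counts every row.
avg_value, counted_values, counted_rows = conn.execute(
    "SELECT AVG(value), COUNT(value), COUNT(*) FROM readings").fetchone()

print(null_equals_null, matched_with_equals, matched_with_is_null)
print(avg_value, counted_values, counted_rows)
```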
Nagarro is a global software development and technology consultancy company. To manage its vast pool of employees and their projects efficiently, Nagarro is thinking of developing an Employee Management System. Design the database for this system. The database should be capable of storing details about employees, their departments, projects they are assigned to, and their roles in those projects. Also, write an SQL query to retrieve a list of all the employees who are working on more than one project.
Consider the following tables:
employee_id | first_name | last_name | department_id |
---|---|---|---|
1 | John | Doe | 100 |
2 | Jane | Smith | 200 |
3 | Jim | Brown | 300 |
department_id | department_name |
---|---|
100 | Engineering |
200 | Marketing |
300 | Human Resources |
project_id | project_name | department_id |
---|---|---|
50 | System Design | 100 |
60 | Brand Campaign | 200 |
employee_id | project_id |
---|---|
1 | 50 |
2 | 60 |
1 | 60 |
3 | 50 |
This query joins the employee table and the employee-to-project assignment table on the employee_id field. It groups the result by each employee's first and last name, and then keeps only the groups in which an employee is working on more than one project. This is achieved by using the HAVING clause to filter the result of the GROUP BY clause. The result is a list of employees who are working on multiple projects.
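The join, grouping and HAVING filter described above might look like the sketch below. The table names employees and employee_projects are assumptions, since the source does not name the tables, and only the two tables needed for the query are created. SQLite via Python's sqlite3 module is used so the query can be run against the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table names; columns follow the sample data.
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    TEXT,
    last_name     TEXT,
    department_id INTEGER
);
CREATE TABLE employee_projects (
    employee_id INTEGER REFERENCES employees (employee_id),
    project_id  INTEGER
);
INSERT INTO employees VALUES
    (1, 'John', 'Doe', 100),
    (2, 'Jane', 'Smith', 200),
    (3, 'Jim', 'Brown', 300);
INSERT INTO employee_projects VALUES (1, 50), (2, 60), (1, 60), (3, 50);
""")

MULTI_PROJECT = """
SELECT e.first_name,
       e.last_name,
       COUNT(DISTINCT ep.project_id) AS project_count
FROM employees AS e
JOIN employee_projects AS ep
  ON ep.employee_id = e.employee_id
-- Grouping by employee_id as well keeps two people with the same name apart.
GROUP BY e.employee_id, e.first_name, e.last_name
HAVING COUNT(DISTINCT ep.project_id) > 1;
"""

multi_project_rows = conn.execute(MULTI_PROJECT).fetchall()
print(multi_project_rows)
```

With the sample data only John Doe is assigned to two projects (50 and 60), so he is the only employee returned.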
While both the BETWEEN and IN operators are used to filter data based on some criteria, BETWEEN selects values within a given range, whereas IN checks whether the value appears in a given list of values.
For example, say you had a table of employees that stored each employee's salary along with the country in which they reside.
To find all employees whose salary falls within a range (for instance, one topping out at 120k), you could use the BETWEEN operator.
To find all employees who reside in the US or Canada, you could use the IN operator.
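The two filters can be compared side by side in a short sketch. Everything concrete here is hypothetical: the employee_salaries table, its rows, and the 80k lower bound of the range (the source only mentions 120k). The queries run on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table and rows, made up for illustration only.
CREATE TABLE employee_salaries (name TEXT, salary INTEGER, country TEXT);
INSERT INTO employee_salaries VALUES
    ('Asha', 95000,  'US'),
    ('Ben',  130000, 'Canada'),
    ('Chen', 120000, 'Germany'),
    ('Dana', 70000,  'US');
""")

# BETWEEN is inclusive on both ends, so a salary of exactly 120000 matches.
# The 80000 lower bound is an assumed value.
in_salary_range = conn.execute("""
    SELECT name FROM employee_salaries
    WHERE salary BETWEEN 80000 AND 120000
    ORDER BY name
""").fetchall()

# IN matches any value from an explicit list.
in_us_or_canada = conn.execute("""
    SELECT name FROM employee_salaries
    WHERE country IN ('US', 'Canada')
    ORDER BY name
""").fetchall()

print(in_salary_range)
print(in_us_or_canada)
```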
You are a data analyst at Nagarro. The data management team has a table consisting of customer records that includes their contact information and the date they joined the company. Your goal is to design a SQL query that filters this table to return only the customers who joined the company after 1st January 2020 and are either from the US or have an email with domain '@nagarro.com'.
The following table presents an example of the customer table:
Example Input:
customer_id | first_name | last_name | email | country | join_date |
---|---|---|---|---|---|
001 | John | Doe | john.doe@nagarro.com | Germany | 2020-03-01 |
002 | Jane | Smith | jane.smith@yahoo.com | US | 2019-11-11 |
003 | James | Johnson | j.johnson@hotmail.com | US | 2021-01-01 |
004 | Patricia | Williams | p.williams@gmail.com | Canada | 2020-06-15 |
005 | Robert | Brown | robert.brown@nagarro.com | England | 2021-07-01 |
The query should return the customers who joined the company after January 1, 2020, and are either from the United States or have an email address on the '@nagarro.com' domain. The LIKE operator matches text against a pattern, in this case any email that ends with '@nagarro.com'. The AND and OR operators in the WHERE clause combine the conditions; wrap the OR pair in parentheses so that the date filter applies to both alternatives.
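A minimal sketch of such a filter is shown below. The table name customers and the column names email and country are assumptions based on the sample data, and SQLite via Python's sqlite3 module is used so the query can be executed. customer_id is stored as text to keep its leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table and column names; rows follow the example input.
CREATE TABLE customers (
    customer_id TEXT,
    first_name  TEXT,
    last_name   TEXT,
    email       TEXT,
    country     TEXT,
    join_date   TEXT   -- ISO dates compare correctly as strings
);
INSERT INTO customers VALUES
    ('001', 'John',     'Doe',      'john.doe@nagarro.com',     'Germany', '2020-03-01'),
    ('002', 'Jane',     'Smith',    'jane.smith@yahoo.com',     'US',      '2019-11-11'),
    ('003', 'James',    'Johnson',  'j.johnson@hotmail.com',    'US',      '2021-01-01'),
    ('004', 'Patricia', 'Williams', 'p.williams@gmail.com',     'Canada',  '2020-06-15'),
    ('005', 'Robert',   'Brown',    'robert.brown@nagarro.com', 'England', '2021-07-01');
""")

FILTER_CUSTOMERS = """
SELECT customer_id
FROM customers
WHERE join_date > '2020-01-01'
  -- Parentheses matter: AND binds tighter than OR.
  AND (country = 'US' OR email LIKE '%@nagarro.com')
ORDER BY customer_id;
"""

matching_ids = [row[0] for row in conn.execute(FILTER_CUSTOMERS)]
print(matching_ids)
```

Customer 002 is from the US but joined in 2019, and customer 004 joined in 2020 but matches neither the country nor the email condition, so both are filtered out.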
Note: interviews at Nagarro often aren't trying to test you on a specific flavor of SQL. As such, you don't need to know exactly that EXCEPT is available in PostgreSQL and SQL Server, while MINUS is the equivalent in Oracle – you just need to know the general concept!
Your answer should mention that the EXCEPT / MINUS operator is used to return all rows from the first SELECT statement that are not returned by the second SELECT statement.
For example, in PostgreSQL you could use EXCEPT to find all of Nagarro's Facebook video ads with more than 50k views that aren't also being run on YouTube.
EXCEPT removes duplicate rows from its result. If you want to retain duplicates, PostgreSQL also offers the EXCEPT ALL operator, which keeps them.
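The Facebook versus YouTube scenario can be sketched as below. The tables facebook_ads and youtube_ads, their columns, and all rows are hypothetical. The sketch runs on SQLite through Python's sqlite3 module; SQLite supports EXCEPT but not EXCEPT ALL, so only the duplicate-removing form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables and rows, made up for illustration only.
CREATE TABLE facebook_ads (ad_id INTEGER, views INTEGER);
CREATE TABLE youtube_ads  (ad_id INTEGER);
INSERT INTO facebook_ads VALUES (1, 60000), (2, 75000), (3, 20000), (4, 90000);
INSERT INTO youtube_ads  VALUES (2), (3);
""")

FACEBOOK_ONLY = """
SELECT ad_id FROM facebook_ads WHERE views > 50000
EXCEPT
SELECT ad_id FROM youtube_ads
ORDER BY ad_id;
"""

facebook_only_ads = [row[0] for row in conn.execute(FACEBOOK_ONLY)]
print(facebook_only_ads)
```

Ads 1, 2 and 4 pass the views filter, and ad 2 is dropped because it also appears in the YouTube table, leaving ads 1 and 4.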
You are given a table 'projects' in Nagarro's database. Each row of the table represents a project carried out by the company. The table has the following columns: project_id, start_date, end_date. Can you write a SQL query that gives the average duration (in days) of all projects undertaken by Nagarro in the year 2020?
project_id | start_date | end_date |
---|---|---|
1 | 2020-01-05 | 2020-03-10 |
2 | 2020-07-01 | 2020-09-30 |
3 | 2020-04-03 | 2020-05-12 |
4 | 2020-12-19 | 2021-01-10 |
5 | 2019-11-23 | 2019-12-02 |
avg_duration_days |
---|
78 |
The query computes each project's duration as the difference between end_date and start_date, expressed in days. It keeps only the projects for the year 2020 using a WHERE clause, and then calculates the average of these durations using the AVG function. The output is the average project duration for Nagarro in 2020. Please note that this query assumes start_date and end_date are stored as date types.
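A runnable sketch of this calculation is given below, with two loudly stated assumptions: "a 2020 project" is taken to mean a project whose start_date falls in 2020, and SQLite's julianday function (via Python's sqlite3 module) stands in for PostgreSQL date subtraction. Under this reading the four qualifying sample projects last 65, 91, 39 and 22 days, which averages to 54.25 days; a stricter filter that also requires end_date to fall in 2020 would give 65 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project_id INTEGER, start_date TEXT, end_date TEXT);
INSERT INTO projects VALUES
    (1, '2020-01-05', '2020-03-10'),
    (2, '2020-07-01', '2020-09-30'),
    (3, '2020-04-03', '2020-05-12'),
    (4, '2020-12-19', '2021-01-10'),
    (5, '2019-11-23', '2019-12-02');
""")

AVG_DURATION = """
SELECT AVG(julianday(end_date) - julianday(start_date)) AS avg_duration_days
FROM projects
-- Assumption: a project belongs to 2020 if it started in 2020.
WHERE start_date >= '2020-01-01'
  AND start_date <  '2021-01-01';
"""

avg_duration_days = conn.execute(AVG_DURATION).fetchone()[0]
print(avg_duration_days)  # 54.25 under the start-date assumption
```

The half-open date range is used instead of extracting the year so that an index on start_date, if one exists, can still be used.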
To practice a very similar question, try the interactive Facebook Average Post Hiatus (Part 1) Question, which also calculates the duration between dates, or the Facebook Active User Retention Question, which involves similar date-based analysis.
Nagarro, being a custom software development and services company, provides many software services and solutions to their clients. Let's assume that we want to find the product that has been sold the most for each month.
Consider a scenario where we have a sales table in the database that keeps track of all software products sold, the client who bought them, and the date of purchase.
sale_id | client_id | purchase_date | product_id | quantity |
---|---|---|---|---|
101 | 567 | 06/05/2022 | A01 | 5 |
102 | 672 | 06/14/2022 | A03 | 10 |
103 | 890 | 06/30/2022 | A01 | 7 |
104 | 567 | 07/11/2022 | A02 | 9 |
105 | 890 | 07/16/2022 | A01 | 6 |
The task is to write an SQL query to find the software product that has been sold the most for each month. The output should provide the month, product_id, and sum of quantity sold.
mth | product_id | maximum_sold |
---|---|---|
6 | A01 | 12 |
7 | A02 | 9 |
This SQL query uses a date function (EXTRACT in PostgreSQL, for example) to get the month from each sale's purchase date. It then groups by both the extracted month and product_id to get the total quantity sold for each product in each month. The ORDER BY clause sorts the results in descending order of quantity sold.
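Grouping and sorting alone still leaves every product in the output, so one way to keep only the top seller per month is to compare each monthly total against that month's maximum. The sketch below is an illustration under assumptions, not the original solution: it runs on SQLite via Python's sqlite3 module, uses strftime in place of EXTRACT, and stores the sample dates in ISO format rather than MM/DD/YYYY so SQLite can parse them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    sale_id       INTEGER,
    client_id     INTEGER,
    purchase_date TEXT,    -- sample dates rewritten as ISO strings
    product_id    TEXT,
    quantity      INTEGER
);
INSERT INTO sales VALUES
    (101, 567, '2022-06-05', 'A01', 5),
    (102, 672, '2022-06-14', 'A03', 10),
    (103, 890, '2022-06-30', 'A01', 7),
    (104, 567, '2022-07-11', 'A02', 9),
    (105, 890, '2022-07-16', 'A01', 6);
""")

TOP_PRODUCT_PER_MONTH = """
WITH monthly AS (
    -- Total quantity per product per month.
    SELECT CAST(strftime('%m', purchase_date) AS INTEGER) AS mth,
           product_id,
           SUM(quantity) AS total_sold
    FROM sales
    GROUP BY mth, product_id
)
SELECT m.mth, m.product_id, m.total_sold AS maximum_sold
FROM monthly AS m
-- Keep only the rows that equal the best total of their month.
WHERE m.total_sold = (SELECT MAX(total_sold) FROM monthly WHERE mth = m.mth)
ORDER BY m.mth, m.product_id;
"""

top_rows = conn.execute(TOP_PRODUCT_PER_MONTH).fetchall()
print(top_rows)
```

If two products tie for the highest total in a month, both rows are returned. A window function such as RANK partitioned by month would be an alternative way to express the same idea.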
While both types of indexes improve the performance of SQL queries by providing a faster way to lookup rows of data, a unique index enforces the uniqueness of the indexed columns while a non-unique index allows duplicate values in the indexed columns.
Suppose you had a table of Nagarro employees. A unique index could be created on the employee ID column.
This index would ensure that no two Nagarro employees have the same employee ID, which could then be used as a unique identifier for each employee.
A non-unique index could be created on the job title column.
This index would not enforce uniqueness, but it could be used to improve the performance of queries that filter or sort the data based on the job title column. For example, if you want to quickly retrieve all Data Scientists, the database can use the index to efficiently locate and retrieve the desired records without having to do a full table scan on all Nagarro employees.
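The difference in behavior can be seen in a short sketch. The table nagarro_employees, its columns, the index names and the rows are all hypothetical, and SQLite (through Python's sqlite3 module) is used so the statements can be run; CREATE UNIQUE INDEX and CREATE INDEX have the same basic form in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table, index names and rows.
CREATE TABLE nagarro_employees (employee_id INTEGER, name TEXT, job_title TEXT);

-- Unique index: rejects a second row with the same employee_id.
CREATE UNIQUE INDEX idx_employee_id ON nagarro_employees (employee_id);

-- Non-unique index: allows repeated job titles, speeds up lookups by title.
CREATE INDEX idx_job_title ON nagarro_employees (job_title);

INSERT INTO nagarro_employees VALUES
    (1, 'Ana', 'Data Scientist'),
    (2, 'Raj', 'Data Scientist');
""")

# Reusing employee_id 1 violates the unique index.
try:
    conn.execute("INSERT INTO nagarro_employees VALUES (1, 'Li', 'Data Engineer')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Duplicate job titles were accepted, and this filter can use idx_job_title.
data_scientists = conn.execute("""
    SELECT name FROM nagarro_employees
    WHERE job_title = 'Data Scientist'
    ORDER BY name
""").fetchall()

print(duplicate_rejected, data_scientists)
```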
Given a table of customers, your task is to write a SQL query that fetches all the records of customers whose email addresses end with "@nagarro.com".
customer_id | first_name | last_name | email_address |
---|---|---|---|
09876 | John | Doe | john.doe@nagarro.com |
87654 | Sally | Mae | sally.mae@yahoo.com |
56789 | Sam | Smith | sam.smith@nagarro.com |
45678 | Alex | Johnson | alex.johnson@gmail.com |
34567 | Emma | Lee | emma.lee@nagarro.com |
You can use the LIKE keyword in SQL to filter records against a specific pattern. To get the required results in PostgreSQL, filter the email_address column with LIKE '%@nagarro.com'.
This query returns all the records where the email_address column ends with "@nagarro.com". The percent sign (%) is a wildcard character that matches any sequence of characters, including an empty one.
customer_id | first_name | last_name | email_address |
---|---|---|---|
09876 | John | Doe | john.doe@nagarro.com |
56789 | Sam | Smith | sam.smith@nagarro.com |
34567 | Emma | Lee | emma.lee@nagarro.com |
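The pattern match can be checked against the sample rows with the sketch below. The table name customers is an assumption (the source does not name the table), and SQLite via Python's sqlite3 module is used so the query can be run. Note that SQLite's LIKE is case-insensitive for ASCII characters by default, whereas PostgreSQL's LIKE is case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table name; customer_id is text to keep the leading zero.
CREATE TABLE customers (
    customer_id   TEXT,
    first_name    TEXT,
    last_name     TEXT,
    email_address TEXT
);
INSERT INTO customers VALUES
    ('09876', 'John',  'Doe',     'john.doe@nagarro.com'),
    ('87654', 'Sally', 'Mae',     'sally.mae@yahoo.com'),
    ('56789', 'Sam',   'Smith',   'sam.smith@nagarro.com'),
    ('45678', 'Alex',  'Johnson', 'alex.johnson@gmail.com'),
    ('34567', 'Emma',  'Lee',     'emma.lee@nagarro.com');
""")

NAGARRO_EMAILS = """
SELECT customer_id, first_name, last_name, email_address
FROM customers
-- No trailing %, so the address must end with the domain.
WHERE email_address LIKE '%@nagarro.com'
ORDER BY customer_id;
"""

nagarro_customers = conn.execute(NAGARRO_EMAILS).fetchall()
for row in nagarro_customers:
    print(row)
```

The rows come back sorted by customer_id here; without an ORDER BY clause the order of the result is not guaranteed.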
The best way to prepare for a Nagarro SQL interview is to practice, practice, practice. Beyond just solving the earlier Nagarro SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Google, Facebook, Microsoft and Amazon.
Each SQL question has hints to guide you, step-by-step solutions and most importantly, there's an interactive SQL code editor so you can instantly run your SQL query answer and have it checked.
To prep for the Nagarro SQL interview, it is also helpful to solve SQL questions from other tech companies.
However, if your SQL coding skills are weak, forget about jumping right into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers things like UNION vs. joins and Self-Joins – both of which come up often in SQL interviews at Nagarro.
Beyond writing SQL queries, the other types of questions covered in the Nagarro Data Science Interview are:
I'm sorta biased, but I think the optimal way to prep for Nagarro Data Science interviews is to read my book Ace the Data Science Interview.
The book has 201 data interview questions taken from companies like Microsoft, Google & Amazon. It also has a refresher covering Stats, SQL & ML. And finally it's vouched for by the data community, which is why it's earned 1000+ reviews on Amazon and a 4.5-star rating.