# 9 KPMG SQL Interview Questions (Updated 2024)

Updated on

August 11, 2024

At KPMG, SQL is used to extract and process client financial data, such as revenue growth and expense patterns, as well as to analyze nuanced data trends specific to the auditing and advisory industry, like identifying anomalies in financial statements. That is why KPMG often asks SQL questions during interviews for Data Analytics, Data Science, and Data Engineering jobs.

To help you ace the KPMG SQL interview, here's 9 KPMG SQL interview questions – can you solve them?

## 9 KPMG SQL Interview Questions

### SQL Question 1: Calculate Customer Lifetime Value (CLV)

As a Data Analyst in KPMG, you're given a database of transactions that customers have made. Each transaction record contains the customer_id, date of transaction, and revenue generated from the transaction.

Your task is to calculate the Customer Lifetime Value (CLV) for each customer. CLV is the total revenue a company can realistically expect from a single customer account. It considers a customer's revenue value and compares that number to the company's predicted customer lifespan. For this task, you'll not consider churn rate.

Assume each customer has at least 1 transaction. If the customer has more than 1 transaction calculate the CLV as the cumulative sum of the revenue for each customer over time. List the top 10 customers with the highest CLV.

##### Example Input:
transaction_idcustomer_idtransaction_daterevenue
132232019-08-22250.00
245632019-11-18300.00
332232020-02-14200.00
467122020-05-20150.00
545632020-08-11250.00
667122020-10-22300.00
732232021-03-14500.00
867122021-06-10350.00
945632021-09-02400.00
1067122021-12-14250.00
##### Expected Output:
customer_idCLV
3223950.00
4563950.00
67121050.00

This PostgreSQL query calculates the cumulative sum of the revenue for each customer in the transactions table, ordered by the transaction_date. It uses a window function to create a running total of revenue for each customer_id. The query then orders the results by the CLV in descending order and limits the output to the top 10 customers with the highest CLV.

p.s. Window functions show up pretty often during SQL interviews, so practice the 27+ window function questions on DataLemur

### SQL Question 2: Top Department Salaries

Given a table of KPMG employee salary data, write a SQL query to find the top 3 highest paid employees in each department.

#### KPMG Example Input:

employee_idnamesalarydepartment_id
1Emma Thompson38001
2Daniel Rodriguez22301
3Olivia Smith20001
4Noah Johnson68002
5Sophia Martinez17501
8William Davis68002
10James Anderson40001

#### Example Input:

department_iddepartment_name
1Data Analytics
2Data Science

#### Example Output:

department_namenamesalary
Data AnalyticsJames Anderson4000
Data AnalyticsEmma Thompson3800
Data AnalyticsDaniel Rodriguez2230
Data ScienceNoah Johnson6800
Data ScienceWilliam Davis6800

Code your solution to this question interactively on DataLemur:

We use the DENSE_RANK() window function to generate unique ranks for each employee's salary within their department, with higher salaries receiving lower ranks. Then, we wrap this up in a CTE and filter the employees with a ranking of 3 or lower.

If the code above is tough, you can find a step-by-step solution here: Top 3 Department Salaries.

### SQL Question 3: What are SQL constraints, and can you give some examples?

Think of SQL constraints like the rules of a game. Just like a game needs rules to keep things fair and fun, a database needs constraints to keep things organized and accurate.

There are several types of SQL constraints like:

NOT NULL: This constraint is like a bouncer at a nightclub - it won't let anything NULL through the door.

UNIQUE: This constraint is like a VIP list - only special, one-of-a-kind values get in.

PRIMARY KEY: This constraint is like an elected official - it's made up of NOT NULL and UNIQUE values and helps identify each row in the table.

FOREIGN KEY: This constraint is like a diplomatic ambassador - it helps establish relationships between tables.

CHECK: This constraint is like a referee - it makes sure everything follows the rules.

DEFAULT: This constraint is like a backup plan - it provides a default value if no other value is specified.

So, whether you're playing a game or organizing a database, constraints are an important part of the process!

### SQL Question 4: Average Hours Billed Per Project

KPMG as a consulting firm executes various projects for different clients. Each project often consists of multiple tasks and each task is assigned to one or more employees, where each employee can log the hours they've worked. For this question, we're interested in the average number of hours billed per project. Write a SQL query which calculates the average hours billed to each project.

##### Example Input:
project_idclient_nameproject_name
100AppleMarket Analysis
50010012310
5011012658
5021013626
5031021929
50410298111
##### Example Output:
project_idproject_nameavg_hoursBilled
100Market Analysis10.00
101Competitive Research7.00

In this query, we first join the projects and tasks tables on project_id. We then group by project_id and project_name to split our data into sections for each project. Within these sections, we use the AVG function to calculate the average hours billed to each project.

To practice a very similar question try this interactive Microsoft Teams Power Users Question which is similar for calculating usage metrics or this Amazon Server Utilization Time Question which is similar for summarizing time usage.

### SQL Question 5: When would you use the constraint?

A is a field in a table that references the of another table. It creates a link between the two tables and ensures that the data in the field is valid.

For example, if you have a table of KPMG customers and an orders table, the customer_id column in the orders table could be a that references the id column (which is the primary key) in the KPMG customers table.

The constraint helps maintain the integrity of the data in the database by preventing the insertion of rows in the table that do not have corresponding entries in the table. It also enforces the relationship between the two tables and prevents data from being deleted from the table if it is still being referenced in the table.

### SQL Question 6: Analysis of Click-through Rates for KPMG's Digital Marketing Campaigns

KPMG, a global network of professional firms providing Audit, Tax and Advisory services, has recently increased its online marketing efforts. As an analyst for KPMG, your task is to analyze the click-through rates for their ongoing digital marketing campaigns. This involves calculating the rate at which users who see digital ads on various platforms (Google, Facebook, LinkedIn, etc.) end up landing on KPMG's website.

After clicking on the ad, some users might further opt for a service enquiry.

Given two tables - and , write a SQL query that calculates the proportion of total users who clicked an ad () and subsequently made a service enquiry () for each platform.

##### Example Input:
enquiry_iduser_idenquiry_date
71245008/16/2022 00:00:00
802320007/16/2022 00:00:00
493155006/30/2022 00:00:00
12670007/02/2022 00:00:00
117980005/25/2022 00:00:00

This query joins the table with the table on , considering only the cases where the is after or on the .

Then it aggregates the data by , and computes the rate of enquiries per clicks for each platform by dividing the COUNT of DISTINCT s in the by the COUNT of DISTINCT s in the , multiplied by 100 to get the percentage rate. This gives us the 'click-to-enquiry' rate for each platform.

To solve a related SQL problem on DataLemur's free online SQL code editor, try this SQL interview question asked by Facebook:

### SQL Question 7: What's a database view?

Views are a lot like virtual tables, where you can take a base table and customize it (such as by hiding some data from non-admin users, or removing some random columns/rows based on business requirements).

Here's the PostgreSQL syntax for creating a view based on data in the table:

### SQL Question 8: Maximum audit risk by sector

Given a table , where each row represents an audit conducted by KPMG, find the maximum audit risk for each sector for audits that were completed during the year 2021.

The table has the following schema:

audit_idsectorcompany_idaudit_dateaudit_risk
##### Example Input:
audit_idsectorcompany_idaudit_dateaudit_risk
641Finance1002021-02-060.76
702IT1012021-04-110.89
529Finance1022021-06-180.85
635Manufacturing1032021-08-030.91
417IT1042021-05-200.72
##### Example Output:
sectormax_audit_risk
Finance0.85
IT0.89
Manufacturing0.91

Here is the PostgreSQL query which calculates the maximum audit risk by sector for the year of 2021:

The statement is used to group the audits by sector. The clause is used to filter out audits that were not performed in the specified year (2021 in this case). And finally, is used to retrieve the highest audit risk from each sector. As you might already know, is a PostgreSQL function that returns the year part of a date as an integer.

### SQL Question 9: Finding Customers with Partial Email Domain

KPMG's Marketing Department would like to run an email campaign targeting their customers whose email providers are Gmail. They've noticed that a significant portion of their customer base is using Gmail, and they would like to build a specific marketing strategy around this. Therefore, the task is to find all the customers whose email IDs are registered with a '@gmail.com' domain.

##### Example Input:
customer_idfirst_namelast_nameemail
6171JohnDoejohn.doe@gmail.com
7802JaneSmithjane.smith@yahoo.com
5293RichardRoerichard.roe@hotmail.com
6352JamesBondjames.bond@gmail.com
4517AliceCooperalice.cooper@gmail.com
##### Example Output:
customer_idfirst_namelast_nameemail
6171JohnDoejohn.doe@gmail.com
6352JamesBondjames.bond@gmail.com
4517AliceCooperalice.cooper@gmail.com

This query uses the SQL keyword to filter rows from the customers database table. It matches customer records where the email column ends with '@gmail.com'. Therefore, it returns all customer records that are associated with a Gmail account. The percent symbol (%) is used as a wildcard character that can match any sequence of characters, and the at sign (@) and the text 'gmail.com' form the specific pattern to be matched.

### Preparing For The KPMG SQL Interview

The key to acing a KPMG SQL interview is to practice, practice, and then practice some more! Beyond just solving the above KPMG SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Microsoft, Google, Amazon, and tech startups.

Each DataLemur SQL question has hints to guide you, full answers and best of all, there's an interactive coding environment so you can right online code up your query and have it checked.

To prep for the KPMG SQL interview it is also helpful to practice SQL questions from other accounting & consulting companies like:

Find out how KPMG is using Artificial Intelligence to drive innovation, improve efficiency, and reduce costs!

However, if your SQL query skills are weak, forget about diving straight into solving questions – improve your SQL foundations with this DataLemur SQL Tutorial.

This tutorial covers SQL topics like AND/OR/NOT and CASE/WHEN/ELSE statements – both of which pop up often in SQL job interviews at KPMG.

### KPMG Data Science Interview Tips

#### What Do KPMG Data Science Interviews Cover?

Beyond writing SQL queries, the other types of questions covered in the KPMG Data Science Interview include:

#### How To Prepare for KPMG Data Science Interviews?

To prepare for KPMG Data Science interviews read the book Ace the Data Science Interview because it's got:

• 201 interview questions sourced from FAANG tech companies
• a crash course covering Stats, SQL & ML
• over 1000+ 5-star reviews on Amazon

Don't ignore the behavioral interview – prep for that with this guide on acing behavioral interviews.