Data Science, Data Engineering and Data Analytics employees at KKR write SQL queries as a big part of their job, using SQL to analyze investment data trends and manage financial databases. For this reason, KKR often asks jobseekers SQL interview questions.
To help you prepare for the KKR SQL interview, we've curated 10 KKR SQL interview questions in this blog.
KKR is an investment firm that handles multiple clients who engage in various investment activities. The company wants to find out who their "whale clients" are - these are clients who have made a significant amount of investments in the last year.
As an interviewee, your task is to write a SQL query that returns a list of the top 10 clients with the highest total investment amount in the past year. For the purpose of this exercise, let's work with the "transactions" table, which has the following fields:
transaction_id | client_id | transaction_date | investment_id | investment_amount |
---|---|---|---|---|
1234 | 567 | 2021/01/01 | 8888 | 1000.00 |
1235 | 890 | 2021/02/10 | 9999 | 500000.00 |
1236 | 567 | 2021/03/15 | 8888 | 2000000.00 |
1237 | 234 | 2021/02/20 | 6666 | 15000.00 |
1238 | 890 | 2021/04/05 | 7777 | 70000.00 |
Here is an SQL query in PostgreSQL that solves this:
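A sketch of one possible query, assuming the table is literally named transactions and interpreting "the past year" as the 12 months before the current date:

```sql
SELECT
  client_id,
  SUM(investment_amount) AS total_investment
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '1 year'  -- keep only the last year's transactions
GROUP BY client_id
ORDER BY total_investment DESC
LIMIT 10;  -- top 10 "whale clients"
```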
This query operates in the following way: the WHERE clause keeps only transactions from the past 12 months, GROUP BY collects each client's transactions, SUM(investment_amount) totals them per client, and ORDER BY with LIMIT 10 returns the ten clients with the highest totals.
To solve a related customer analytics question on DataLemur's free interactive SQL code editor, try this Microsoft Teams Power User SQL Interview Question:
Assume there was a table of KKR employee salary data, along with which department they belonged to. Write a query to compare the average salary of employees in each department to the company's average salary for March 2024. Return the comparison result as 'higher', 'lower', or 'same' for each department. Display the department ID, payment month (in MM-YYYY format), and the comparison.
You can solve this problem directly within the browser on DataLemur:
The answer is LONG – 30+ lines of SQL. You can find a detailed solution here: Department vs. Company Salary.
Check out the KKR career page and see what role would be the best fit with your skills!
While both RANK() and DENSE_RANK() are used to rank rows, the key difference is in how they deal with ties.
RANK(): When there's a tie, RANK() assigns the same rank to every tied row and then leaves a gap in the ranking. For example, if three rows are tied for 2nd place, RANK() assigns a rank of 2 to all three rows, and the next row after the tie gets a rank of 5.
DENSE_RANK(): For ties, DENSE_RANK() does not leave a gap in the ranking. It assigns the same rank to all tied rows, and the next distinct value simply gets the next rank. For example, if three rows are tied for 3rd place, DENSE_RANK() assigns a rank of 3 to all three rows, and then a rank of 4 to the next row.
Suppose we had data on how many deals different salespeople at KKR have closed:
To rank these salespeople, we could execute the following query:
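Assuming a hypothetical salespeople table with name and deals_closed columns, the query might look like this:

```sql
SELECT
  name,
  deals_closed,
  RANK()       OVER (ORDER BY deals_closed DESC) AS rank,
  DENSE_RANK() OVER (ORDER BY deals_closed DESC) AS dense_rank
FROM salespeople;
```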
The result of this query would be:
name | deals_closed | rank | dense_rank |
---|---|---|---|
Akash | 50 | 1 | 1 |
Brittany | 50 | 1 | 1 |
Carlos | 40 | 3 | 2 |
Dave | 40 | 3 | 2 |
Eve | 30 | 5 | 3 |
Farhad | 10 | 6 | 4 |
As you can see, RANK() gives both Akash and Brittany a rank of 1 since they are tied on 50 deals, then skips to rank 3 for Carlos and Dave. In contrast, DENSE_RANK() also gives the tied rows the same rank but leaves no gap: Carlos and Dave get rank 2, Eve gets rank 3, and Farhad gets rank 4.
KKR (Kohlberg Kravis Roberts) is a leading global investment firm. Let's say that you are a data analyst working for KKR and you have a dataset of different investments that the firm has made. Each row in the dataset represents a single investment and includes fields for the investment_id, sector (the industry of the investment), amount (the amount of money invested), and return (the return on investment).
The problem is to write a SQL query that calculates the total amount invested and the average return per sector, ordered by the total amount invested in descending order. Furthermore, for each sector, you should also display the percentage that sector's investment is of the total amount of all investments.
investment_id | sector | amount | return |
---|---|---|---|
001 | Healthcare | 100000 | 10% |
002 | Technology | 200000 | 20% |
003 | Real Estate | 300000 | 15% |
004 | Technology | 500000 | 30% |
005 | Healthcare | 300000 | 5% |
006 | Real Estate | 400000 | 10% |
sector | total_invested | avg_return | invested_pct |
---|---|---|---|
Technology | 700000 | 25% | 38.89% |
Real Estate | 700000 | 12.5% | 38.89% |
Healthcare | 400000 | 7.5% | 22.22% |
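One way to write this in PostgreSQL is sketched below; it assumes the table is named investments and that the return column holds a numeric percentage (the column is quoted since "return" can collide with SQL keywords):

```sql
SELECT
  sector,
  SUM(amount) AS total_invested,
  AVG("return") AS avg_return,
  -- grand total across all sectors via a window over the grouped result
  ROUND(100.0 * SUM(amount) / SUM(SUM(amount)) OVER (), 2) AS invested_pct
FROM investments
GROUP BY sector
ORDER BY total_invested DESC;
```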
In this query, we first group the investments by sector, then calculate the total amount invested and the average return for each sector. The window function SUM(SUM(amount)) OVER () computes the grand total across all investments, so dividing each sector's total by it gives the percentage that sector's investments make up of the whole. ROUND(..., 2) keeps that percentage to two decimal places, and the result is ordered by the total amount invested in descending order.
To solve a similar window function interview problem which uses RANK() on DataLemur's free online SQL coding environment, try this Amazon SQL question asked in a BI Engineer interview:
In database schema design, a one-to-one relationship between two entities is one where each entity is associated with only one instance of the other. For example, the relationship between a car and a license plate is one-to-one: each car has only one license plate, and each license plate belongs to exactly one car.
On the other hand, a one-to-many relationship is where one entity can be associated with multiple instances of the 2nd entity. For example, a teacher can teach multiple classes, but each class is associated with only one teacher.
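To make the two relationships concrete, here is a minimal sketch in PostgreSQL (table and column names are purely illustrative):

```sql
-- One-to-one: each car has at most one plate, and each plate belongs to exactly one car.
CREATE TABLE cars (
  car_id SERIAL PRIMARY KEY
);

CREATE TABLE license_plates (
  plate_id SERIAL PRIMARY KEY,
  car_id   INT NOT NULL UNIQUE REFERENCES cars (car_id)  -- UNIQUE enforces one plate per car
);

-- One-to-many: one teacher can teach many classes, but each class has exactly one teacher.
CREATE TABLE teachers (
  teacher_id SERIAL PRIMARY KEY
);

CREATE TABLE classes (
  class_id   SERIAL PRIMARY KEY,
  teacher_id INT NOT NULL REFERENCES teachers (teacher_id)  -- no UNIQUE, so many classes per teacher
);
```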
You are provided with a database of KKR's investment records. The data has the following structure:
investment_id | customer_id | investment_date | investment_type | amount |
---|---|---|---|---|
5421 | 132 | 12/06/2020 | Private Equity | 80000 |
7873 | 276 | 14/07/2021 | Real Estate | 45000 |
6914 | 376 | 10/05/2021 | Private Equity | 150000 |
5243 | 154 | 24/09/2020 | Private Equity | 200000 |
4698 | 982 | 01/01/2021 | Real Estate | 100000 |
KKR wants to filter down the investment records database to find all the customers who have invested more than 100,000 in 'Private Equity' after the date '01/01/2021'.
Use SQL commands to answer this question.
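One way to write this, assuming the table is named investments and that investment_date is stored as a DATE:

```sql
SELECT *
FROM investments
WHERE investment_type = 'Private Equity'
  AND amount > 100000
  AND investment_date > DATE '2021-01-01';
```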
This SQL query works by using the WHERE clause to filter rows that meet the following conditions: the investment_type is 'Private Equity', the amount is greater than 100,000, and the investment_date is after '01/01/2021'. The '*' in the SQL query denotes that we want all the columns in the result.
No, in almost all cases, and for all practical purposes, a UNION and a JOIN do NOT produce the same result.
While both are similar, in that they combine two tables, you can think of joins as increasing the width of the resulting table (you'll have more columns in the result set for a left/inner/right join), whereas a union is used to combine rows which increases the height of the result set but keeps the column count the same.
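As a quick illustration, using hypothetical clients and advisors tables that share a client_id column:

```sql
-- A JOIN widens the result: columns from both tables sit side by side.
SELECT c.client_id, c.client_name, a.advisor_name
FROM clients c
JOIN advisors a ON a.client_id = c.client_id;

-- A UNION stacks rows: both SELECTs must return the same columns,
-- and the result simply has more rows.
SELECT client_id FROM clients
UNION
SELECT client_id FROM advisors;
```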
As a Financial Analyst at KKR, you are interested in seeing the average investment made by KKR in different business sectors over the years. Write a SQL query to find the average investment amount per business sector in USD across all investments KKR has made.
investment_id | sector | investment_date | investment_amount_usd |
---|---|---|---|
101 | Tech | 01/10/2015 | 50000000 |
205 | Healthcare | 06/05/2018 | 30000000 |
302 | Retail | 03/08/2012 | 25000000 |
403 | Tech | 10/04/2020 | 75000000 |
502 | Retail | 04/09/2016 | 40000000 |
sector | avg_investment_usd |
---|---|
Tech | 62500000 |
Healthcare | 30000000 |
Retail | 32500000 |
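Assuming the table is named investments as in the example, the query could be:

```sql
SELECT
  sector,
  AVG(investment_amount_usd) AS avg_investment_usd
FROM investments
GROUP BY sector;
```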
This query calculates the average investment amount in USD by sector. The investment data is grouped by the 'sector' column and the average is calculated for each group. This will give us the average investment in each sector.
To practice a very similar question, try this interactive Amazon Highest-Grossing Items Question, which is similar for its focus on grouping and calculating figures per group, or this Stripe Repeated Payments Question, which is similar for its requirement to aggregate data based on specific conditions.
As a company, KKR needs to frequently reach out to their database of customers for marketing campaigns. For the upcoming campaign, they want to target users who have a business email address (say this is determined by their email ending with '@business.com'). Can you provide a PostgreSQL query which filters KKR's customer database table down to the customers with business email addresses?
customer_id | first_name | last_name | email |
---|---|---|---|
1 | John | Doe | john.doe@business.com |
2 | Jane | Doe | jane.doe@business.com |
3 | Jim | Beam | jim.beam@gmail.com |
4 | Jack | Daniels | jack.daniels@business.com |
5 | Julia | Roberts | julia.roberts@gmail.com |
customer_id | first_name | last_name | email |
---|---|---|---|
1 | John | Doe | john.doe@business.com |
2 | Jane | Doe | jane.doe@business.com |
4 | Jack | Daniels | jack.daniels@business.com |
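Assuming the table is named customers, the query could look like:

```sql
SELECT *
FROM customers
WHERE email LIKE '%@business.com';
```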
This query finds all customer records in the 'customers' table where the 'email' field ends with '@business.com'. It does this using the LIKE keyword and the '%' wildcard character, which matches zero, one, or multiple characters, so '%@business.com' matches any string that ends with '@business.com'. This helps KKR filter down to their target customer list for the marketing campaign.
The UNION operator combines the results of two or more SELECT queries into a single result set. If it encounters duplicate rows, the extra copies are removed (so there's only one instance of each would-be duplicate in the result set). Here's an example of a UNION that combines all the rows from two tables (making sure each row is unique):
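For instance, with hypothetical us_deals and europe_deals tables that have the same columns:

```sql
SELECT deal_id, deal_name
FROM us_deals
UNION
SELECT deal_id, deal_name
FROM europe_deals;
```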
The UNION ALL operator is similar to UNION, but it does NOT remove duplicate rows!
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the KKR SQL interview is to solve as many practice SQL interview questions as you can! Besides solving the above KKR SQL interview questions, you should also solve the 200+ DataLemur interview questions which come from companies like Google, Facebook, Microsoft and Amazon.
Each exercise has hints to guide you, detailed solutions and, best of all, an online SQL coding environment so you can write your query right in the browser and have it checked.
To prep for the KKR SQL interview, it can also be a great idea to solve SQL questions from other investment management and private equity companies like:
However, if your SQL query skills are weak, don't jump right into solving questions – first refresh your SQL knowledge with this free SQL tutorial.
This tutorial covers SQL topics like sorting results with ORDER BY and handling dates – both of these pop up frequently during KKR SQL assessments.
Besides SQL interview questions, the other question categories tested in the KKR Data Science Interview are:
To prepare for KKR Data Science interviews read the book Ace the Data Science Interview because it's got: