10 PennyMac SQL Interview Questions (Updated 2024)

Updated on June 30, 2024

At PennyMac Financial Services, SQL is used for analyzing loan portfolio performance and managing customer databases. For this reason, PennyMac LOVES to ask SQL query questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.

Thus, to help you study, we've curated 10 PennyMac Financial Services SQL interview questions – can you solve them?

10 PennyMac Financial Services SQL Interview Questions

SQL Question 1: Find the Loan Amount Statistics for Each Loan Officer

At PennyMac, a mortgage company, multiple loan officers handle loans. For each loan officer, we want to analyze the total loan amount they handled and the average amount per loan they managed. Each loan has a unique id, a loan amount, and a loan officer in charge.

Write a SQL query to produce the following output:

  • Loan Officer,
  • Total loan amount managed,
  • Average loan amount per loan,
  • Rank of the loan officer based on the total loan amount (1 being the highest)

Order your results based on the rank in ascending order.

Example Input:

loan_id | loan_officer_id | loan_amount
101 | 1 | 250000
102 | 2 | 300000
103 | 1 | 500000
104 | 3 | 400000
105 | 2 | 200000

Example Output:

loan_officer_id | total_loan_amount | average_loan_amount | rank
1 | 750000 | 375000 | 1
2 | 500000 | 250000 | 2
3 | 400000 | 400000 | 3

Answer:
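
Here is a minimal sketch of one way to write this query. It runs through Python's built-in sqlite3 module (SQLite 3.25+ is needed for window functions) so you can execute it as-is; the SQL itself should carry over to PostgreSQL. The table name loans and the alias officer_rank are assumptions, since the question only shows the columns.

```python
import sqlite3

# Assumption: the table is named `loans`; the question only shows its columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loans (loan_id INTEGER PRIMARY KEY, "
    "loan_officer_id INTEGER, loan_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [(101, 1, 250000), (102, 2, 300000), (103, 1, 500000),
     (104, 3, 400000), (105, 2, 200000)],
)

query = """
WITH officer_totals AS (
    -- one row per loan officer with the total and average loan amount
    SELECT loan_officer_id,
           SUM(loan_amount) AS total_loan_amount,
           AVG(loan_amount) AS average_loan_amount
    FROM loans
    GROUP BY loan_officer_id
)
SELECT loan_officer_id,
       total_loan_amount,
       average_loan_amount,
       -- officer_rank is an illustrative alias for the rank column
       RANK() OVER (ORDER BY total_loan_amount DESC) AS officer_rank
FROM officer_totals
ORDER BY officer_rank;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```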


In this query, we group the data by loan officer id and calculate the total and average loan amount for each loan officer. We then use the RANK window function to rank the officers by total loan amount in descending order. The result is sorted by rank in ascending order.

To practice another window function question on DataLemur's free interactive SQL code editor, try this Amazon BI Engineer interview question: Amazon SQL Interview Question

SQL Question 2: Top Department Salaries

Imagine there was a table of PennyMac employee salary data. Write a SQL query to find the top 3 highest paid employees in each department.

PennyMac Example Input:

employee_id | name | salary | department_id
1 | Emma Thompson | 3800 | 1
2 | Daniel Rodriguez | 2230 | 1
3 | Olivia Smith | 2000 | 1
4 | Noah Johnson | 6800 | 2
5 | Sophia Martinez | 1750 | 1
8 | William Davis | 6800 | 2
10 | James Anderson | 4000 | 1

Example Input:

department_id | department_name
1 | Data Analytics
2 | Data Science

Example Output:

department_name | name | salary
Data Analytics | James Anderson | 4000
Data Analytics | Emma Thompson | 3800
Data Analytics | Daniel Rodriguez | 2230
Data Science | Noah Johnson | 6800
Data Science | William Davis | 6800

You can solve this question and run your code right in DataLemur's online SQL environment:

Top 3 Department Salaries

Answer:

We use the DENSE_RANK() window function to rank each employee's salary within their department, with higher salaries receiving lower rank numbers and tied salaries sharing the same rank. Then, we wrap this up in a CTE and keep only the employees with a rank of 3 or lower.
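
A minimal sketch of that approach is below, run through Python's built-in sqlite3 module so it is executable as-is (the query should also work in PostgreSQL). The table names employee and department and the alias salary_rank are assumptions; only the column names come from the example data.

```python
import sqlite3

# Assumption: the tables are named `employee` and `department`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT, "
    "salary INTEGER, department_id INTEGER)"
)
conn.execute(
    "CREATE TABLE department (department_id INTEGER PRIMARY KEY, "
    "department_name TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Emma Thompson", 3800, 1), (2, "Daniel Rodriguez", 2230, 1),
     (3, "Olivia Smith", 2000, 1), (4, "Noah Johnson", 6800, 2),
     (5, "Sophia Martinez", 1750, 1), (8, "William Davis", 6800, 2),
     (10, "James Anderson", 4000, 1)],
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(1, "Data Analytics"), (2, "Data Science")],
)

query = """
WITH ranked AS (
    -- rank salaries inside each department; ties share a rank
    SELECT d.department_name,
           e.name,
           e.salary,
           DENSE_RANK() OVER (
               PARTITION BY e.department_id
               ORDER BY e.salary DESC
           ) AS salary_rank
    FROM employee AS e
    JOIN department AS d ON d.department_id = e.department_id
)
SELECT department_name, name, salary
FROM ranked
WHERE salary_rank <= 3
ORDER BY department_name, salary DESC, name;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```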


If the code above is hard to understand, you can find a step-by-step solution with hints here: Top 3 Department Salaries.

SQL Question 3: What's a database view?

Views are a lot like virtual tables: you take a base table and customize what it exposes (for example, by hiding some data from non-admin users, or by leaving out certain columns or rows based on business requirements).

Here's the PostgreSQL syntax for creating a view based on data in a table:
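
As a hedged illustration, the sketch below creates a view that hides a sensitive column. The customers table, its columns, and the customers_public view are all hypothetical names made up for this example. The CREATE VIEW statement is plain SQL that PostgreSQL accepts too; it is run here through Python's built-in sqlite3 module so it can be executed as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table with a column we do not want everyone to see.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, "
    "first_name TEXT, ssn TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alice", "000-00-0001"), (2, "John", "000-00-0002")],
)

# The view stores the query, not a copy of the data.
conn.execute("""
CREATE VIEW customers_public AS
SELECT customer_id, first_name
FROM customers;
""")

cursor = conn.execute("SELECT * FROM customers_public ORDER BY customer_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Querying customers_public works like querying a table, but the ssn column is not exposed.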


SQL Question 4: Mortgages Management

PennyMac is a leading mortgage loan company. It provides a number of services to its users. Suppose you are tasked with maintaining the database that stores information about the mortgages for its users. The database should keep track of who took out the mortgage, mortgage details such as loan number, amount, start date and end date, property details, and payment records.

Create suitable tables for this scenario and write a SQL query to find out which user has taken out the most loans with PennyMac.

Example Input:

user_id | first_name | last_name
123 | Alice | Brown
456 | John | Smith

Example Input:

mortgage_id | user_id | loan_number | loan_amount | start_date | end_date
1 | 123 | 0001 | 100000 | 2022-01-01 | 2032-01-01
2 | 456 | 0002 | 150000 | 2022-01-01 | 2032-01-01
3 | 123 | 0003 | 200000 | 2022-06-01 | 2032-06-01

Example Input:

property_id | mortgage_id | address | property_value
1 | 1 | 123 Main St | 100000
2 | 2 | 456 Main St | 150000
3 | 3 | 789 Main St | 200000

Example Input:

payment_id | mortgage_id | payment_date | payment_amount
1 | 1 | 2022-01-01 | 1000
2 | 1 | 2022-02-01 | 1000
3 | 2 | 2022-01-01 | 1500
4 | 2 | 2022-02-01 | 1500
5 | 3 | 2022-06-01 | 2000
6 | 3 | 2022-07-01 | 2000

Answer:
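
One possible schema and query are sketched below, run through Python's built-in sqlite3 module so the example is executable as-is. The table names users, mortgages, properties, and payments and the alias loan_count are assumptions; the columns follow the example data. In PostgreSQL you would likely keep the same structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed table names; columns mirror the example input.
conn.executescript("""
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
CREATE TABLE mortgages (
    mortgage_id INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    loan_number TEXT NOT NULL,   -- text keeps leading zeros such as 0001
    loan_amount INTEGER NOT NULL,
    start_date  DATE NOT NULL,
    end_date    DATE NOT NULL
);
CREATE TABLE properties (
    property_id    INTEGER PRIMARY KEY,
    mortgage_id    INTEGER NOT NULL REFERENCES mortgages(mortgage_id),
    address        TEXT NOT NULL,
    property_value INTEGER NOT NULL
);
CREATE TABLE payments (
    payment_id     INTEGER PRIMARY KEY,
    mortgage_id    INTEGER NOT NULL REFERENCES mortgages(mortgage_id),
    payment_date   DATE NOT NULL,
    payment_amount INTEGER NOT NULL
);
""")

conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(123, "Alice", "Brown"), (456, "John", "Smith")],
)
conn.executemany(
    "INSERT INTO mortgages VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 123, "0001", 100000, "2022-01-01", "2032-01-01"),
     (2, 456, "0002", 150000, "2022-01-01", "2032-01-01"),
     (3, 123, "0003", 200000, "2022-06-01", "2032-06-01")],
)

query = """
SELECT u.user_id,
       u.first_name,
       u.last_name,
       COUNT(m.mortgage_id) AS loan_count
FROM users AS u
JOIN mortgages AS m ON m.user_id = u.user_id
GROUP BY u.user_id, u.first_name, u.last_name
ORDER BY loan_count DESC
LIMIT 1;
"""

top_borrower = conn.execute(query).fetchone()
print(top_borrower)
```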


In this query, we first join the users and mortgages tables on the user_id field. We then group the results by user_id, first_name, and last_name and count the number of mortgages each user has taken out. We order the results by the count in descending order and limit the result to the top 1 to find the user who has taken out the most loans.

SQL Question 5: What's the difference between the RANK() and DENSE_RANK() window functions?

In SQL, both RANK() and DENSE_RANK() are used to rank rows within a result set. The key difference between the two functions is how they deal with two or more rows having the same value in the ranked column (aka how they break ties).

RANK(): When there's a tie, RANK() leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to all three rows, skip ranks 3 and 4, and assign a rank of 5 to the next row.

DENSE_RANK(): For ties, DENSE_RANK() does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the DENSE_RANK() function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.
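
To see the two functions side by side, here is a small sketch using Python's built-in sqlite3 module (SQLite 3.25+). The scores table and its values are hypothetical, chosen so that three rows tie for 2nd place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: three rows tie on 90.
conn.execute("CREATE TABLE scores (score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(100,), (90,), (90,), (90,), (80,)],
)

query = """
SELECT score,
       RANK()       OVER (ORDER BY score DESC) AS rank_with_gaps,
       DENSE_RANK() OVER (ORDER BY score DESC) AS rank_without_gaps
FROM scores
ORDER BY score DESC;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The tied rows get the same value from both functions; the difference only shows up on the row after the tie, where RANK() jumps ahead and DENSE_RANK() continues with the next integer.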

SQL Question 6: Finding the Average Loan Amount

PennyMac, a finance company, would likely deal with loan-related data. Thus, you might be asked a question such as: "What is the Average Loan Amount taken by Customers every quarter?"

Example Input:

loan_id | customer_id | loan_date | loan_amount
001 | 101 | 01/04/2022 | 150000
002 | 102 | 03/02/2022 | 235000
003 | 101 | 04/05/2022 | 145000
004 | 103 | 05/22/2022 | 198000
005 | 102 | 05/25/2022 | 245000
006 | 101 | 06/06/2022 | 175000

We expect to calculate the average loan amount per quarter of the year.

Example Output:

quarter_of_year | avg_loan_amount
Q1 | 192500
Q2 | 190750

Answer:
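
A minimal sketch of one such query is below, run through Python's built-in sqlite3 module so it is executable as-is. The table name loans is an assumption, and the sample dates are stored as ISO strings (YYYY-MM-DD) because SQLite has no native date type. SQLite's strftime is used to pull out the month; in PostgreSQL you could use EXTRACT(QUARTER FROM loan_date) instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: the table is named `loans`; dates are rewritten as ISO text.
conn.execute(
    "CREATE TABLE loans (loan_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "loan_date TEXT, loan_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?, ?)",
    [(1, 101, "2022-01-04", 150000), (2, 102, "2022-03-02", 235000),
     (3, 101, "2022-04-05", 145000), (4, 103, "2022-05-22", 198000),
     (5, 102, "2022-05-25", 245000), (6, 101, "2022-06-06", 175000)],
)

query = """
SELECT 'Q' || ((CAST(strftime('%m', loan_date) AS INTEGER) + 2) / 3)
           AS quarter_of_year,          -- months 1-3 -> Q1, 4-6 -> Q2, ...
       AVG(loan_amount) AS avg_loan_amount
FROM loans
GROUP BY quarter_of_year
ORDER BY quarter_of_year;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```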


This SQL query first assigns every loan date to a quarter of the year by extracting the month from the loan date. It then calculates the average loan amount for each quarter using the AVG aggregate function. GROUP BY groups the results by quarter, and ORDER BY sorts the resulting rows in ascending order of the quarter of the year.

To practice a very similar question, try this interactive Amazon Average Review Ratings Question, which also requires aggregating data over a time period, or this Alibaba Compressed Mean Question, which also focuses on calculating averages.

SQL Question 7: What is the purpose of a primary key in a database?

A primary key is a special type of column or set of columns in a database table that is used to identify each row in the table. It helps to ensure that all the rows are distinct and prevents null values. Additionally, primary keys play a crucial role in maintaining the accuracy of the data by making sure that there are no duplicate rows. They are also used to establish relationships between tables.

To create a primary key in a SQL database, you can use the PRIMARY KEY constraint in the table definition. For example, a customers table might declare its customer ID column as the primary key.
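
As a quick illustration, the sketch below declares a primary key and shows a duplicate value being rejected. The customers table and its columns are hypothetical names for this example. It runs through Python's built-in sqlite3 module; the PRIMARY KEY syntax is the same in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: customer_id is declared as the primary key.
conn.execute("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'Alice Brown')")

# A second row with the same key violates the constraint.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'John Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(duplicate_rejected, row_count)
```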


SQL Question 8: Click-through-rates for PennyMac Ad Campaigns

You are a data analyst at PennyMac, a company providing home loans and refinancing services. The marketing department runs several digital ad campaigns to attract potential customers. You are tasked with analysing the click-through-rates of these campaigns.

You have the data in two tables - ad_campaigns and ad_clicks.

The ad_campaigns table contains information about each campaign:

Example Input:

campaign_id | campaign_start_date | campaign_end_date | total_impressions
1 | 01/01/2022 | 01/31/2022 | 10000
2 | 02/01/2022 | 02/28/2022 | 12000
3 | 03/01/2022 | 03/31/2022 | 15000

The ad_clicks table contains data of each click event:

Example Input:

click_id | campaign_id | click_date
1 | 1 | 01/01/2022
2 | 1 | 01/04/2022
3 | 1 | 01/10/2022
4 | 2 | 02/05/2022
5 | 2 | 02/20/2022
6 | 3 | 03/03/2022
7 | 3 | 03/10/2022
8 | 3 | 03/25/2022

The click-through-rate (CTR) is calculated as the total number of clicks that your ads receive divided by the total number of impressions (views) that your ads get.

Calculate the click-through-rate (CTR) for each campaign.

Answer:
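
A minimal sketch of one way to compute this is below, run through Python's built-in sqlite3 module so it is executable as-is. The table and column names come from the question; the aliases total_clicks and ctr_percent, the LEFT JOIN (so campaigns with no clicks still appear), the rounding to four decimals, and the sort on campaign_id are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute(
    "CREATE TABLE ad_campaigns (campaign_id INTEGER PRIMARY KEY, "
    "campaign_start_date TEXT, campaign_end_date TEXT, "
    "total_impressions INTEGER)"
)
conn.execute(
    "CREATE TABLE ad_clicks (click_id INTEGER PRIMARY KEY, "
    "campaign_id INTEGER, click_date TEXT)"
)
conn.executemany(
    "INSERT INTO ad_campaigns VALUES (?, ?, ?, ?)",
    [(1, "2022-01-01", "2022-01-31", 10000),
     (2, "2022-02-01", "2022-02-28", 12000),
     (3, "2022-03-01", "2022-03-31", 15000)],
)
conn.executemany(
    "INSERT INTO ad_clicks VALUES (?, ?, ?)",
    [(1, 1, "2022-01-01"), (2, 1, "2022-01-04"), (3, 1, "2022-01-10"),
     (4, 2, "2022-02-05"), (5, 2, "2022-02-20"),
     (6, 3, "2022-03-03"), (7, 3, "2022-03-10"), (8, 3, "2022-03-25")],
)

query = """
SELECT c.campaign_id,
       COUNT(k.click_id) AS total_clicks,
       -- 100.0 forces decimal division instead of integer division
       ROUND(100.0 * COUNT(k.click_id) / c.total_impressions, 4)
           AS ctr_percent
FROM ad_campaigns AS c
LEFT JOIN ad_clicks AS k ON k.campaign_id = c.campaign_id
GROUP BY c.campaign_id, c.total_impressions
ORDER BY c.campaign_id;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```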


This query joins ad_campaigns and ad_clicks on campaign_id and calculates the CTR for each campaign. It counts the number of clicks for each campaign and divides it by the number of impressions recorded for the campaign. The result is multiplied by 100 to obtain the CTR as a percentage, and the output is returned in sorted order.

To practice a related problem on DataLemur's free interactive coding environment, try this Meta SQL interview question: Facebook Click-through-rate SQL Question

Check out PennyMac's career page and see what type of role would be the best fit for you!

SQL Question 9: Loan Applications Filter with LIKE Keyword

As a part of the SQL Analyst team at PennyMac, you have been provided with a database of customer records. Your task is to write a SQL query that allows you to filter down the customer loan applications and find records that contain specific text in the 'description' field. For this task, filter for any record that contains the word 'refinance' in its description.

Example Input:

app_id | customer_id | application_date | loan_amt | description
1 | 1001 | 06/08/2022 | $300,000 | Refinance existing mortgage
2 | 1002 | 06/10/2022 | $500,000 | New home loan application
3 | 1003 | 06/18/2022 | $350,000 | Refinance mortgage for lower rate
4 | 1004 | 07/26/2022 | $250,000 | Looking to refinance home
5 | 1005 | 07/05/2022 | $450,000 | Applying for first-time home loan

Answer:
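
A minimal sketch of such a filter is below, run through Python's built-in sqlite3 module so it is executable as-is. It lowercases the description before matching so the result does not depend on how a given database treats case in LIKE; that choice, and storing loan_amt as text, are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.execute(
    "CREATE TABLE loan_applications (app_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER, application_date TEXT, loan_amt TEXT, "
    "description TEXT)"
)
conn.executemany(
    "INSERT INTO loan_applications VALUES (?, ?, ?, ?, ?)",
    [(1, 1001, "2022-06-08", "$300,000", "Refinance existing mortgage"),
     (2, 1002, "2022-06-10", "$500,000", "New home loan application"),
     (3, 1003, "2022-06-18", "$350,000", "Refinance mortgage for lower rate"),
     (4, 1004, "2022-07-26", "$250,000", "Looking to refinance home"),
     (5, 1005, "2022-07-05", "$450,000", "Applying for first-time home loan")],
)

query = """
SELECT app_id, customer_id, description
FROM loan_applications
WHERE LOWER(description) LIKE '%refinance%'   -- % matches any characters
ORDER BY app_id;
"""

rows = conn.execute(query).fetchall()
matching_app_ids = [row[0] for row in rows]
print(matching_app_ids)
```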


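This SQL query will filter down the data in the 'loan_applications' table and return any record where the description contains the word 'refinance'. In the provided sample input, the query would return the records for 1, 3, and 4. The '%' symbol in the LIKE clause operates as a wildcard that matches any sequence of characters. Thus, 'refinance' could appear anywhere in the 'description' field. Note that LIKE is case-sensitive in PostgreSQL, so a pattern of '%refinance%' on its own would miss 'Refinance existing mortgage'; to return records 1, 3, and 4 the match must be case-insensitive, for example with ILIKE or by comparing against LOWER(description).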

SQL Question 10: What does the DISTINCT keyword do?

The DISTINCT clause in SQL allows you to select records that are unique, eliminating duplicates.

For example, if you had a table of PennyMac employees, and wanted to see all the unique job titles currently held at the company, you could write the following query:
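
Here is a small sketch of that idea, run through Python's built-in sqlite3 module so it is executable as-is. The employees table, its job_title column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical employee data with a repeated job title.
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, job_title TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "Data Analyst"), (2, "Loan Officer"),
     (3, "Data Analyst"), (4, "Data Engineer")],
)

query = """
SELECT DISTINCT job_title
FROM employees
ORDER BY job_title;
"""

job_titles = [row[0] for row in conn.execute(query)]
print(job_titles)
```

Even though "Data Analyst" appears twice in the table, it shows up once in the result.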


PennyMac SQL Interview Tips

The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the earlier PennyMac SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Netflix, Airbnb, and Amazon.

Each DataLemur SQL question has hints to guide you and step-by-step solutions, and best of all, there is an online SQL coding environment so you can run your SQL query right in the browser and have it graded.

To prep for the PennyMac SQL interview, it can also be helpful to solve SQL problems from other mortgage & loan companies.

But if your SQL coding skills are weak, forget about jumping right into solving questions – go learn SQL with this SQL tutorial for Data Analytics.

This tutorial covers SQL concepts such as using LIKE and aggregate functions like SUM()/COUNT()/AVG() – both of these pop up frequently during SQL job interviews at PennyMac.

PennyMac Financial Services Data Science Interview Tips

What Do PennyMac Data Science Interviews Cover?

In addition to SQL query questions, the other types of problems tested in the PennyMac Data Science Interview include:

  • Probability & Statistics Questions
  • Python or R Programming Questions
  • Product Analytics Questions
  • Machine Learning Questions
  • Resume-Based Behavioral Questions

How To Prepare for PennyMac Data Science Interviews?

To prepare for PennyMac Data Science interviews, read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from tech companies like Netflix, Google, & Airbnb
  • a crash course covering Stats, SQL & ML
  • 1000+ 5-star reviews on Amazon

Ace the Data Science Interview Book on Amazon