At Freddie Mac, SQL is used for analyzing mortgage data trends and managing risk. For this reason, Freddie Mac typically asks SQL query questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.
To help prep you for the Freddie Mac SQL interview, here's 10 Freddie Mac SQL interview questions – scroll down to start solving them!
Freddie Mac, a public government-sponsored enterprise, deals with a vast number of loans and mortgages. You are given a dataset containing historical loan performance for every loan Freddie Mac has given in a year. Let's analyze this data using SQL window functions to gain insights on the business.
Your task is to write a SQL query that gives the total number of active loans (loan_status is "Active") by state for each quarter, and ranks the states by the number of active loans within that quarter.
loan_id | borrower_state | loan_status | loan_start_date |
---|---|---|---|
10001 | CA | Active | 2021-01-15 |
10002 | NY | Active | 2021-01-20 |
10003 | TX | Inactive | 2022-04-15 |
10004 | CA | Active | 2022-02-20 |
10005 | NY | Inactive | 2022-03-15 |
10006 | TX | Active | 2022-04-10 |
10007 | CA | Active | 2022-06-05 |
10008 | NY | Active | 2022-06-05 |
quarter | state | active_loan_count | rank |
---|---|---|---|
2021-Q1 | CA | 1 | 1 |
2021-Q1 | NY | 1 | 1 |
2022-Q1 | CA | 1 | 1 |
2022-Q1 | NY | 0 | 2 |
2022-Q2 | CA | 1 | 1 |
2022-Q2 | NY | 1 | 1 |
2022-Q2 | TX | 1 | 1 |
This query first truncates the loan_start_date to its quarter and then groups the result by that quarter and the borrower_state. In each grouping, it counts the number of active loans. It also calculates the rank of each state based on the number of active loans within the quarter. The FILTER clause restricts the count to rows where loan_status is "Active", and the RANK() window function is applied over the partitions formed by quarter.
To practice a similar window function question on DataLemur's free interactive coding environment, solve this Google SQL Interview Question:
Given a table of Freddie Mac employee salary data, write a SQL query to find employees who earn more money than their own manager.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager, William Davis, who earns $7,000.
Write a SQL query for this question and run your code right in DataLemur's online SQL environment:
First, we perform a self-join, treating one copy of the employee table as the managers' table and the other copy as the employees' table. Then we use a WHERE clause to filter the results, ensuring we only get employees whose salaries are higher than their manager's salary.
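A minimal sketch of that self-join (assuming the table is named employee, per the columns shown above):

```sql
SELECT
  emp.employee_id,
  emp.name AS employee_name
FROM employee AS emp
-- self-join: match each employee to the row for their manager
INNER JOIN employee AS mgr
  ON emp.manager_id = mgr.employee_id
WHERE emp.salary > mgr.salary;
```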
If the solution above is tough, you can find a step-by-step solution here: Highly-Paid Employees.
The HAVING clause serves as a filter for the groups created by the GROUP BY clause, similar to how the WHERE clause filters rows. However, HAVING is applied to groups rather than individual rows.
For example, say you were a data analyst at Freddie Mac trying to understand how sales differed by region:
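(The sales table and its region, sale_amount, and sale_date columns below are hypothetical names, chosen for illustration.)

```sql
SELECT
  region,
  SUM(sale_amount) AS total_sales
FROM sales
-- WHERE filters individual rows before grouping
WHERE sale_date > '2023-01-01'
GROUP BY region
-- HAVING filters the groups after aggregation
HAVING SUM(sale_amount) > 400000;
```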
This query retrieves the total sales for all products in each region, using the WHERE clause to include only sales made after January 1, 2023. The rows are then grouped by region, and the HAVING clause filters the groups to include only those with total sales greater than $400,000.
Check out this job listing for a Quantitative Risk Analysis Tech Lead at Freddie Mac and see how your skills line up to the qualifications.
Freddie Mac is involved in the secondary mortgage market which includes purchasing, securitizing, and selling residential mortgages. Let's consider a scenario where Freddie Mac wants to analyze the performance of its mortgage loan portfolios. They need a report that lists the average monthly mortgage payments for each loan type (FHA, VA, Conventional) for each year.
The input table 'loans' comprises columns such as loan_id, loan_type, monthly_payment, and payment_date.
Create an aggregated report which contains the columns: year, loan_type, and avg_monthly_payment.
The PostgreSQL query to achieve the required results would be:
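(A sketch, using the column names assumed above.)

```sql
SELECT
  EXTRACT(YEAR FROM payment_date) AS year,
  loan_type,
  AVG(monthly_payment) AS avg_monthly_payment
FROM loans
GROUP BY year, loan_type
ORDER BY year, loan_type;
```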
This query extracts the year from the payment_date column, groups by the extracted year and loan_type, and calculates the average monthly_payment for each group. The result is sorted by year and loan_type to provide an orderly view of the payment performance for each loan type, year by year.
The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.
For example, say you had stored some Facebook ad campaign data that Freddie Mac ran:
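A minimal sketch of such a table; the column names here are hypothetical:

```sql
CREATE TABLE facebook_ads (
  campaign_id INTEGER PRIMARY KEY,  -- unique identifier for each campaign
  campaign_name VARCHAR(100),
  start_date DATE,
  end_date DATE,
  budget DECIMAL(10, 2)
);
```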
The campaign_id column uniquely identifies each row in the table, and the PRIMARY KEY constraint ensures that there are no duplicate campaign_id values. This helps maintain the accuracy of the data by preventing duplicate rows. The primary key is also an important part of the table because it lets you easily identify and reference specific campaigns in your Facebook ad data. You can use it to join to other tables in the database, such as a table with data on the results of the campaigns.
As a part of Freddie Mac's ongoing operations, the company needs to identify customers who currently have active loans. An active loan is defined as a loan that has not been fully paid off and is not marked as defaulted. Write a SQL query to retrieve customer records from the customers table which have an 'Active' status in the loans table.
customer_id | first_name | last_name | date_of_birth | email |
---|---|---|---|---|
101 | John | Smith | 12/21/1970 | johnsmith@email.com |
102 | Jane | Doe | 05/29/1980 | janedoe@email.com |
103 | James | Brown | 07/03/1990 | jamesbrown@email.com |
loan_id | customer_id | loan_amount | loan_status |
---|---|---|---|
201 | 101 | 50000 | Active |
202 | 102 | 150000 | Default |
203 | 103 | 30000 | Active |
204 | 102 | 80000 | Paid Off |
205 | 101 | 20000 | Paid Off |
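One way to write this in PostgreSQL (a sketch, assuming tables named customers and loans):

```sql
-- DISTINCT prevents a customer with several active loans from appearing twice
SELECT DISTINCT
  c.customer_id,
  c.first_name,
  c.last_name
FROM customers AS c
INNER JOIN loans AS l
  ON c.customer_id = l.customer_id
WHERE l.loan_status = 'Active';
```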
This query works by joining the customers table with the loans table on the customer_id field. The WHERE clause then filters the result set to only include customers who have at least one active loan according to the loan_status field in the loans table.
A cross join is a JOIN operation in SQL that creates a new table by pairing each row from the first table with every row from the second table. It is also referred to as a cartesian join. In contrast, a natural join combines rows from two or more tables based on their common columns, forming a new table. Natural joins are called "natural" because they rely on the natural relationship between the common columns in the joined tables.
Here's an example of a cross join:
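For instance, this sketch pairs every loan product with every region (the loan_products and regions tables are made up for illustration):

```sql
-- every row of loan_products paired with every row of regions
SELECT p.product_name, r.region_name
FROM loan_products AS p
CROSS JOIN regions AS r;
```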
Here's a natural join example using two tables, Freddie Mac employees and Freddie Mac managers:
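A sketch, assuming both tables share an employee_id column:

```sql
-- joins on every column name the two tables share (here, employee_id)
SELECT *
FROM employees
NATURAL JOIN managers;
```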
This natural join returns the rows from Freddie Mac employees that have a matching row in managers based on the shared employee_id column.
One significant difference between cross joins and natural joins is that the former do not require common columns between the tables being joined, while the latter do. Another distinction is that cross joins can generate very large tables if the input tables have a large number of rows, while natural joins only produce a table with the number of rows equal to the number of matching rows in the input tables.
Let's assume Freddie Mac is running a digital marketing campaign to attract prospective homebuyers to its loan packages. The marketing team is displaying various types of ads (email, web banner, Google Ads, etc.) and tracking user interaction, i.e., how users are clicking on the ads (which represents interest) and how many are filling out the loan interest form (which represents conversion).
The task: for each ad type, calculate the click-through rate (CTR), defined as the number of users who filled the loan interest form after clicking on an ad, divided by the total number of unique ads clicked. Also, measure the performance of campaigns on a weekly basis.
ad_id | user_id | click_date | ad_type | form_filled |
---|---|---|---|---|
9051 | 1001 | 01/09/2023 | Email | Yes |
2102 | 2050 | 01/09/2023 | Web Banner | No |
7893 | 3124 | 01/09/2023 | Google Ads | Yes |
6350 | 1722 | 02/09/2023 | Email | Yes |
4157 | 1981 | 03/09/2023 | Web Banner | No |
form_filled is 'Yes' if the user filled the form after clicking the ad, else 'No'.
week | ad_type | ctr |
---|---|---|
36 | Email | 1.00 |
36 | Web Banner | 0.00 |
36 | Google Ads | 1.00 |
Let's structure the SQL query to filter unique clicks, count the conversions (form_filled) and calculate the CTR for each ad type, grouping them by week of the year:
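A sketch in PostgreSQL, assuming the clicks are stored in a table named ad_clicks with the columns shown above:

```sql
SELECT
  EXTRACT(WEEK FROM click_date) AS week,
  ad_type,
  ROUND(
    -- unique ads whose click led to a filled form...
    COUNT(DISTINCT CASE WHEN form_filled = 'Yes' THEN ad_id END)::DECIMAL
    -- ...divided by all unique ads clicked
    / COUNT(DISTINCT ad_id),
    2
  ) AS ctr
FROM ad_clicks
GROUP BY week, ad_type
ORDER BY week, ad_type;
```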
This SQL query uses PostgreSQL's EXTRACT function to derive the week of the year from the click_date. It uses a CASE WHEN expression to count the form-filled clicks. Because we take a distinct count in both the numerator and the denominator, the result is the rate of unique form-fill conversions out of unique clicks for each ad type and each week.
To practice a similar SQL interview question on DataLemur's free interactive SQL code editor, try this Meta SQL interview question:
Given a table with columns loan_id, customer_id, loan_amount, loan_date, and loan_status for Freddie Mac, a mortgage loan company, find the average loan amount provided each month in 2021 to customers who have not defaulted on their loans.
Consider that the loan_status column has two categories, 'NOT DEFAULTED' and 'DEFAULTED', and that the loan_date column is a timestamp.
loan_id | customer_id | loan_amount | loan_date | loan_status |
---|---|---|---|---|
7281 | 156 | 50000 | 2021-01-22 00:00:00 | NOT DEFAULTED |
2543 | 194 | 35000 | 2021-01-28 00:00:00 | NOT DEFAULTED |
9674 | 312 | 80000 | 2021-02-08 00:00:00 | DEFAULTED |
6382 | 231 | 60000 | 2021-02-11 00:00:00 | NOT DEFAULTED |
4991 | 436 | 45000 | 2021-02-25 00:00:00 | NOT DEFAULTED |
month | avg_loan_amount |
---|---|
1 | 42500.00 |
2 | 52500.00 |
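One way to write this in PostgreSQL (a sketch, assuming the table is named loans):

```sql
SELECT
  EXTRACT(MONTH FROM loan_date) AS month,
  AVG(loan_amount) AS avg_loan_amount
FROM loans
-- keep only non-defaulted loans issued in 2021
WHERE loan_status = 'NOT DEFAULTED'
  AND EXTRACT(YEAR FROM loan_date) = 2021
GROUP BY month
ORDER BY month;
```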
The query uses the EXTRACT function to get the month and year from the timestamp in the loan_date field. It groups the result by month and calculates the average loan amount for each group (using the AVG aggregate function). The query considers only loans that have not defaulted and whose year is 2021. It finally sorts the result by month.
A transaction is one or more SQL statements executed as a single unit: all of them take effect if the transaction commits, or none of them do if the transaction aborts.
For transactions, a DBMS is supposed to enforce the following ACID properties: Atomicity, Consistency, Isolation, and Durability.
Here's what each one means:
- Atomicity: the transaction executes as an all-or-nothing unit; it either fully commits or has no effect at all.
- Consistency: the transaction moves the database from one valid state to another, never violating its integrity constraints.
- Isolation: concurrent transactions behave as if they ran one at a time, so they never see each other's partial work.
- Durability: once a transaction commits, its changes survive crashes and power failures.
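As a quick illustration, here's a sketch of a transaction in PostgreSQL (the accounts table is hypothetical):

```sql
BEGIN;
-- move $500 between two accounts; both updates commit together or not at all
UPDATE accounts SET balance = balance - 500 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 500 WHERE account_id = 2;
COMMIT;
```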
As you can see, it's pretty important for the multiple databases where Freddie Mac stores its data to be ACID-compliant!
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Freddie Mac SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the above Freddie Mac SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Netflix, Airbnb, and Amazon.
Each SQL question has multiple hints, full answers, and crucially, there's an interactive SQL code editor so you can code up your SQL query answer right online and have it checked.
To prep for the Freddie Mac SQL interview it is also helpful to solve SQL problems from other mortgage & loan companies like:
But if your SQL foundations are weak, don't worry about going right into solving questions – refresh your SQL knowledge with this SQL interview tutorial.
This tutorial covers SQL topics like CASE/WHEN statements and window functions – both of these pop up frequently during Freddie Mac interviews.
Besides SQL interview questions, the other types of problems tested in the Freddie Mac Data Science Interview include:
The best way to prepare for Freddie Mac Data Science interviews is by reading Ace the Data Science Interview. The book's got: