At New York Life Insurance, SQL is used for extracting policyholder data, including policy coverage and claims history, for risk analysis, as well as managing structured insurance claims data, such as claim amounts and adjuster notes, for fraud detection. That is the reason why New York Life includes SQL questions in interviews for Data Analyst, Data Science, and BI jobs.
To help you study for the New York Life SQL interview, we've collected 9 New York Life Insurance SQL interview questions – scroll down to start solving them!
New York Life provides insurance services to millions of users. Each policyholder pays a premium on the policies they hold, and occasionally, policyholders submit claims to their insurance policies. The data team is interested in understanding the historical average of premiums paid and claims applied for each age group over different months.
Here's a sample representation of the and tables:
policy_id | policy_holder_id | policy_premium | policy_start_date | policy_holder_age |
---|---|---|---|---|
456721 | A123 | 500 | 02/01/2022 | 24 |
470202 | B354 | 750 | 03/10/2022 | 39 |
412393 | Z892 | 600 | 03/18/2022 | 30 |
453532 | C192 | 1000 | 04/26/2022 | 24 |
441517 | P981 | 900 | 05/05/2022 | 39 |
claim_id | policy_holder_id | claim_amount | claim_date |
---|---|---|---|
1201 | A123 | 300 | 02/15/2022 |
1302 | B354 | 500 | 03/20/2022 |
1403 | Z892 | 450 | 03/28/2022 |
1502 | A123 | 700 | 04/29/2022 |
1607 | P981 | 600 | 05/09/2022 |
We are required to calculate the average premium paid and claim applied by each age group, for each month.
This query partitions the and tables by month (using the function to ignore day and time) and . It applies the window function to calculate the average premium paid and claim applied for each group.
For more window function practice, try this Uber SQL problem within DataLemur's interactive coding environment:
You're given a table of New York Life employee and department salary information. Write a SQL query to find departments where the departments average salary is higher than the company's average salary.
Code your solution to this question interactively on DataLemur:
The solution is LONG – 30+ lines of SQL. You can find a detailed solution here: Department Salaries.
For all practical purposes, and do NOT produce the same results.
While both are similar in that they combine two tables, you can think of joins as increasing the width of the resulting table (you'll have more columns in the result set for a left/inner/right join), whereas a union is used to combine rows which increases the height of the result set but keeps the column count the same.
If your familiar with set operations, you can think of a as set addition, whereas a is set multiplication.
New York Life is a leading insurance company. They sell various types of policies to individuals and businesses. They want to manage their data more efficiently by designing a new database.
As a database designer, propose a table structure with some key tables and their relationships. The business requirements you should consider include policy information, policyholders’ data (individuals and businesses), agents data who sell these policies, and each policy's premium details per month. Implement some of the key queries that can be run on this proposed structure for demonstration purposes.
id | name | address | type |
---|---|---|---|
5671 | John Smith | 123 Main St, New York, NY | individual |
5672 | Acme Corp. | 456 Oak Ave, Albany, NY | business |
id | name | contact_number |
---|---|---|
321 | Jane Doe | 123-456-7890 |
322 | Bob Billy | 987-654-3210 |
policy_number | policy_type | policyholder_id | agent_id |
---|---|---|---|
A1001 | Life insurance | 5671 | 321 |
A1002 | Auto insurance | 5672 | 322 |
policy_number | month_year | premium_amount |
---|---|---|
A1001 | 01/2022 | $50 |
A1002 | 01/2022 | $75 |
To calculate total monthly premiums for each policy type sold by each agent:
This query joins three tables: , , and to calculate total premiums for each policy type sold by each agent. Premiums are summed up for the given month and year (in this case, January 2022). The results are grouped by agent's name and policy type. The output will display the agent's name, policy type, and total premium for each policy type sold by each agent.
In SQL, both a left and right join is used to combine rows from different tables based on a shared key or set of keys, resulting in a single merged table of data. However, here's the difference:
: Retrieves all the records/rows from the left and the matched records/rows from the right table.
: Retrieves all the records/rows from the right and the matched records/rows from the left table.
How would you find out the average length (in years) of all life insurance policies held by customers at New York Life?
Sample data for this question would be in the table.
policy_id | customer_id | start_date | end_date |
---|---|---|---|
101 | 123 | 2010-06-15 | 2020-06-15 |
102 | 456 | 2012-10-20 | 2017-10-20 |
103 | 789 | 2015-01-01 | 2025-01-01 |
104 | 456 | 2018-05-30 | 2023-05-30 |
105 | 789 | 2019-12-01 | 2024-12-01 |
avg_policy_length |
---|
7.2 |
This PostgreSQL query uses the function to find the average difference in years between the end date and start date of all policies. It will return a single number, the average policy length. Note that the 'year' argument to will extract the number of years between the two dates.
To practice a very similar question try this interactive Facebook Average Post Hiatus (Part 1) Question which is similar for calculating duration between two events or this Alibaba Compressed Mean Question which is similar for calculating mean values.
Using a join in SQL, you can retrieve data from multiple tables and merge the results into a single table.
In SQL, there are four distinct types of JOINs. To demonstrate each kind, Imagine you had two database tables: an table that contains data on Google Ads keywords and their bid amounts, and a table with information on product sales and the Google Ads keywords that drove those sales.
: An INNER JOIN retrieves rows from both tables where there is a match in the shared key or keys. For example, an INNER JOIN between the table and the table could be performed using the keyword column as the shared key. This would retrieve only the rows where the keyword in the table matches the keyword in the table.
: A LEFT JOIN retrieves all rows from the left table (in this case, the table) and any matching rows from the right table (the Sales table). If there is no match in the right table, values will be returned for the right table's columns.
: A RIGHT JOIN retrieves all rows from the right table (in this case, the Sales table) and any matching rows from the left table (the table). If there is no match in the left table, values will be returned for the left table's columns.
: A FULL OUTER JOIN retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, values will be returned for the columns of the non-matching table.
Given a table of life insurance policies sold by agents, find the total amount of policies sold by each agent.
The table has the columns , , , , and .
The table has the columns and .
policy_id | agent_id | client_id | purchase_date | amount |
---|---|---|---|---|
201 | 362 | 123 | 01/01/2021 | 50000 |
202 | 265 | 456 | 02/02/2021 | 75000 |
203 | 362 | 789 | 03/03/2021 | 150000 |
204 | 362 | 012 | 04/04/2021 | 200000 |
205 | 265 | 345 | 05/05/2021 | 100000 |
agent_id | agent_name |
---|---|
362 | John Doe |
265 | Jane Smith |
agent_name | total_policies_sold | total_amount_sold |
---|---|---|
John Doe | 3 | 400000 |
Jane Smith | 2 | 175000 |
This SQL question tests the GROUP BY clause and aggregate functions COUNT and SUM. It groups the policies_sold data by each agent and then counts the number of policies and calculates the sum of the amounts for each grouped agent.
In the below database table for New York Life Insurance company, the 'location' column has the format "city, state". You're required to retrieve all customers who are from New York, but 'New York' may appear either as city or as state in the 'location' column.
customer_id | first_name | last_name | location | policy_number |
---|---|---|---|---|
1056 | James | Brown | New York, NY | NYL001 |
2087 | Susan | Smith | Albany, NY | NYL002 |
3098 | Brian | Johnson | Buffalo, NY | NYL003 |
4189 | Emma | Jones | York, PA | NYL004 |
5370 | David | Davis | Albany, NY | NYL005 |
customer_id | first_name | last_name | location | policy_number |
---|---|---|---|---|
1056 | James | Brown | New York, NY | NYL001 |
2087 | Susan | Smith | Albany, NY | NYL002 |
3098 | Brian | Johnson | Buffalo, NY | NYL003 |
5370 | David | Davis | Albany, NY | NYL005 |
This query uses the LIKE keyword with a wildcard (%) to match any records where the location starts with 'New York, ' (indicating New York City) or ends with ', NY' (indicating any other city in the state of New York).
The best way to prepare for a New York Life SQL interview is to practice, practice, practice. Besides solving the earlier New York Life SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Microsoft, Google, Amazon, and tech startups.
Each problem on DataLemur has hints to guide you, step-by-step solutions and crucially, there is an online SQL code editor so you can right online code up your query and have it graded.
To prep for the New York Life SQL interview you can also be helpful to solve SQL questions from other insurance companies like:
Explore the future of insurance with New York Life's latest advancements in Artificial Intelligence and Machine Learning!
In case your SQL skills are weak, don't worry about jumping right into solving questions – improve your SQL foundations with this free SQL tutorial.
This tutorial covers SQL concepts such as filtering data with WHERE and finding NULLs – both of these show up often during SQL job interviews at New York Life.
Beyond writing SQL queries, the other types of problems to practice for the New York Life Data Science Interview include:
To prepare for New York Life Data Science interviews read the book Ace the Data Science Interview because it's got:
Also focus on the behavioral interview – prepare for that using this guide on behavioral interview questions.