At American International Group, SQL is used for analyzing risk and insurance data trends by tracking policy lapse rates and identifying emerging risks. It is also used for automating data processing tasks, such as data quality checks and data validation, for efficient decision-making, which is why AIG asks SQL questions in interviews for Data Science, Analytics, and & Data Engineering jobs.
As such, to help you ace the AIG SQL interview, we've curated 9 American International Group SQL interview questions in this blog.
The Insurance company AIG (American International Group) wants to identify their high-value clients. High-value clients are defined as clients that have purchased a high number of policies (more than 5) and have a low claim frequency (less than 0.3 times per policy).
To solve this, you're given two tables - and .
client_id | client_name | date_of_birth |
---|---|---|
101 | John Smith | 1980-01-01 |
102 | Jane Doe | 1989-03-05 |
103 | Bob Johnson | 1975-08-15 |
104 | Alice Williams | 1992-12-19 |
policy_id | client_id | policy_type | purchase_date | claims_made |
---|---|---|---|---|
20001 | 101 | Car Insurance | 2020-01-01 | 2 |
20002 | 101 | Home Insurance | 2021-02-05 | 1 |
20003 | 102 | Life Insurance | 2022-01-01 | 0 |
20004 | 104 | Travel Insurance | 2020-08-10 | 1 |
20005 | 104 | Health Insurance | 2022-02-08 | 0 |
20006 | 104 | Car Insurance | 2021-07-03 | 1 |
20007 | 101 | Travel Insurance | 2020-06-11 | 0 |
20008 | 101 | Health Insurance | 2019-04-11 | 1 |
20009 | 103 | Life Insurance | 2020-10-01 | 5 |
20010 | 101 | Life Insurance | 2021-03-08 | 1 |
You need to write a SQL query to fetch the and of the high-value clients. This can be achieved by grouping all rows by in the table and calculating the sum of policies and the average of claims.
Here is the SQL code block that accomplishes this:
This query first creates an inner query that calculates the count of policies and the average of claims for each client. This result is then joined with the clients table using to fetch the names of the resulting clients. Finally, the clause is used to filter the result to only include high-value clients.
To practice a similar customer analytics SQL question where you can solve it right in the browser and have your SQL solution instantly graded, try this Walmart Labs SQL Interview Question:
Assume you had a table of AIG employee salaries, along with which department they were in. Write a query to compare the average salary of employees in each department to the company's average salary. Return the comparison result as 'higher', 'lower', or 'same' for each department. Display the department ID, payment month (in MM-YYYY format), and the comparison.
Solve this question interactively on DataLemur:
The solution is LONG – 30+ lines of SQL. You can find a detailed solution with hints here: Department Salaries.
A constraint ensures that all values in a column are different. It is often used in conjunction with other constraints, such as NOT NULL, to ensure that the data meets certain conditions.
For example, if you had AIG employee data stored in a database, here's some constraints you'd use:
In the AIG employee example, the UNIQUE constraint is applied to the "email" field to ensure that each employee has a unique email address. This helps to ensure the integrity of the data in the database and prevents errors that could occur if two employees had the same email address.
Assume AIG wants to analyze the amount of claims by insurance type and the month in which the claim was made throughout the year. Write a SQL query to find the total claim amount for each insurance type for each month. The month should not include the year.
Assuming we have a table with the following format:
claim_id | policy_id | claim_date | claim_amt | insurance_type |
---|---|---|---|---|
1 | 101 | 2022-01-12 | 3000 | auto |
2 | 102 | 2022-01-15 | 5000 | home |
3 | 103 | 2022-02-03 | 1500 | auto |
4 | 104 | 2022-02-07 | 4500 | home |
5 | 105 | 2022-03-01 | 1000 | auto |
The output should look like:
month | insurance_type | total_claim_amt |
---|---|---|
January | auto | 3000 |
January | home | 5000 |
February | auto | 1500 |
February | home | 4500 |
March | auto | 1000 |
March | home | 0 |
To get the total claim amount for each insurance type for each month, you can use the following PostgreSQL query:
In the query above, we're using the window function that operates on a set of rows (the "window") that are related to the current row. is used to create a "window"—a subset of the data—over which the function operates. Here the window consists of rows that share the same insurance type and month. This gives us a sum of the claim amount by insurance type and month. The clause sorts the result set by the month. The function is used to format the date as a month name.
To solve a similar window function question on DataLemur's free online SQL code editor, try this Google SQL Interview Question:
A is a field in a table that references the of another table. It creates a link between the two tables and ensures that the data in the field is valid.
For example, if you have a table of AIG customers and an orders table, the customer_id column in the orders table could be a that references the id column (which is the primary key) in the AIG customers table.
The constraint helps maintain the integrity of the data in the database by preventing the insertion of rows in the table that do not have corresponding entries in the table. It also enforces the relationship between the two tables and prevents data from being deleted from the table if it is still being referenced in the table.
AIG, as a major insurance company, has to manage thousands of claims daily. As a database administrator, your job is to design a database that will efficiently handle this information. The database should contain information about the policyholders, their policies, and any claims associated with those policies.
Let's design three tables , , and in such a way that they reflect the relationships among policyholders, their policies, and any claims they make.
holder_id | name | address | phone_number |
---|---|---|---|
101 | John Doe | 555 Maple St, New York, NY 10017 | (212) 555-1234 |
102 | Jane Smith | 123 Elm St, Los Angeles, CA 90001 | (310) 555-5678 |
103 | Bob Johnson | 789 Oak St, Chicago, IL 60601 | (312) 555-9101 |
policy_id | holder_id | policy_type | premium |
---|---|---|---|
201 | 101 | Auto | $1,500 |
202 | 102 | Life | $500 |
203 | 103 | Home | $3,000 |
claim_id | policy_id | date_filed | claim_amount |
---|---|---|---|
301 | 201 | 2/1/2022 | $10,000 |
302 | 202 | 3/1/2022 | $250,000 |
303 | 201 | 4/1/2022 | $5,000 |
Write a SQL query to fetch the total claim amount filed by each policyholder for the year 2022.
This query joins the three tables on their respective IDs and extracts the total claim amount for each policyholder for the year 2022. It uses the function to total the claim amount and the clause to group the result by policyholder.
Note: interviews at AIG often aren't trying to test you on a specific flavor of SQL. As such, you don't need to exactly know that is available in PostgreSQL and SQL Server, while MINUS is available in MySQL and Oracle – you just need to know the general concept!
Your answer should mention that the / operator is used to remove to return all rows from the first SELECT statement that are not returned by the second SELECT statement.
Here's a PostgreSQL example of using EXCEPT to find all of AIG's Facebook video ads with more than 50k views that aren't also being run on YouTube:
If you want to retain duplicates, you can use the EXCEPT ALL operator instead of EXCEPT. The EXCEPT ALL operator will return all rows, including duplicates.
For a large insurance company like AIG, a critical task might be to analyze the customer data and determine the number of policies each customer holds, and when they acquired their first and last insurance policies.
Consider the following two tables, and . The table holds customer data with a column for customer ID, and the table holds insurance policy data, associated with customer IDs (), the date when the policy was issued (), and the policy ID ().
customer_id | first_name | last_name |
---|---|---|
1001 | John | Doe |
1002 | Jane | Smith |
1003 | Sam | Brown |
policy_id | customer_id | policy_date |
---|---|---|
20001 | 1001 | 2018-06-20 |
20002 | 1001 | 2019-07-15 |
20003 | 1002 | 2020-04-12 |
20004 | 1002 | 2021-08-25 |
20005 | 1003 | 2018-09-30 |
20006 | 1003 | 2020-02-06 |
20007 | 1003 | 2021-10-20 |
Write a SQL query that shows the total count of policies each customer holds as well as the dates of their first and last policies.
This query will provide a consolidated customer data report. This report will include the customer's name and ID, the quantity of policies they hold, along with the dates of their earliest and latest acquired policies.
Since joins come up so often during SQL interviews, practice this SQL join question from Spotify:
AIG is an insurance company that has a database that stores the premium they receive from customers and the claims they pay out, for each policy in each year. Write a SQL query to calculate, for each insurance policy, the average premium, the total claims, and the ratio of total claims to total premiums (rounded to 2 decimal places). Use the , , , and operators in your answer. Ignore policies where the premium or claims are 0 or not available.
Consider the following markdown-formatted tables:
policy_id | year | premium | claims_paid |
---|---|---|---|
AIG001 | 2021 | 1500 | 800 |
AIG002 | 2021 | 2000 | 500 |
AIG003 | 2021 | 1000 | 1200 |
AIG001 | 2022 | 1700 | 0 |
AIG002 | 2022 | 2100 | 1300 |
AIG004 | 2022 | 0 | 0 |
policy_id | avg_premium | total_claims | claims_to_premium_ratio |
---|---|---|---|
AIG001 | 1600.00 | 800.00 | 0.50 |
AIG002 | 2050.00 | 1800.00 | 0.88 |
AIG003 | 1000.00 | 1200.00 | 1.20 |
This SQL query first filters out any policies where the premium or claims are 0. It then groups by insurance policy and calculates the average premium and total claims. It also calculates the ratio of total claims to total premium using those sums, using the function to avoid division by zero. If there is a 0 in the denominator, this will get replaced with a null, and no ratio will be calculated. The result is then rounded to 2 decimal places with the function.
To practice a very similar question try this interactive Alibaba Compressed Mean Question which is similar for calculations with rounding to decimal and this Amazon Average Review Ratings Question which is similar for asking about average and aggregate calculation.
The best way to prepare for a AIG SQL interview is to practice, practice, practice. Besides solving the earlier AIG SQL interview questions, you should also solve the 200+ SQL Interview Questions on DataLemur which come from companies like Microsoft, Google, and Meta.
Each interview question has hints to guide you, step-by-step solutions and best of all, there is an interactive coding environment so you can right online code up your SQL query answer and have it graded.
To prep for the AIG SQL interview you can also be wise to solve SQL questions from other insurance companies like:
Dive into the latest news and announcements from AIG and stay ahead of the curve in the insurance industry!
In case your SQL foundations are weak, don't worry about diving straight into solving questions – refresh your SQL knowledge with this DataLemur SQL Tutorial.
This tutorial covers topics including creating summary stats with GROUP BY and working with string/text data – both of which show up often during SQL job interviews at AIG.
Beyond writing SQL queries, the other types of problems covered in the AIG Data Science Interview include:
I believe the optimal way to prep for AIG Data Science interviews is to read the book I wrote: Ace the Data Science Interview.
It covers 201 interview questions sourced from companies like Google, Tesla, & Goldman Sachs. The book's also got a refresher on Stats, SQL & ML. And finally it's helped thousands of people land their dream job in data, which is why it's got over 1000+ 5-star reviews on Amazon.
While the book is more technical in nature, it's also crucial to prepare for the AIG behavioral interview. A good place to start is by understanding the company's unique cultural values.