Brown & Brown writes SQL queries to analyze insurance data trends, such as tracking policyholder behavior and identifying areas of risk, as well as to manage the storage of confidential client information, including encrypting sensitive data and controlling access permissions. That's why Brown & Brown asks SQL questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.
To help you practice for the Brown & Brown SQL interview, here are 11 Brown & Brown SQL interview questions.
Given Brown & Brown's customer and payment records, write a SQL query to identify the customers who have spent the most money with the business in the past year.
You can use the example tables `customers` and `payments`.

`customers` Example Input:

| customer_id | first_name | last_name |
|---|---|---|
| 101 | John | Doe |
| 102 | Jane | Smith |
| 103 | David | Brown |
| 104 | Samantha | Green |
| 105 | Michael | Porter |
`payments` Example Input:

| payment_id | customer_id | payment_date | amount |
|---|---|---|---|
| 1 | 101 | 2022-06-08 | 100.00 |
| 2 | 101 | 2022-08-01 | 200.00 |
| 3 | 102 | 2022-10-05 | 50.00 |
| 4 | 103 | 2022-07-20 | 75.00 |
| 5 | 104 | 2022-09-14 | 150.00 |
| 6 | 105 | 2022-01-25 | 500.00 |
The following PostgreSQL query computes the total amount spent by each customer in the past year:
```sql
SELECT
  c.customer_id,
  c.first_name,
  c.last_name,
  SUM(p.amount) AS total_amount_spent
FROM customers c
JOIN payments p
  ON c.customer_id = p.customer_id
WHERE p.payment_date >= (CURRENT_DATE - INTERVAL '1 year')
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_amount_spent DESC;
```
This query joins the `customers` table with the `payments` table on `customer_id`. It then filters out payments made more than a year ago, groups by `customer_id` to produce a total for each individual customer, and sorts the results in descending order of total amount spent, so the customers who spent the most appear at the top.
To practice a similar customer analytics SQL question interactively, with your SQL code automatically checked, try this Walmart SQL Interview Question:
Given a table of Brown & Brown employee salary data, write a SQL query to find employees who earn more money than their direct manager.
`employee` Example Input:

| employee_id | name | salary | department_id | manager_id |
|---|---|---|---|---|
| 1 | Emma Thompson | 3800 | 1 | |
| 2 | Daniel Rodriguez | 2230 | 1 | 10 |
| 3 | Olivia Smith | 8000 | 1 | 8 |
| 4 | Noah Johnson | 6800 | 2 | 8 |
| 5 | Sophia Martinez | 1750 | 1 | 10 |
| 8 | William Davis | 7000 | 2 | NULL |
| 10 | James Anderson | 4000 | 1 | NULL |
Example Output:

| employee_id | employee_name |
|---|---|
| 3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager William Davis, who earns $7,000.
Check your SQL query for this question directly within the browser on DataLemur:
First, we perform a self-join, treating the first `employee` table (`mgr`) as the managers' table and the second `employee` table (`emp`) as the employees' table. Then we use a `WHERE` clause to filter the results, keeping only employees whose salary is higher than their manager's salary.
```sql
SELECT
  emp.employee_id AS employee_id,
  emp.name AS employee_name
FROM employee AS mgr
INNER JOIN employee AS emp
  ON mgr.employee_id = emp.manager_id
WHERE emp.salary > mgr.salary;
```
If the solution above is tough, you can find a detailed solution with hints here: Employees Earning More Than Their Boss.
A non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query. On the other hand, a correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query.
Here is an example of a non-correlated sub-query:
```sql
SELECT
  t1.customer_id,
  t1.total_sales
FROM brown_brown_sales t1
WHERE t1.total_sales > (
  SELECT AVG(t2.total_sales)
  FROM brown_brown_sales t2
);
```
The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.
Here is an example of a correlated sub-query:
```sql
SELECT
  t1.customer_id,
  t1.total_sales
FROM brown_brown_sales t1
WHERE t1.total_sales > (
  SELECT AVG(t2.total_sales)
  FROM brown_brown_sales t2
  WHERE t2.customer_id = t1.customer_id
);
```
This query selects the `customer_id` and total sales of all Brown & Brown customers in the sales table whose `total_sales` are greater than the average `total_sales` of their own customer group. The sub-query in this case is correlated with the outer query, as it references the `customer_id` column from the outer query (`t1.customer_id`).
Non-correlated sub-queries are typically faster, as they only have to be executed once, whereas correlated sub-queries are slower since they have to be re-evaluated for each row of the outer query. One common optimization is to rewrite a correlated sub-query with a window function, as sketched below.
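Here's a hedged sketch of that rewrite, assuming the same hypothetical `brown_brown_sales` table from the examples above:

```sql
-- One pass over the table: compute each customer's average with a window
-- function, then filter. This avoids re-running the sub-query per row.
SELECT customer_id, total_sales
FROM (
  SELECT
    customer_id,
    total_sales,
    AVG(total_sales) OVER (PARTITION BY customer_id) AS avg_sales
  FROM brown_brown_sales
) t
WHERE total_sales > avg_sales;
```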
Assume that Brown & Brown maintains a database of transactions made by its clients. Each transaction record contains the client ID, transaction date, insurance product ID, and the insurance premium (in USD). As an insurance analyst at Brown & Brown, you are tasked with calculating the rolling three-month average insurance premium for each product to understand how product performance varies over time.
Consider the following dataset `transactions`:

`transactions` Example Input:

| transaction_id | client_id | transaction_date | product_id | insurance_premium |
|---|---|---|---|---|
| 001 | 123 | 06/08/2021 | A123 | 100 |
| 002 | 265 | 06/28/2021 | B234 | 200 |
| 003 | 362 | 08/18/2021 | A123 | 150 |
| 004 | 192 | 07/26/2021 | C345 | 250 |
| 005 | 981 | 09/05/2021 | A123 | 200 |
| 006 | 125 | 09/15/2021 | B234 | 300 |
Here is the solution PostgreSQL query:
```sql
SELECT
  product_id,
  transaction_date,
  AVG(insurance_premium) OVER (
    PARTITION BY product_id
    ORDER BY transaction_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS rolling_avg
FROM transactions
ORDER BY product_id, transaction_date;
```
This query calculates the rolling average insurance premium for each `product_id`. The window function is partitioned by `product_id` and ordered by `transaction_date`, and the `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` clause averages the current row and the two preceding rows. Note that this is a three-row window, so it corresponds to a three-month window only if there is one transaction per product per month. The output shows how insurance premiums trend over time for each product, which can feed into product performance analysis and further decision making.
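If transactions aren't strictly monthly, one hedged alternative is a calendar-based window using `RANGE` with an interval offset (supported in PostgreSQL 11+), assuming `transaction_date` is a `DATE` column:

```sql
-- Calendar-based window: averages all of a product's transactions that
-- fall within the two months preceding (and including) each row's date.
SELECT
  product_id,
  transaction_date,
  AVG(insurance_premium) OVER (
    PARTITION BY product_id
    ORDER BY transaction_date
    RANGE BETWEEN INTERVAL '2 months' PRECEDING AND CURRENT ROW
  ) AS rolling_avg
FROM transactions
ORDER BY product_id, transaction_date;
```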
To practice a similar window function question on DataLemur's free online SQL coding environment, try this Google SQL Interview Question:
Joins in SQL allow you to combine data from different tables based on a shared key or set of keys.
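Before looking at each type, here's a minimal hedged sketch of the join syntax, using the two hypothetical tables introduced just below (`sales` and `brown_brown_customers`, joined on a shared `customer_id` key):

```sql
-- Pair each sale with its customer's details. Rows appear only when the
-- customer_id exists in both tables (an INNER JOIN, described below).
SELECT s.*, c.*
FROM sales s
INNER JOIN brown_brown_customers c
  ON s.customer_id = c.customer_id;
```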
Four JOIN types are available in SQL. For an example of each one, say you had sales data exported from Brown & Brown's Salesforce CRM stored in a PostgreSQL database, and had access to two tables: `sales` and `brown_brown_customers`.

- `INNER JOIN`: retrieves rows from both tables where there is a match in the shared key or keys. For example, an INNER JOIN between the `sales` table and the `brown_brown_customers` table would retrieve only the rows where the `customer_id` in `sales` matches a `customer_id` in `brown_brown_customers`.
- `LEFT JOIN`: retrieves all rows from the left table (in this case, the `sales` table) and any matching rows from the right table (the `brown_brown_customers` table). If there is no match in the right table, NULL values are returned for the right table's columns.
- `RIGHT JOIN`: retrieves all rows from the right table (in this case, the `brown_brown_customers` table) and any matching rows from the left table (the `sales` table). If there is no match in the left table, NULL values are returned for the left table's columns.
- `FULL OUTER JOIN`: retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values are returned for the non-matching table's columns.

As an analyst at Brown & Brown, you are tasked with retrieving customer data from the company's database. The marketing department needs you to filter out customers who are not premium members and those who have made fewer than three purchases in the last six months, so the remaining customers can be targeted in an upcoming promotion. They also want you to exclude customers who have cancelled an order in the last month.
`customers` Example Input:

| customer_id | name | is_premium |
|---|---|---|
| 201 | John Doe | true |
| 264 | Jane Smith | false |
| 288 | Mary Brown | true |
| 101 | James Johnson | true |
`orders` Example Input:

| order_id | customer_id | order_date | cancelled |
|---|---|---|---|
| 6001 | 201 | 2022-07-15 | false |
| 6002 | 264 | 2022-06-01 | true |
| 6003 | 288 | 2022-06-15 | false |
| 6004 | 101 | 2022-07-01 | true |
| 6005 | 201 | 2022-06-10 | false |
| 6006 | 264 | 2022-06-20 | false |
| 6007 | 201 | 2022-06-05 | false |
| 6008 | 101 | 2022-05-01 | false |
In PostgreSQL, you can use one subquery to count each customer's orders and filter on that count with a `HAVING` clause, plus a second subquery to screen out recent cancellations.
```sql
SELECT c.customer_id, c.name
FROM customers c
WHERE c.is_premium = true
  -- at least three purchases in the last six months
  AND c.customer_id IN (
    SELECT o.customer_id
    FROM orders o
    WHERE o.order_date > CURRENT_DATE - INTERVAL '6 months'
    GROUP BY o.customer_id
    HAVING COUNT(o.order_id) >= 3
  )
  -- no cancelled orders in the last month
  AND c.customer_id NOT IN (
    SELECT o.customer_id
    FROM orders o
    WHERE o.cancelled = true
      AND o.order_date > CURRENT_DATE - INTERVAL '1 month'
  );
```
This SQL query keeps only premium members who have made at least three purchases in the last six months, and excludes any customer who cancelled an order in the last month. (The cancellation check uses its own subquery because PostgreSQL has no `MAX()` aggregate for boolean columns; `BOOL_OR()` would be an alternative.)
What is a `CHECK` constraint, and when might it be helpful to use it?

The `CHECK` constraint is used to enforce rules on the data in a specific column. If a row is inserted or updated with data that does not satisfy the `CHECK` constraint's rule, the operation will fail. The `CHECK` constraint is often used with other constraints, such as `NOT NULL` or `UNIQUE`, to ensure that data meets certain conditions. You may want to use a `CHECK` constraint in your database to maintain the quality and reliability of your data by ensuring that it meets specific criteria.
For example, you might use a CHECK constraint to ensure that a column contains only positive numbers, or that a date is within a certain range.
```sql
CREATE TABLE brown_brown_employees (
  id INT PRIMARY KEY,
  salary INT CHECK (salary > 0),
  hire_date DATE CHECK (hire_date >= '1940-01-01')
);
```
At Brown & Brown, a leading insurance brokerage firm, you have been assigned to find the average number of policies sold per agent. Each agent belongs to a specific region, and the data for each policy sold includes the agent's ID, the region, the date of sale, and the policy number.
In the context of this question, we have two primary tables: `agents_table` and `policies_table`.

`agents_table` Example Input:

| agent_id | region |
|---|---|
| A1 | West |
| A2 | East |
| A3 | North |
| A4 | East |
| A5 | South |
`policies_table` Example Input:

| policy_number | sold_date | agent_id |
|---|---|---|
| P1 | 06/08/2022 00:00:00 | A1 |
| P2 | 06/10/2022 00:00:00 | A2 |
| P3 | 07/05/2022 00:00:00 | A2 |
| P4 | 07/26/2022 00:00:00 | A3 |
| P5 | 06/08/2022 00:00:00 | A1 |
| P6 | 06/10/2022 00:00:00 | A5 |
| P7 | 06/12/2022 00:00:00 | A1 |
```sql
SELECT
  A.agent_id,
  AVG(P.policy_count) AS avg_policy_sold
FROM agents_table A
LEFT JOIN (
  SELECT agent_id, COUNT(policy_number) AS policy_count
  FROM policies_table
  GROUP BY agent_id
) P ON A.agent_id = P.agent_id
GROUP BY A.agent_id;
```
This PostgreSQL query works by first grouping the policies sold by each agent from `policies_table` in a subquery, which returns the number of policies (`policy_count`) each agent (`agent_id`) has sold. Then, using a `LEFT JOIN`, it combines this data with `agents_table`, so agents who sold no policies are still included.
Finally, it applies the `AVG` function. Because the subquery yields exactly one `policy_count` per agent, the per-agent average simply equals that count, so the result is a table with each agent's ID and the number of policies they've sold. A single company-wide average per agent would instead aggregate without the outer `GROUP BY`, as sketched below.
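Here's a hedged sketch of that company-wide variant, using the same tables and `COALESCE` so that agents who sold nothing count as zero rather than being dropped:

```sql
-- One overall figure: the average number of policies sold per agent,
-- where agents with no policies contribute a count of 0.
SELECT AVG(COALESCE(P.policy_count, 0)) AS avg_policies_per_agent
FROM agents_table A
LEFT JOIN (
  SELECT agent_id, COUNT(policy_number) AS policy_count
  FROM policies_table
  GROUP BY agent_id
) P ON A.agent_id = P.agent_id;
```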
To practice a very similar question, try this interactive Twitter Histogram of Tweets Question, which is similar for aggregating counts per user, or this Stripe Repeated Payments Question, which is similar for handling multiple transactions per entity.
Suppose Brown & Brown runs various digital marketing campaigns with online ads and maintains a record of how many times each ad was viewed and clicked. The goal is to calculate the Click-Through-Rate (CTR) for each campaign, where CTR is defined as the number of ad clicks divided by the number of ad views, multiplied by 100%.
Let's assume we have two tables named `Ad_Views` and `Ad_Clicks` with the following structure:

`Ad_Views` Sample Input:

| campaign_id | ad_view_date | views |
|---|---|---|
| 4081 | 2022-06-08 | 1200 |
| 7823 | 2022-06-10 | 950 |
| 4604 | 2022-06-18 | 1300 |
| 4081 | 2022-07-26 | 1100 |
| 7823 | 2022-07-05 | 1000 |
`Ad_Clicks` Sample Input:

| campaign_id | ad_click_date | clicks |
|---|---|---|
| 4081 | 2022-06-08 | 200 |
| 7823 | 2022-06-10 | 100 |
| 4604 | 2022-06-18 | 300 |
| 4081 | 2022-07-26 | 250 |
| 7823 | 2022-07-05 | 120 |
The click-through-rate (CTR) can be calculated using this formula: `(total_clicks / total_views) * 100`. One caveat: both columns are integers, and PostgreSQL performs integer division on integers, so the query multiplies by `100.0` to force decimal arithmetic.
```sql
SELECT
  v.campaign_id,
  (SUM(c.clicks) * 100.0 / SUM(v.views)) AS ctr
FROM Ad_Views v
JOIN Ad_Clicks c
  ON v.campaign_id = c.campaign_id
  AND v.ad_view_date = c.ad_click_date
GROUP BY v.campaign_id;
```
This SQL query first joins the `Ad_Views` table with the `Ad_Clicks` table on `campaign_id` and the matching ad date, then groups the result by `campaign_id`. For each campaign, it sums the clicks, multiplies by `100.0` (forcing decimal rather than integer division), and divides by the sum of views. This gives the click-through-rate as a percentage for each of Brown & Brown's marketing campaigns.
To practice another question about calculating rates, try this SQL interview question from TikTok within DataLemur's interactive coding environment:
Normalization is the process of dividing a larger table into smaller tables to eliminate redundancy and dependency. Although there are 5 levels of normalization (normal forms), the 3 most important ones to know for the Brown & Brown SQL interview are:

- First Normal Form (1NF): every column holds atomic (indivisible) values, and there are no repeating groups.
- Second Normal Form (2NF): the table is in 1NF, and every non-key column depends on the entire primary key, not just part of it.
- Third Normal Form (3NF): the table is in 2NF, and non-key columns depend only on the key, with no transitive dependencies between non-key columns.
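As a small hedged sketch of 3NF in practice, suppose a hypothetical denormalized policies table repeated each agent's region and phone on every policy row; normalization splits those agent attributes into their own table:

```sql
-- Agent attributes live in exactly one place...
CREATE TABLE agents (
  agent_id TEXT PRIMARY KEY,
  region   TEXT NOT NULL,
  phone    TEXT
);

-- ...and each policy references the agent by key instead of duplicating
-- the agent's region and phone on every policy row.
CREATE TABLE policies (
  policy_number TEXT PRIMARY KEY,
  sold_date     DATE NOT NULL,
  agent_id      TEXT NOT NULL REFERENCES agents(agent_id)
);
```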
At Brown & Brown, a leader in insurance and reinsurance products, the team tracks sales data across different insurance products and regions. They would like to figure out which insurance product was most sold per state in 2022.
Here is a sample dataset:
`sales` Example Input:

| sale_id | product_id | sale_date | state | quantity |
|---|---|---|---|---|
| 1050 | 101 | 03/07/2022 | CA | 5 |
| 1051 | 102 | 02/10/2022 | NY | 3 |
| 1052 | 101 | 01/18/2022 | CA | 7 |
| 1053 | 103 | 04/25/2022 | TX | 2 |
| 1054 | 102 | 03/15/2022 | CA | 1 |
| 1055 | 101 | 02/17/2022 | NY | 6 |
| 1056 | 103 | 02/26/2022 | TX | 4 |
The output should provide the state, the product_id of the most sold product per state, and the total quantity sold of that product in the state in 2022:
| state | product_id | total_quantity |
|---|---|---|
| CA | 101 | 12 |
| NY | 101 | 6 |
| TX | 103 | 6 |
```sql
SELECT state, product_id, total_quantity
FROM (
  SELECT
    state,
    product_id,
    SUM(quantity) AS total_quantity,
    RANK() OVER (
      PARTITION BY state
      ORDER BY SUM(quantity) DESC
    ) AS sales_rank
  FROM sales
  WHERE EXTRACT(YEAR FROM sale_date) = 2022
  GROUP BY state, product_id
) ranked
WHERE sales_rank = 1
ORDER BY state;
```
This query first restricts the data to sales from 2022, then groups by state and `product_id` and sums the quantity sold for each pair. The `RANK()` window function, partitioned by state and ordered by total quantity descending, ranks each product within its state, and the outer query keeps only the top-ranked product per state. (PostgreSQL does not allow nesting aggregates like `MAX(SUM(quantity))`, which is why the window function is used.) Finally, the result is sorted by state.
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Brown & Brown SQL interview is to solve as many practice SQL interview questions as you can! Beyond just solving the above Brown & Brown SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Microsoft, Google, Amazon, and tech startups.
Each interview question has multiple hints, detailed solutions and most importantly, there is an online SQL code editor so you can instantly run your query and have it checked.
To prep for the Brown & Brown SQL interview, it can also be a great idea to practice SQL questions from other insurance companies like:
Get ahead of the curve with Brown & Brown's expert analysis on the potential of AI to transform industries and economies!
In case your SQL skills are weak, don't worry about jumping right into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers topics including aggregate functions like MIN()/MAX() and math functions like CEIL()/FLOOR() – both of which pop up often in Brown & Brown SQL assessments.
Besides SQL interview questions, the other types of problems covered in the Brown & Brown Data Science Interview include:
To prepare for Brown & Brown Data Science interviews read the book Ace the Data Science Interview because it's got:
Don't ignore the behavioral interview – prepare for that using this Behavioral Interview Guide for Data Scientists.