Data Analysts and Data Engineers at Ernst & Young write SQL queries to analyze financial data, such as revenue growth and expense ratios, for auditing purposes. It is also used to manage databases that track client information, including contact details and engagement history, to ensure accurate and up-to-date records in the consulting division, which is why Ernst & Young includes SQL questions during job interviews.
Thus, to help you practice, we've collected 10 Ernst & Young SQL interview questions – how many can you solve?
Ernst & Young (EY) is a multinational professional services network. It primarily provides assurance (including financial audit), tax, consulting, and advisory services to its clients. For EY, a VIP user or "whale" would be a client that has substantial expenditures with EY in terms of billing amounts. You are given a database that contains two tables: a "clients" table and a "billings" table.
Please write a SQL query to identify EY's top 10 clients in terms of total billed amount in the past year.
client_id | client_name | industry |
---|---|---|
100 | ABC Corp | Technology |
101 | DEF Inc | Healthcare |
102 | GHI LLC | Finance |
bill_id | client_id | billing_date | amount |
---|---|---|---|
200 | 100 | 2022-06-22 | 50000.00 |
201 | 101 | 2022-07-13 | 75000.00 |
202 | 102 | 2022-05-08 | 36000.00 |
203 | 100 | 2022-11-15 | 25000.00 |
204 | 102 | 2022-09-05 | 40000.00 |
In this query, we perform an inner join between the and tables on the column, which is common to both tables. We only take into account the billings done in the past one year. The function is used to compute the total billed amount per client in this time frame. The results are then ordered in descending order of the amount and limited to the top 10, i.e., the clients with the highest total billed amount.
To solve a super-customer analysis question on DataLemur's free interactive coding environment, try this Microsoft SQL Interview problem:
Given a table of Ernst & Young employee salaries, write a SQL query to find the 2nd highest salary at the company.
employee_id | salary |
---|---|
1 | 2500 |
2 | 800 |
3 | 1000 |
4 | 1200 |
second_highest_salary |
---|
1200 |
Check your SQL query for this question and run your code right in the browser:
You can find a detailed solution with hints here: 2nd Highest Salary.
Denormalization is the practice of altering a database schema in a way that breaks the normalization rules (1st, 2nd, 3rd normal forms).
Here's 3 reasons to de-normalize a database at Ernst & Young:
Improved performance: Denormalization can often improve performance by reducing the number of expensive join operations required to retrieve data. This is particularly useful when the database is being used for online analytical processing (OLAP) as frequent joins across multiple tables can be slow and costly.
Scalability: Denormalization can increase the scalability of a database by requiring less data to be read and processed when executing a query. This can be beneficial when the database is expected to handle a high volume of queries.
Simplification: One way to simplify the design of a database is by using denormalization to reduce the number of tables and relationships that need to be managed. This can make the database easier to understand and maintain.
Ernst & Young heavily utilizes financial data, specifically analyzing revenue over time in specific sectors. For this hypothetical problem, we have the and tables. contains revenue data over time and describes which company belongs to which sector.
The task is to write a SQL query that calculates the average monthly revenue from each sector in 2020.
transaction_id | company_id | transaction_date | revenue |
---|---|---|---|
101 | 01 | 2020-01-15 | 5000 |
102 | 02 | 2020-01-20 | 8500 |
103 | 01 | 2020-02-10 | 4500 |
104 | 03 | 2020-02-20 | 9900 |
105 | 02 | 2020-02-25 | 7500 |
company_id | sector |
---|---|
01 | Technology |
02 | Healthcare |
03 | Technology |
month | sector | avg_revenue |
---|---|---|
1 | Technology | 5000 |
1 | Healthcare | 8500 |
2 | Technology | 7450 |
2 | Healthcare | 7500 |
The following SQL code block provides the solution to the problem.
This query uses a window function to initially calculate the average revenue per month for each company. This temporary result is then used to calculate the average monthly revenue for each sector, grouping by both the month and the sector. The function helps to work with date and time values by extracting the needed part. In this case, it extracts the month and the year from the .
p.s. Window functions show up super frequently during SQL interviews, so practice the 27+ window function questions on DataLemur
An inner join returns only the rows that match the join condition between the two tables, whereas a full outer join returns all rows from both tables, including any unmatched rows.
Ernst & Young (EY) is a multinational professional services firm and is one of the "Big Four" accounting firms. EY is involved with a number of consulting projects with different clients and the stakeholder wants to understand the consulting project performance. You are provided with the following two tables:
- Each row records information of a project involving a specific client.
project_id | client_id | project_name | project_start_date | project_end_date |
---|---|---|---|---|
1 | 101 | "Project A" | 2019-01-01 | 2019-06-01 |
2 | 102 | "Project B" | 2019-05-01 | 2019-11-01 |
3 | 103 | "Project C" | 2019-10-01 | 2020-03-01 |
- Each row records information of billing to the clients for each project.
billing_id | project_id | billing_date | billing_amount |
---|---|---|---|
1 | 1 | 2019-02-01 | 5000 |
2 | 1 | 2019-05-01 | 5000 |
3 | 2 | 2019-06-01 | 4000 |
4 | 2 | 2019-08-01 | 4000 |
5 | 3 | 2019-12-01 | 6000 |
The stakeholder would like to know:
Write a SQL PostgreSQL query to help answer the stakeholder's questions.
From the above query, the total billing amounts and the average monthly billing amounts for each project are calculated and presented. The average monthly billing amount is derived by dividing the total project billing amount by the project's length in months. This calculation takes into account the possible change in years by utilizing the function. The result will provide the stakeholder with key financial measures related to each project.
Cross joins and natural joins are two types of JOIN operations in SQL that are used to combine data from multiple tables. A cross join creates a new table by combining each row from the first table with every row from the second table, and is also known as a cartesian join. On the other hand, a natural join combines rows from two or more tables based on their common columns, forming a new table. One key difference between these types of JOINs is that cross joins do not require common columns between the tables being joined, while natural joins do.
Here's an example of a cross join:
If you have 20 products and 10 colors, that's 200 rows right there!
Here's a natural join example using two tables, Ernst & Young employees and Ernst & Young managers:
This natural join returns all rows from Ernst & Young employees where there is no matching row in managers based on the column.
Ernst & Young keeps records of all its customers and their interactions with various services. You are tasked to filter out the customers based on the following conditions:
customer_id | name | country | age | service_used |
---|---|---|---|---|
1201 | Justin | USA | 35 | Tax |
1202 | Emma | Australia | 31 | Audit |
1203 | Sophia | Canada | 45 | Audit |
1204 | Oliver | USA | 34 | Audit |
1205 | Ava | Germany | 36 | Advisory |
customer_id | service_id | active |
---|---|---|
1201 | 101 | false |
1202 | 102 | true |
1203 | 102 | true |
1204 | 102 | false |
1205 | 103 | true |
You need to write a SQL query which can extract the needed information.
This query first joins the customer records and service records based on the customer id. Then it applies the specified filters using WHERE clause conditions. The final result will display the customers who are not from USA, are over 30, and have actively used the Audit service.
Ernst & Young offers free webinars on various business topics. For marketing purposes, they display ad banners on various websites. The data team is interested in investigating how many people actually click on the ad and then register for a webinar. This analysis can help provide insights into the effectiveness of the ad banners. They specifically want to:
To achieve this, they keep track of ad impressions, ad clicks and webinar registrations in three different tables: , and , which are defined as follows:
impression_id | webinar_id | date | user_id |
---|---|---|---|
5621 | 1011 | 10/01/2022 00:00:00 | 124 |
5622 | 1920 | 10/01/2022 00:00:00 | 278 |
5623 | 1011 | 10/01/2022 00:00:00 | 345 |
5624 | 1011 | 10/01/2022 00:00:00 | 234 |
5625 | 1920 | 10/01/2022 00:00:00 | 678 |
click_id | webinar_id | date | user_id |
---|---|---|---|
3952 | 1011 | 10/01/2022 00:00:00 | 124 |
3953 | 1920 | 10/01/2022 00:00:00 | 278 |
3954 | 1011 | 10/01/2022 00:00:00 | 345 |
registration_id | webinar_id | date | user_id |
---|---|---|---|
182 | 1011 | 10/01/2022 00:00:00 | 124 |
276 | 1920 | 10/01/2022 00:00:00 | 278 |
930 | 1011 | 10/02/2022 00:00:00 | 345 |
431 | 1011 | 10/02/2022 00:00:00 | 234 |
590 | 1920 | 10/02/2022 00:00:00 | 678 |
This query uses a CTE () to calculate the number of impressions, clicks, and registrations by webinar. We ensure that a click comes after an impression and a registration comes after a click by the conditions in the WHERE clause. Then, it calculates the CTR for each webinar and returns the results sorted by the CTR in a descending order.
To solve a related problem on DataLemur's free interactive SQL code editor, try this Facebook SQL Interview question:
Clustered and non-clustered indexes are both used to improve query performance, but they differ in how they are implemented.
A clustered index determines the physical order of the data rows in a table, while a non-clustered index does not. As a result, a table can have only one clustered index, but it can have multiple non-clustered indexes.
In terms of query efficiency & performance, a clustered index is generally faster for SELECT queries (reads) but updates (writes) to a clustered index are slower, as they require the data rows to be physically rearranged.
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the earlier Ernst & Young SQL interview questions, you should also solve the 200+ SQL Interview Questions on DataLemur which come from companies like Google, Microsoft and Silicon Valley startups.
Each exercise has hints to guide you, fully explained answers along with a discussion board to see how others solved it and crucially, there's an online SQL code editor so you can easily right in the browser your SQL query answer and have it checked.
To prep for the Ernst & Young SQL interview it is also wise to practice SQL questions from other accounting & consulting companies like:
Discover how Ernst & Young is harnessing the potential of Artificial Intelligence to drive business transformation!
However, if your SQL query skills are weak, don't worry about jumping right into solving questions – refresh your SQL knowledge with this free SQL tutorial.
This tutorial covers SQL concepts such as RANK vs. DENSE RANK and finding NULLs – both of which show up routinely in Ernst & Young interviews.
In addition to SQL interview questions, the other types of problems to prepare for the Ernst & Young Data Science Interview are:
I believe the best way to study for Ernst & Young Data Science interviews is to read the book Ace the Data Science Interview.
It solves 201 data interview questions taken from FAANG tech companies. The book's also got a crash course covering SQL, AB Testing & ML. And finally it's vouched for by the data community, which is why it's earned 1000+ reviews on Amazon and a 4.5-star rating.
While the book is more technical in nature, it's also crucial to prepare for the Ernst & Young behavioral interview. A good place to start is by understanding the company's culture and values.