8 PwC SQL Interview Questions (Updated 2024)

Updated on

August 11, 2024

Data Science, Data Engineering and Data Analytics employees at PwC use SQL to analyze large datasets of financial transactions and accounting records for identifying anomalies and trends. It is also used to create complex financial models that forecast revenue and expenses for consulting engagements, which is why PwC uses SQL questions during job interviews.

So, to help you practice for the PwC SQL interview, we've curated 8 PricewaterhouseCoopers SQL interview questions – can you solve them?

PwC SQL Interview Questions

8 PricewaterhouseCoopers SQL Interview Questions

SQL Question 1: Ranking Employees Department-wise Based on Salary

PwC is a global network of firms delivering assurance, tax and consulting services for your business. Let's consider a scenario where you have been given a dataset for PwC employees with their Department details, Id, Name, Salary etc.

The data schema for table is as follows:

employee_idfirst_namelast_namedepartmentsalary
101JohnDoeIT50000
102JaneSmithAccounting60000
103MaryJohnsonMarketing55000
104JamesBrownIT70000
105PatriciaJonesAccounting65000

The task is to write a PostgreSQL query that ranks these employees within their respective departments based on their salary, descending (i.e., the employee with the highest salary in a department should have a rank of 1).

Answer:

In PostgreSQL, you can use the RANK() function which is a part of window functions to solve this problem.


This PostgreSQL query uses the window function to return a unique rank for each row within the partition of resultset. The clause is used to divide the entire result set into partitions (in this case, 'departments'). For each partition, the rank is calculated and presented. The clause is used to arrange the salary column in descending format to ensure that the employee having highest salary gets the highest rank i.e., 1.

Please note that in case of a "tie" (same salary), the same rank is assigned to both tied entries, skipping the next rank. If you would prefer not to skip the next rank in case of ties, consider using the function instead.

p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur

SQL Interview Questions on DataLemur

SQL Question 2: Top Three Salaries

Given a table of PwC employee salary information, write a SQL query to find the top 3 highest paid employees in each department.

PwC Example Input:

employee_idnamesalarydepartment_id
1Emma Thompson38001
2Daniel Rodriguez22301
3Olivia Smith20001
4Noah Johnson68002
5Sophia Martinez17501
8William Davis68002
10James Anderson40001

Example Input:

department_iddepartment_name
1Data Analytics
2Data Science

Example Output:

department_namenamesalary
Data AnalyticsJames Anderson4000
Data AnalyticsEmma Thompson3800
Data AnalyticsDaniel Rodriguez2230
Data ScienceNoah Johnson6800
Data ScienceWilliam Davis6800

Check your SQL query for this question and run your code right in DataLemur's online SQL environment:

Top 3 Department Salaries

Answer:

We use the DENSE_RANK() window function to generate unique ranks for each employee's salary within their department, with higher salaries receiving lower ranks. Then, we wrap this up in a CTE and filter the employees with a ranking of 3 or lower.


If the solution above is hard to understand, you can find a detailed solution here: Top 3 Department Salaries.

SQL Question 3: How can you determine which records in one table are not present in another?

To find records in one table that aren't in another, you can use a and check for values in the right-side table.

Here is an example using two tables, PwC employees and PwC managers:


This will return all rows from PwC employees where there is no matching row in managers based on the column.

You can also use the operator in PostgreSQL and Microsoft SQL Server to return the records that are in the first table but not in the second. Here is an example:


This will retrieve all rows from employees that do not appear in managers. The operator works by retreivingthe rows that are returned by the first query, but not by the second.

Please note that is not supported by all DBMS systems, such as MySQL and Oracle (however, you can use the operator to achieve a similar outcome).

PricewaterhouseCoopers SQL Interview Questions

SQL Question 4: Filtered Customer Data Analysis

As a part of ongoing efforts at PwC to understand customer behavior and improve services, you've been tasked to filter and analyze the customer data. Write a SQL query that filters down the database to only show records of customers who are from the "Information Technology" industry, are located in "New York" and have had a total revenue of over "50000" in the past year.

Example Input:
customer_idindustrylocationrevenue_last_year
5690"Information Technology""New York"60000
8472"Healthcare""Los Angeles"75000
4926"Information Technology""San Francisco"70000
7643"Information Technology""New York"50000
3819"Manufacturing""New York"80000
9584"Information Technology""New York"65000
Example Output:
customer_idindustrylocationrevenue_last_year
5690"Information Technology""New York"60000
9584"Information Technology""New York"65000

Answer:


This query uses the clause to filter the customers based on boolean conditions. It checks three conditions - being 'Information Technology', being 'New York' and being more than 50000. The operator is used to ensure that all three conditions must be true for a record to be included in the query result.

SQL Question 5: What are the different normal forms (NF)?

The three levels of database normalization (also known as normal forms) are:

First Normal Form (1NF):

  • Each column should contain the same type of data (no mixing strings vs. integers)
  • Each column in a table contains a single value (no lists or containers of data)
  • Each row in the table is unique

Second Normal Form (2NF)

  • It's in 1st Normal Form
  • All non-key attributes are dependent on the primary key

Said another way, to achieve 2NF, besides following the 1NF criteria, the columns should also be dependent only on that table's primary key.

Third normal form (3NF) if it meets the following criteria:

  • It's in 2nd Normal Form
  • There are no transitive dependencies in the table.

A transitive dependency means values in one column are derived from data in another column. For example, it wouldn't make sense to keep a column called "vehicle's age" and "vehicle's manufacture date" (because the vehicle's age can be derived from the date it was manufactured).

Although there is a 4NF and 5NF, you probably don't need to know it for the PwC interview.

SQL Question 6: Analyzing Clickthrough Rates for PwC's Digital Ads

PwC wants to assess the effectiveness of their digital marketing campaigns. They are particularly interested in the click-through-rates (CTR) of their advertisements. Each time an advertisement is displayed it is considered as an "impression". Every time that advertisement is clicked, it is recorded as a "click".

Calculate the CTR for each advertisement for the month of August, where CTR is defined as the total number of clicks divided by the total number of impressions, multiplied by 100 to get it in percentage.

Given the following tables:

Example Input:
impression_idad_idimpression_date
140108/01/2022 00:00:00
240208/01/2022 00:00:00
340108/02/2022 00:00:00
440308/02/2022 00:00:00
540208/03/2022 00:00:00
Example Input:
click_idad_idclick_date
140108/01/2022 00:00:00
240208/01/2022 00:00:00
340108/01/2022 00:00:00
440308/03/2022 00:00:00

Answer:


In this query, we join the two tables on the ad_id, making sure to also join on the date in case a click happens on a different day than the impression. We then group by the ad_id to get the total impressions and clicks for each ad. From this, we calculate the CTR by dividing the number of clicks by the number of impressions and multiplying by 100 to get the percentage.

To practice a similar SQL problem on DataLemur's free interactive coding environment, try this Meta SQL interview question:

Meta SQL interview question

SQL Question 7: What's the main difference between the 'BETWEEN' and 'IN' operators in SQL?

While both the and operators are used to filter data based on some criteria, selects for values within a given range, whereas for it checks if the value is in a given list of values.

For example, say you had a table called , which had the salary of the employee, along with the country in which they reside.

To find all employees who made between 80kand80k and 120k, you could use the operator:


To find all employees that reside in the US or Canada, you could use the operator:


SQL Question 8: Filter Customer Records based on Email Domain

Within PwC, they have multiple clients who have different email domains. They store this data within their customer database. They want to write a SQL query which allows them to filter the customer records on the basis of the email domain. The task is to write a PostgreSQL query that returns all customers whose email domain is "hotmail.com".

Example Input:

customer_idfirst_namelast_nameemail_address
1WilliamSmithwilliam_smith@gmail.com
2JoshuaBrownjoshua_brown@hotmail.com
3EmmaJonesemma.jones@yahoo.com
4AvaEvansava.evans@hotmail.com
5SophiaWrightsophia_wright@gmail.com

Expected Output:

customer_idfirst_namelast_nameemail_address
2JoshuaBrownjoshua_brown@hotmail.com
4AvaEvansava.evans@hotmail.com

Answer:


This query returns all the records from the table where the ends with "hotmail.com". The symbol in the clause is a wildcard that matches any sequence of characters, ensuring we select all email addresses that end with "hotmail.com", regardless of what comes before it.

PwC SQL Interview Tips

The key to acing a PwC SQL interview is to practice, practice, and then practice some more! In addition to solving the above PwC SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Google, Uber, and Microsoft.

DataLemur SQL Interview Questions

Each exercise has hints to guide you, full answers and best of all, there's an online SQL coding environment so you can instantly run your SQL query and have it checked.

To prep for the PwC SQL interview you can also be useful to solve SQL questions from other accounting & consulting companies like:

Discover how PwC is using Artificial Intelligence to help organizations solve complex problems and make better decisions!

But if your SQL skills are weak, don't worry about jumping right into solving questions – improve your SQL foundations with this free SQL for Data Analytics course.

Free SQL tutorial

This tutorial covers things like filtering on multiple conditions using AND/OR/NOT and SQL joins with practice exercises – both of which show up frequently during PwC interviews.

PricewaterhouseCoopers Data Science Interview Tips

What Do PwC Data Science Interviews Cover?

In addition to SQL query questions, the other types of problems to prepare for the PwC Data Science Interview include:

PwC Data Scientist

How To Prepare for PwC Data Science Interviews?

To prepare for PwC Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from companies like Microsoft, Google & Amazon
  • a refresher covering Product Analytics, SQL & ML
  • over 1000+ 5-star reviews on Amazon

Ace the Data Science Interview Book on Amazon

Also focus on the behavioral interview – prepare for it with this list of common Data Scientist behavioral interview questions.

© 2024 DataLemur, Inc

Career Resources

Free 9-Day Data Interview Crash CourseFree SQL Tutorial for Data AnalyticsSQL Interview Cheat Sheet PDFUltimate SQL Interview GuideAce the Data Job Hunt Video CourseAce the Data Science InterviewBest Books for Data Analysts