logo

10 Goldman Sachs SQL Interview Questions (Updated 2024)

Updated on

June 30, 2024

Data Analysts and Data Engineers at Goldman Sachs write SQL queries all the time as part of their job. They use it for extracting and analyzing financial data for reporting needs, and managing relational databases. That's why Goldman Sachs asks prospective hires SQL interview questions.

So, to help you ace the Goldman Sachs SQL interview, we'll cover 10 Goldman Sachs SQL interview questions – scroll down to start solving them!

Goldman Sachs SQL Interview Questions

10 Goldman Sachs SQL Interview Questions

Sure. Here's is an example of a SQL window function question that could possibly be used in a Goldman Sachs interview, this question is based on analysing sales data:

SQL Question 1: Calculating Running Total of Sales

Goldman Sachs has various products and they want to analyse their sales of each product. They asked you to write a SQL query that calculates the running total of sales for each product, ordered by the month of the year.

Here's some sample data:

Example Input:
sale_idproduct_idsale_datesale_amount
110001/01/20225000
220001/15/20223000
310002/01/20227000
430002/15/20221000
520003/01/20226000
630003/15/20224000
710004/01/20228000
820004/15/20229000
Example Output:
monthproductrunning_total
11005000
12003000
210012000
23001000
32009000
33005000
410020000
420018000

Answer:

Here is a PostgreSQL query:


This query uses the window function to calculate the running total of sales for each product. It partitions the data by product_id, and then order the data by month. It then sums up all the product's sales till the current row, inclusive. The is used to get the month part of the sale_date. The ORDER BY clause at the end is simply to order the output in a readable manner.

p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur

DataLemur SQL Questions

SQL Question 2: Department vs. Company Salary

Assume you had a table of Goldman Sachs employee salaries, along with which department they belonged to. Write a query to compare the average salary of employees in each department to the company's average salary. Return the comparison result as 'higher', 'lower', or 'same' for each department. Display the department ID, payment month (in MM-YYYY format), and the comparison.

Code your solution to this question interactively on DataLemur:

Department vs. Company Salary

The solution is LONG – 30+ lines of SQL. You can find a detailed solution here: Department Salaries.

Make sure you study up on everything Goldman Sachs before the interview, read about the Goldman Sachs products and services!

SQL Question 3: What is denormalization?

Denormalization is the process of modifying a database schema in a way that deviates from the typical rules of normalization (1NF, 2NF, 3NF, etc.).

Denormalization is often used to improve the performance of a database, particularly when it is being used for reporting and analytical purposes (rather than in an Online Transaction Processing (OLTP) manager).

By duplicating data, denormalization can reduce the number of expensive joins required to retrieve data, which can improve query performance. However, denormalization can also cause problems such as increased data redundancy and the need for more complex update and delete operations.

Goldman Sachs SQL Interview Questions

SQL Question 4: Customer Transaction Analysis

Goldman Sachs is interested in analyzing data regarding client transactions in order to uncover insights for their investment strategies. Specifically, they want to track the total investment amount in each asset (stocks, bonds, derivatives), from each client per month.

They have a table named with columns , , , , and .

Design a database schema suitable for this problem and write a PostgreSQL query to provide a summary of the total invested amount in each asset type, by each client, per month.

Example Input:
transaction_idclient_idtransaction_dateasset_idasset_typeamount_invested
10015002/03/202122001Stocks20000
10025002/10/202130005Bonds50000
10038003/20/202122001Stocks60000
10045004/11/202145001Derivatives30000
Example Output:
monthclient_idasset_typetotal_invested
250Stocks20000
250Bonds50000
380Stocks60000
450Derivatives30000

Answer:


This query groups the transaction data by month, client id, and asset type. It then uses the aggregate function SUM to calculate the total amount invested for each group.

SQL Question 5: What's the difference between a left and right join?

In SQL, both a left and right join is used to combine rows from different tables based on a shared key or set of keys, resulting in a single merged table of data. However, here's the difference:

LEFT (OUTER) JOIN: Retrieves all the records/rows from the left and the matched records/rows from the right table.


RIGHT (OUTER) JOIN: Retrieves all the records/rows from the right and the matched records/rows from the left table.


SQL Question 6: Filtering Customer Transactions by Multiple Conditions

As an analyst in Goldman Sachs, you are asked to filter the customer transactions database to find out the transactions that shaped the company’s fortunes in the last fiscal quarter.

The conditions to filter the transactions are:

  1. The transaction amount should be more than $10,000.
  2. The transactions should have been done by customers who joined after January 1, 2021.
  3. The transactions must be successful ones.

Using the two tables and , write an SQL query that will filter these transactions.

Example Input:
customer_idjoin_date
102/02/2021
212/10/2020
307/15/2021
411/25/2020
508/19/2021
Example Input:
transaction_idcustomer_idtransaction_datetransaction_valuetransaction_status
101109/15/202115000Success
102209/10/20219999Success
103309/18/202120000Failed
104409/15/202125000Success
105509/20/202130000Success
Example Output:
transaction_idcustomer_idtransaction_datetransaction_value
101109/15/202115000
105509/20/202130000

Answer:


In the PostgreSQL query above, we are joining the table with the table on the . Our WHERE clause filters the transactions by: those greater than $10,000, made by customers who joined after January 1, 2021, and are successful. The resulting table consists of transaction id, customer id, transaction date, and transaction value in accordance with the conditions specified.

SQL Question 7: Can you explain the difference between and ?

The clause works similarly to the clause, but it is used to filter the groups of rows created by the clause rather than the rows of the table themselves.

For example, say you were analyzing Goldman Sachs sales data:


This query retrieves the total sales for all products in each region, and uses the clause to only sales made after January 1, 2023. The rows are then grouped by region and the clause filters the groups to include only those with total sales greater than $500k.

SQL Question 8: Click-Through-Rate for Goldman Sachs' Online Seminars

Regularly, Goldman Sachs hosts various online seminars for their prospective clients. The marketing team sends email invitations to a list of potential attendees. It wants to know the click-through rate (CTR) of these invitations. The CTR is calculated as the number of clicks on the invitation's link divided by the number of emails sent out, expressed as a percentage.

Consider two tables; which contains information about each email sent, and that holds data about each click on the invitation link.

Example Input:
email_idrecipientemail_dateseminar_id
5011a.john@mail.com06/08/20222001
5242b.smith@mail.com06/10/20222001
5933c.williams@mail.com06/18/20223002
6412d.brown@mail.com07/26/20223002
7890e.taylore@mail.com07/05/20224003
Example Input:
click_idemail_id
80115011
98225242
96336412

Answer:


This query joins the and tables based on the email id. For each seminar, it counts the number of clicks and the number of emails sent. It then calculates the click-through rate by dividing the number of clicks by the number of emails sent. The result is the click-through rate for each seminar.

To solve another question about calculating rates, try this SQL interview question from TikTok within DataLemur's online SQL coding environment: TikTok SQL question

SQL Question 9: Average Trade Amount for each Client Segment

As a data analyst at Goldman Sachs, one important task could be to understand the trading behavior of different segments of clients. Certain segments may trade more heavily than others, and identifying these trends can provide valuable business insights.

Suppose you have access to a table called which records every trade executed by a client. Each trade has a , , , ('Retail', 'Corporate', 'Institutional') and (in USD). Using this data, write a SQL query to calculate the average amount of trades for each client segment for every month.

Example Input:
trade_idclient_idtrade_dateclient_segmenttrade_amount
5500215702/16/2022Retail10500
8905281502/28/2022Institutional30000
9202462102/10/2022Corporate20000
7458132803/01/2022Retail25000
5503341503/20/2022Institutional15000

Answer:


This SQL query groups trades by both the client segment and month, while calculating the average trade amount for each group. The data is ordered by month in descending order and within each month, by the average trade amount in descending order. This allows us to see the more recent data first, and within each month, the client segments that trade the most on average.

SQL Question 10: What is the purpose of the SQL constraint ?

{#Question-10}

The UNIQUE constraint is used to ensure the uniqueness of the data in a column or set of columns in a table. It prevents the insertion of duplicate values in the specified column or columns and helps to ensure the integrity and reliability of the data in the database.

For example, say you were on the Marketing Analytics team at Goldman Sachs and were doing some automated keyword research:

Your keyword database might store SEO data like this:


In this example, the UNIQUE constraint is applied to the "keyword" field to ensure that each keyword is unique. This helps to ensure the integrity of the data in the database and prevents errors that could occur if two rows had the same keyword.

Goldman Sachs SQL Interview Tips

The key to acing a Goldman Sachs SQL interview is to practice, practice, and then practice some more! Beyond just solving the above Goldman Sachs SQL interview questions, you should also solve the 200+ tricky sql questions on DataLemur which come from companies like Microsoft, Google, and Facebook. DataLemur SQL and Data Science Interview Questions

Each interview question has hints to guide you, step-by-step solutions and crucially, there is an online SQL code editor so you can right in the browser run your SQL query answer and have it checked.

To prep for the Goldman Sachs SQL interview it is also wise to practice SQL questions from other stock exchange & brokerage companies like:

In case your SQL foundations are weak, forget about going right into solving questions – go learn SQL with this interactive SQL tutorial.

Free SQL tutorial

This tutorial covers topics including aggregate window functions and joining a table to itself – both of which show up routinely during Goldman Sachs SQL interviews.

Goldman Sachs Data Science Interview Tips

What Do Goldman Sachs Data Science Interviews Cover?

In addition to SQL interview questions, the other types of questions to prepare for the Goldman Sachs Data Science Interview are:

  • Statistics and Probability Questions
  • Python or R Coding Questions
  • Product Data Science Interview Questions
  • ML Modelling Questions
  • Resume-Based Behavioral Questions

Goldman Sachs Data Scientist

How To Prepare for Goldman Sachs Data Science Interviews?

To prepare for Goldman Sachs Data Science interviews read the book Ace the Data Science Interview because it's got:

  • 201 interview questions sourced from FAANG (FB, Apple, Amazon, Netflix, Google)
  • a refresher covering Product Analytics, SQL & ML
  • over 1000+ 5-star reviews on Amazon

Acing Data Science Interview