# 10 CME Group SQL Interview Questions (Updated 2024)

Updated on

June 30, 2024

At CME Group, SQL is typically used for analyzing trading volume patterns and predicting market trends based trade-related data. Because of this, CME Group asks SQL problems in interviews for Data Science and Data Engineering positions.

Thus, to help you practice, we've curated 10 CME Group SQL interview questions – how many can you solve?

## 10 CME Group SQL Interview Questions

### SQL Question 1: Identify the Top Trading Clients

CME Group is a global markets company providing a wide range of trading services and operates the largest options and futures exchange in the world. They have a vast number of trading clients, some of whom trade more frequently and in larger quantities than others. For this SQL question, write a query to identify the top trading clients in the last month. These are considered as the 'whale' users for CME Group.

Assume you have access to a 'trades' table that records every trade made by all clients. Find the clients who have made the most number of trades and have the highest total trading volume.

##### Example Input:
trade_idclient_idtrade_datetrade_volume
101289006/23/2022 00:00:0050
024556706/30/2022 00:00:00150
341289007/01/2022 00:00:00100
765312307/10/2022 00:00:00200
980156707/20/2022 00:00:00350

#### Answer:

This query groups the trades made by each client in the last month, counts the number of trades (trade_frequency), calculates the total trading volume for each client, and sorts the clients based on the two measures. The clients with the highest number of trades and the highest total trading volume are considered 'whale' users. It then outputs the top 10 whale users.

To practice a super-customer analysis question on DataLemur's free online SQL coding environment, try this Microsoft SQL Interview problem:

### SQL Question 2: 2nd Highest Salary

Suppose there was a table of CME Group employee salary data. Write a SQL query to find the 2nd highest salary at the company.

#### CME Group Example Input:

employee_idsalary
12500
2800
31000
41200

#### Example Output:

second_highest_salary
1200

Test your SQL query for this interview question and run your code right in the browser:

#### Answer:

You can find a step-by-step solution here: 2nd Highest Salary.

### SQL Question 3: Have you ever had to optimize a slow SQL query? How did you do it?

First things first, figure out why the query is slow! You can use and commands in PostgreSQL to identify any performance bottlenecks. You might discover that your query is inefficient, or that there are many database writes at the same time you are doing a read, or maybe too many people are concurrently running queries on the same database server.

For Data Analyst and Data Science positions, knowing the ins-and-outs of SQL performance tuning is out-of-scope for the SQL interview round. However, knowing that joins are expensive, and indexes can speed up queries, is generally enough of an answer for CME Group SQL interviews.

### SQL Question 4: Calculate average trading volume per day

CME Group is a global markets company and for them, understanding their daily trading activity can be instrumental when it comes to reporting and decision-making.

Given a table which tracks each trade placed on the CME Group platform, write a SQL query that will calculate the average trading volume per day of each product for the month of January 2023.

The table has the following columns:

• (integer) - unique ID for each trade
• (integer) - unique ID for each product
• (integer) - recorded volume for each trade
• (timestamp) - the date and time the trade took place

Note on the "average trading volume per day": it is defined as the of for each distinct day for a product divided by the number of distinct trading days.

##### Example Input:
trade_idproduct_idtrade_volumetrade_date
86527508730001/05/2023 10:30:00
98465897450001/05/2023 14:20:00
54698508740001/06/2023 11:25:00
37892897460001/06/2023 15:10:00
60503508720001/07/2023 09:45:00
84659897470001/07/2023 13:05:00
##### Example Output:
product_idaverage_trade_vol_per_day
5087300.00
8974600.00

#### Answer:

The inner query picks all trades in the month of January 2023, groups them by product and date (ignoring time), and sums up the trade volumes for each of these groups. The outer query then averages these sums over the unique trading days to get the average trading volume per day for each product.

Pro Tip: Window functions are a frequent SQL interview topic, so practice all the window function problems on DataLemur

### SQL Question 5: How does the RANK() window function differ from DENSE_RANK()?

While both and are used to rank rows, the key difference is in how they deal with ties.

RANK(): When there's a tie, leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to the first of these rows, a rank of 3 to the 2nd row in the tie, and a rank of 4 to the the 3rd tie.

DENSE_RANK(): For ties, does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.

Suppose we had data on how many deals different salespeople at CME Group closed, and wanted to rank the salespeople.

The result of this query would be:

namedeals_closedrankdense_rank
Akash5011
Brittany5021
Carlos4032
Dave4043
Eve3053
Frank2064

As you can see, the function assigns a rank of 1 to the first row (Akash), a rank of 2 to the second row (Brittany), and so on. In contrast, the , gives both Akash and Brittany a rank of 1, since they are tied, then moves on to rank 2 for Carlos.

### SQL Question 6: Trade Volume Analysis

CME Group is a financial organization dealing with derivative instruments, including futures. These are traded on a daily basis with varying volumes and values. Your goal is to propose a database model for storing and query this daily trading data. Specifically, you should be able to answer questions like "Which futures contract had the highest traded volume on a specific date?".

In order to solve this problem, we propose the following database tables:

##### Example Input:

Then, for the above SQL questions, you would write a PostgreSQL query to solve it which would look something like below.

#### Answer:

This query first calculates the maximum traded volume for each day per contract. Then it joins with the contracts table to fetch the contract names. Finally, it filters by the date to get the contract with the maximum traded volume for a specific date.

### SQL Question 7: What are database views, and when would you use them?

Views are a lot like virtual tables, where you can take a base table and customize it (such as by hiding some data from non-admin users, or removing some random columns/rows based on business requirements).

Here's the PostgreSQL syntax for creating a view based on data in the table:

### SQL Question 8: Filter Customer Activities in Financial Market

CME Group is one of the world's leading and most diverse derivatives marketplaces, handling a majority of futures trading and options trading across the globe. Suppose we have a customer transactions table. The columns are as follows:

• transaction_id: a unique identifier for the transaction.
• customer_id: the identifier of the customer who made the transaction.
• transaction_date: the date when the transaction was made.
• product_id: the unique identifier of the product involved in the transaction.
• transaction_type: a character representing the type of the transaction, 'B' for Buy and 'S' for Sell.
• price: the price of the transaction.

Write an SQL query to fetch all the transactions made by customer with customer_id = 567 in the year 2021, where the transaction type was 'Buy', and the transaction price was higher than 50000.

##### Example Input:
transaction_idcustomer_idtransaction_dateproduct_idtransaction_typeprice
12345672021-02-1455555'B'70000
23458882021-05-1144444'S'50000
34565672021-06-2255555'B'60000
45675672020-10-3144444'S'80000
56789992021-12-1433333'B'100000

#### Answer:

This query first filters the transactions which were made by the customer with id 567. Then it adds a condition to only include transactions that happened in the year 2021. Next, it checks that the transaction type was 'Buy'. Finally, it filters the transactions where the price was higher than 50000.

### SQL Question 9: Analyzing Trading Activities

Given the table which records every trade occurrence in CME Group, Write an SQL query to find out the average trading price of each product category in each month. Additionally, Aggregate all trades by product category and month, and calculate the average trade price during that period.

##### Example Input:
trade_idproduct_categorytrading_datetrader_idtrading_price
2001Futures2022-06-011001220.40
3342Options2022-06-031202340.10
4003Futures2022-06-291053230.20
5678Futures2022-07-111077225.50
6011Options2022-07-151061350.20
##### Example Output:
mthproduct_categoryavg_trading_price
6Futures225.30
6Options340.10
7Futures225.50
7Options350.20

#### Answer:

This query sorts all trades into groups based on the month of the trade and the product category. For each of these groups, the query will return the average trade price by using the function. The statement extracts the month from the and names it for further use.

### SQL Question 10: When it comes to database normalization, what's the difference between 1NF, 2NF, and 3NF?

There are several normal forms that define the rules for normalizing a database:

A database is in first normal form (1NF) if it meets the following criteria:

• Each column in a table contains a single value (no lists or containers of data)
• Each column should contain the same type of data (no mixing strings vs. integers)
• Each row in the table is unique

A database is in second normal form (2NF) if it meets the following criteria:

• It is in first normal form.
• All non-key attributes in a table are fully dependent on the primary key.

Said another way, to achieve 2NF, besides following all the rules from 1NF all the columns in a given table should be dependent only on that table's primary key.

A database is in third normal form (3NF) if it meets the following criteria:

• It is in second normal form.
• There are no transitive dependencies in the table.

A transitive dependency means that a piece of data in one column is derived from another column. For example, it wouldn't make sense to keep a column called "user's age" and "user's birthdate" (because age can be derived from birthdate.

While there's also a 4th and 5th normal form, it's too pedantic and hence out-of-scope to know for the CME Group SQL interview.

Also check out the CME careers page!

### CME Group SQL Interview Tips

The best way to prepare for a CME Group SQL interview is to practice, practice, practice. Besides solving the earlier CME Group SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Google, Uber, and Microsoft.

Each interview question has multiple hints, detailed solutions and best of all, there's an online SQL code editor so you can instantly run your SQL query answer and have it executed.

To prep for the CME Group SQL interview it is also useful to practice SQL questions from other stock exchange & brokerage companies like:

However, if your SQL skills are weak, forget about diving straight into solving questions – refresh your SQL knowledge with this interactive SQL tutorial.

This tutorial covers things like Subqueries and UNION vs. joins – both of which show up frequently in CME Group SQL interviews.

### CME Group Data Science Interview Tips

#### What Do CME Group Data Science Interviews Cover?

In addition to SQL interview questions, the other topics covered in the CME Group Data Science Interview are:

#### How To Prepare for CME Group Data Science Interviews?

To prepare for CME Group Data Science interviews read the book Ace the Data Science Interview because it's got:

• 201 interview questions sourced from FAANG tech companies
• a crash course on Product Analytics, SQL & ML
• over 1000+ reviews on Amazon & 4.5-star rating