At Robinhood, SQL is typically used for analyzing user trading patterns to find bad actors, and to generate product-insights as the company builds it's suite of finanical products. That's why Robinhood often tests SQL query questions in interviews for Data Science, Data Engineering and Data Analytics jobs.
So, to help you ace the Robinhood SQL interview, this blog covers 8 Robinhood SQL interview questions – how many can you solve?
This is the same question as problem #2 in the SQL Chapter of Ace the Data Science Interview!
Assume you're given the tables containing completed trade orders and user details in a Robinhood trading system.
Write a query to retrieve the top three cities that have the highest number of completed trade orders listed in descending order. Output the city name and the corresponding number of completed trade orders.
Column Name | Type |
---|---|
order_id | integer |
user_id | integer |
quantity | integer |
status | string ('Completed', 'Cancelled') |
date | timestamp |
price | decimal (5, 2) |
order_id | user_id | quantity | status | date | price |
---|---|---|---|---|---|
100101 | 111 | 10 | Cancelled | 08/17/2022 12:00:00 | 9.80 |
100102 | 111 | 10 | Completed | 08/17/2022 12:00:00 | 10.00 |
100259 | 148 | 35 | Completed | 08/25/2022 12:00:00 | 5.10 |
100264 | 148 | 40 | Completed | 08/26/2022 12:00:00 | 4.80 |
100305 | 300 | 15 | Completed | 09/05/2022 12:00:00 | 10.00 |
100400 | 178 | 32 | Completed | 09/17/2022 12:00:00 | 12.00 |
100565 | 265 | 2 | Completed | 09/27/2022 12:00:00 | 8.70 |
Column Name | Type |
---|---|
user_id | integer |
city | string |
string | |
signup_date | datetime |
user_id | city | signup_date | |
---|---|---|---|
111 | San Francisco | rrok10@gmail.com | 08/03/2021 12:00:00 |
148 | Boston | sailor9820@gmail.com | 08/20/2021 12:00:00 |
178 | San Francisco | harrypotterfan182@gmail.com | 01/05/2022 12:00:00 |
265 | Denver | shadower_@hotmail.com | 02/26/2022 12:00:00 |
300 | San Francisco | houstoncowboy1122@hotmail.com | 06/30/2022 12:00:00 |
city | total_orders |
---|---|
San Francisco | 3 |
Boston | 2 |
Denver | 1 |
To solve this question on DataLemur's free interactive SQL code editor, try this Robinhood SQL interview question:
Robinhood is interested in the volume of each stock in order to understand the trading behaviors of their users. They would like to calculate the daily percentage change of traded volume for their most traded stock, to see whether the stock trading volume is increasing or decreasing day by day.
Please write an SQL query that calculates the percentage change in volume for each day, where the percentage change is defined as . In this question, assume 'today' is defined as the 'trade_date' and 'yesterday' is defined as the 'trade_date' one day before 'today'.
trade_date | stock_symbol | volume |
---|---|---|
2022-07-01 | APPL | 1000000 |
2022-07-02 | APPL | 1500000 |
2022-07-03 | APPL | 1800000 |
2022-07-04 | APPL | 1750000 |
2022-07-05 | APPL | 1800000 |
trade_date | stock_symbol | volume_change_pct |
---|---|---|
2022-07-02 | APPL | 50.00 |
2022-07-03 | APPL | 20.00 |
2022-07-04 | APPL | -2.78 |
2022-07-05 | APPL | 2.86 |
To accomplish this, we can utilize the PostgreSQL window function , which retrieves a data point from a previous row.
This statement first calculates the volume difference between current and previous day, then divides it by the volume of the previous day and multiplies by 100 to get the percentage change. The clause ensures that the calculation is made correctly for each stock and in the proper date order.
For more window function practice, try this Uber SQL problem on DataLemur's interactive coding environment:
A database view is a virtual table that is created based on the results of a SELECT statement, and provides you a customized, read-only version of your data that you can query just like a regular table.
You'd want to use a view for a few reasons:
Robinhood is an investment platform that enables users to buy and sell stocks, ETFs, options, and cryptocurrencies. As a database designer of Robinhood, you are tasked with the design of a database table that records the trades made by users. Each row in the table represents a transaction record, with columns for , , , (buy or sell), , and .
Two other tables and store information about the users and the stocks respectively. The table has columns for , , and . The table has columns for , , and .
Design meaningful SQL queries that provide insights about user's activities and stock's performance.
Example Input:
transaction_id | user_id | stock_id | transaction_type | quantity | price | transaction_date |
---|---|---|---|---|---|---|
1 | 23 | 56 | buy | 50 | 120.50 | 2022-04-01 09:30:00 |
2 | 45 | 78 | sell | 20 | 145.20 | 2022-04-01 10:10:00 |
3 | 23 | 56 | sell | 10 | 122.30 | 2022-04-02 11:20:00 |
4 | 45 | 89 | buy | 100 | 95.10 | 2022-04-02 14:30:00 |
5 | 23 | 90 | buy | 30 | 85.0 | 2022-04-03 15:40:00 |
user_id | user_name | user_country | user_signupdate |
---|---|---|---|
23 | John Doe | USA | 2022-03-01 |
45 | Jane Smith | Canada | 2022-03-05 |
stock_id | stock_symbol | stock_name | stock_sector |
---|---|---|---|
56 | APPL | Apple Inc. | Technology |
78 | GOOG | Google LLC | Technology |
89 | MSFT | Microsoft Corp | Technology |
90 | AMZN | Amazon.com Inc. | Consumer Discretionary |
These queries will help Robinhood gain insights to understand user trading behavior and stock performance better.
Both window functions are used to find rows at a given offset from the current row. However, will give you the rows AFTER the current row you. On the other hand, will give you the rows BEFORE the current row.
As a financial service company, Robinhood typically deals with stocks data. Let's say you have two tables in your database: one for and one for they've made. The table records each trade with the traded stock, the user who made the trade, and the date it was made.
Please write a SQL query to find out which stock was the most traded each month.
user_id | name | sign_up_date | location |
---|---|---|---|
1234 | John | 01/02/2021 | New York |
4567 | Jane | 04/03/2021 | Chicago |
8910 | Doe | 07/05/2021 | Los Angeles |
trade_id | user_id | trade_date | stock |
---|---|---|---|
1 | 1234 | 06/08/2021 | AAPL |
2 | 4567 | 06/10/2021 | TSLA |
3 | 1234 | 06/18/2021 | AAPL |
4 | 8910 | 07/26/2021 | GOOG |
5 | 4567 | 07/05/2021 | AAPL |
For example, if the most traded stock in June was AAPL and in July was GOOG, the output should be
month | most_traded_stock |
---|---|
6 | AAPL |
7 | GOOG |
This query extracts the month from the trade date, groups the trades by month and stock, and then counts the number of trades for each stock in each month. To get the most traded stock for each month, the query orders the result by the count in descending order and finally selects the top row. This would give the most traded stock for the entire data, to get the most traded stock for each month, we would need to use a window function which is a bit more advanced.
In database schema design, a one-to-one relationship is when each entity is associated with only one instance of the other. For instance, a US citizen's relationship with their social-security number (SSN) is one-to-one because each citizen can only have one SSN, and each SSN belongs to one person.
A one-to-many relationship, on the other hand, is when one entity can be associated with multiple instances of the other entity. An example of this is the relationship between a person and their email addresses - one person can have multiple email addresses, but each email address only belongs to one person.
Robinhood, as an online trading platform, wants to look into customer trading trends specifically, they want to find all customers whose email addresses contain a specific pattern, let's say 'yahoo' in them from the database table.
You are tasked to write a SQL command that can filter through the customer's database to meet this requirement. To better visualize this, let's consider the dataset below,
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@yahoo.com |
2 | Mike | Smith | mike.smith@gmail.com |
3 | Sarah | Brown | sarahbrown@yahoo.com |
4 | James | Johnson | jamesjohnson@hotmail.com |
5 | Linda | Williams | linda_w@yahoo.com |
The output of your command should look something like this:
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@yahoo.com |
3 | Sarah | Brown | sarahbrown@yahoo.com |
5 | Linda | Williams | linda_w@yahoo.com |
Using the LIKE keyword in PostgreSQL with the proper wildcard characters (%), one can find any data that matches a specific pattern. In this case, we are looking for email addresses that contain 'yahoo' anywhere in the string. The SQL above will fetch the desired records from the table.
The key to acing a Robinhood SQL interview is to practice, practice, and then practice some more! Beyond just solving the above Robinhood SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Google, Facebook, Microsoft and Amazon.
Each interview question has hints to guide you, fully explained answers along with a discussion board to see how others solved it and most importantly, there is an interactive coding environment so you can right online code up your SQL query answer and have it checked.
To prep for the Robinhood SQL interview you can also be helpful to solve SQL questions from other tech companies like:
But if your SQL foundations are weak, forget about going right into solving questions – improve your SQL foundations with this free SQL tutorial.
This tutorial covers things like sorting results with ORDER BY and math functions like CEIL()/FLOOR() – both of which show up often during Robinhood interviews.
In addition to SQL interview questions, the other types of problems covered in the Robinhood Data Science Interview are:
The best way to prepare for Robinhood Data Science interviews is by reading Ace the Data Science Interview. The book's got: