Data Analysts and Data Engineers at Discover Financial Services use SQL for analyzing transaction data and optimizing customer financial services. That's why Discover Financial Services often tests jobseekers with SQL coding interview questions.
Thus, to help you prep, we've curated 9 Discover Financial Services SQL interview questions. Can you answer them all?
At Discover Financial Services, it is important to identify frequent users of their credit cards. These frequent users, also referred to as VIP or whale users, are crucial for the sustained growth of the business. Creating a SQL query to identify these users would be beneficial for targeted marketing and customer care strategies, which in turn would increase our customer base.
Let's consider a `transactions` table formatted as follows:

`transactions` Example Input:

| transaction_id | user_id | transaction_date | card_id | amount |
|---|---|---|---|---|
| 8001 | 200 | 06/10/2022 00:00:00 | 54500 | $40 |
| 9002 | 123 | 06/10/2022 00:00:00 | 54600 | $100 |
| 9012 | 200 | 06/11/2022 00:00:00 | 54500 | $150 |
| 8003 | 489 | 06/20/2022 00:00:00 | 54800 | $500 |
| 8050 | 123 | 07/10/2022 00:00:00 | 54600 | $40 |
To identify these power users, we can write a PostgreSQL query to check who has transacted more than a certain amount in the past month:
```sql
SELECT
  user_id,
  COUNT(*) AS num_transactions,
  SUM(amount) AS total_transaction_amount
FROM transactions
WHERE transaction_date BETWEEN NOW() - INTERVAL '1 month' AND NOW()
GROUP BY user_id
HAVING SUM(amount) > 5000;
```
This query returns the `user_id`, the number of transactions they made in the past month (`num_transactions`), and the total amount transacted (`total_transaction_amount`) for users who have transacted over $5,000 in the past month.
The threshold of $5000 is just an example and can be adjusted depending on the company's specific definition of a power user.
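If the interviewer defines "the past month" as the previous calendar month rather than a rolling 30-day window, a small variation using `date_trunc` could work. This is just a sketch against the same `transactions` table:

```sql
-- Sketch: treat "past month" as the previous full calendar month
SELECT
  user_id,
  COUNT(*) AS num_transactions,
  SUM(amount) AS total_transaction_amount
FROM transactions
WHERE transaction_date >= date_trunc('month', NOW()) - INTERVAL '1 month'
  AND transaction_date <  date_trunc('month', NOW())
GROUP BY user_id
HAVING SUM(amount) > 5000;
```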
To practice a super-customer analysis question on DataLemur's free interactive SQL code editor, try this Microsoft SQL Interview problem:
You're given a table of Discover Financial Services employee and department salaries. Write a SQL query to find departments where the department's average salary is higher than the company's average salary.
You can solve this interview question directly within the browser on DataLemur:
The answer is LONG – 30+ lines of SQL. You can find a detailed solution with hints here: Department Salaries.
A correlated sub-query is one that relies on the outer query and cannot be run independently. It uses the outer query to filter or transform data by referencing a column from it, while the outer query uses the results of the inner query. On the other hand, a non-correlated sub-query is not linked to the outer query and can be run on its own. It does not reference any columns from the outer query and is used to retrieve additional data for the outer query.
An example correlated sub-query:
```sql
SELECT name, salary
FROM discover_financial_services_employees e1
WHERE salary > (
  SELECT AVG(salary)
  FROM discover_financial_services_employees e2
  WHERE e1.department = e2.department
);
```
This correlated subquery retrieves the names and salaries of Discover Financial Services employees who make more than the average salary for their department. The subquery references the `department` column of the outer query's table (`e1.department`) and uses it to filter the rows of its own table (`e2.department`).
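As an aside, the same result can usually be produced without a correlated subquery by joining to a derived table of per-department averages, which (at least conceptually) avoids re-evaluating the average for every outer row. A sketch, assuming the same hypothetical `discover_financial_services_employees` table:

```sql
-- Sketch: join against per-department averages instead of a correlated subquery
SELECT e.name, e.salary
FROM discover_financial_services_employees e
JOIN (
  SELECT department, AVG(salary) AS avg_salary
  FROM discover_financial_services_employees
  GROUP BY department
) d ON e.department = d.department
WHERE e.salary > d.avg_salary;
```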
An example non-correlated sub-query:
```sql
SELECT name, salary
FROM discover_financial_services_employees
WHERE salary > (
  SELECT AVG(salary)
  FROM discover_financial_services_employees
  WHERE department = 'Data Analytics'
);
```
This non-correlated subquery retrieves the names and salaries of Discover Financial Services employees who make more than the average salary for the Data Analytics department (which honestly should be very few people since Data Analysts are awesome).
Anyways, the subquery is independent of the main query and can be run as a standalone query. Its output (the average salary for the Data Analytics department) is then used in the main query to filter the rows of the `discover_financial_services_employees` table.
As a Data Analyst for Discover Financial Services (DFS), one of your tasks is to analyze transaction data for DFS's credit card customers and provide insights that steer marketing strategy.
For this SQL interview question, DFS is particularly interested in understanding the average transaction amount made by each customer in every month of the current year (2022).
Your task is to write a SQL query that calculates the average monthly transaction amount per customer. For each month, the output should provide the `customer_id`, the `month`, and the `average_transaction_amount`.
`transaction` Example Input:

| transaction_id | customer_id | transaction_date | transaction_amount |
|---|---|---|---|
| 1001 | 101 | 01/18/2022 | 450 |
| 1002 | 102 | 01/28/2022 | 875 |
| 1003 | 103 | 02/05/2022 | 525 |
| 1004 | 102 | 02/20/2022 | 700 |
| 1005 | 101 | 03/10/2022 | 965 |
| 1006 | 104 | 04/28/2022 | 310 |
| 1007 | 101 | 05/15/2022 | 1135 |
| 1008 | 101 | 06/20/2022 | 780 |
```sql
SELECT DISTINCT
  customer_id,
  EXTRACT(MONTH FROM transaction_date) AS month,
  AVG(transaction_amount) OVER (
    PARTITION BY customer_id, EXTRACT(MONTH FROM transaction_date)
  ) AS average_transaction_amount
FROM transaction
WHERE EXTRACT(YEAR FROM transaction_date) = 2022
ORDER BY customer_id, month;
```
This query first filters to transactions that happened in 2022. For each record, it uses the SQL window function `AVG() OVER()`, partitioned by `customer_id` and `month` (extracted from the `transaction_date`), to compute the average transaction amount per customer for each month. The `DISTINCT` collapses the per-transaction rows down to one row per customer per month, and the result is ordered by `customer_id` and `month`.
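Because the question only asks for one row per customer per month, a plain `GROUP BY` aggregate is an equally valid (and arguably simpler) answer. A sketch against the same `transaction` table:

```sql
-- Sketch: aggregate version that produces one row per customer per month
SELECT
  customer_id,
  EXTRACT(MONTH FROM transaction_date) AS month,
  AVG(transaction_amount) AS average_transaction_amount
FROM transaction
WHERE EXTRACT(YEAR FROM transaction_date) = 2022
GROUP BY customer_id, EXTRACT(MONTH FROM transaction_date)
ORDER BY customer_id, month;
```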
To solve a similar window function interview problem which uses RANK() on DataLemur's free interactive SQL code editor, try this Amazon SQL question asked in a BI Engineer interview:
To identify records in one table that do not appear in another, you can use a LEFT JOIN and examine NULL values in the right-side table.
Say for example you had exported Discover Financial Services's CRM (Customer Relationship Management) database into PostgreSQL, and had a table of sales leads, and a second table of companies.
Here's an example of how a `LEFT JOIN` query could find all sales leads that are not associated with a company:
```sql
SELECT *
FROM sales_leads
LEFT JOIN companies
  ON sales_leads.company_id = companies.id
WHERE companies.id IS NULL;
```
This query brings back all rows from the sales leads table, along with any matching rows from the companies table. If there is no matching row in the companies table, NULL values will be returned for all of the right table's columns.
We then keep only the rows where the `companies.id` column is `NULL`, leaving just the sales leads that are not associated with a company.
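The same anti-join can be expressed with `NOT EXISTS`, which many interviewers accept as an equivalent answer. A sketch against the same `sales_leads` and `companies` tables:

```sql
-- Sketch: anti-join written with NOT EXISTS instead of LEFT JOIN ... IS NULL
SELECT *
FROM sales_leads
WHERE NOT EXISTS (
  SELECT 1
  FROM companies
  WHERE companies.id = sales_leads.company_id
);
```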
The customer records database of Discover Financial Services contains extensive information about their customers, including when they opened their account, their payment history, balances, whether they are currently delinquent, and whether they have ever defaulted on a loan. As a data analyst, your task is to find customers who opened their Discover cards after 2015, have a balance of more than $1,000, are currently delinquent on their payments, and have never defaulted on a loan.
`cardholders` Example Input:

| customer_id | open_date | payment_history | balance | delinquent | defaulted |
|---|---|---|---|---|---|
| 23451 | 12/14/2014 | On Time | 950.00 | No | Yes |
| 89232 | 01/17/2016 | Late | 1500.00 | Yes | No |
| 78392 | 02/11/2018 | On Time | 1200.00 | No | No |
| 47621 | 03/06/2017 | Late | 2500.00 | Yes | No |
| 98231 | 10/29/2013 | Late | 750.00 | Yes | Yes |
```sql
SELECT *
FROM cardholders
WHERE open_date > '2015-12-31'
  AND balance > 1000
  AND delinquent = 'Yes'
  AND defaulted = 'No';
```
This PostgreSQL query filters cardholders based on the conditions given in the question. It uses the WHERE clause to apply the conditions, the AND operator to ensure all conditions must be met, and comparison operators to express each condition. The result is a list of customers who opened their accounts after 2015, have a balance of more than $1,000, are currently delinquent, and have never defaulted on a loan.
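If the `delinquent` and `defaulted` flags were stored as booleans rather than 'Yes'/'No' text (a common alternative schema, not what the example table shows), the same filter might look like this sketch:

```sql
-- Sketch: assumes a variant schema with BOOLEAN delinquent/defaulted columns
SELECT *
FROM cardholders
WHERE open_date >= '2016-01-01'
  AND balance > 1000
  AND delinquent
  AND NOT defaulted;
```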
The primary key of a table is a column or set of columns that serves as a unique identifier for each row. It ensures that all rows are distinct and does not allow null values.
For example, say you had stored some Facebook ad campaign data that Discover Financial Services ran:
```sql
CREATE TABLE discover_financial_services_fb_campaigns (
  CampaignID INTEGER PRIMARY KEY,
  CampaignName VARCHAR(255),
  AdSetName VARCHAR(255),
  AdName VARCHAR(255),
  Budget DECIMAL(8,2),
  Targeting VARCHAR(255)
);
```
The `CampaignID` column is used to uniquely identify each row in the table, and the `PRIMARY KEY` constraint ensures that there are no duplicate `CampaignID` values. This helps to maintain the accuracy of the data by preventing duplicate rows. The primary key is also an important part of the table because it enables you to easily identify and reference specific campaigns in your Facebook Ad data. You can use it to join to other tables in the database, such as a table with data on the results of the campaigns.
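For example, a hypothetical campaign-results table could reference `CampaignID` as a foreign key, which is what makes those joins reliable. A minimal sketch (the results table and its columns are made up for illustration):

```sql
-- Hypothetical results table that references the campaigns table above
CREATE TABLE discover_financial_services_fb_campaign_results (
  ResultID INTEGER PRIMARY KEY,
  CampaignID INTEGER REFERENCES discover_financial_services_fb_campaigns(CampaignID),
  Impressions INTEGER,
  Clicks INTEGER,
  Spend DECIMAL(10,2)
);
```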
Read about how Discover has recently employed Google Gen AI to improve their customer experience!
As an SQL developer at Discover Financial Services, you have been tasked with calculating the average balance for each type of account (e.g., savings, checking, etc.) for the last year. This information is crucial for understanding the overall financial wellness of the company's customers.
`accounts` Example Input:

| account_id | customer_id | account_type | account_balance |
|---|---|---|---|
| 8475 | 1092 | Savings | $8,000.00 |
| 2721 | 4301 | Checking | $2,500.00 |
| 3918 | 5862 | Savings | $7,500.00 |
| 4819 | 3998 | Savings | $12,000.00 |
| 7016 | 2540 | Checking | $3,000.00 |

Example Output:

| account_type | avg_balance |
|---|---|
| Savings | $9,166.67 |
| Checking | $2,750.00 |
```sql
SELECT
  account_type,
  AVG(account_balance) AS avg_balance
FROM accounts
GROUP BY account_type;
```
This query first categorizes the accounts based on the account type using the GROUP BY clause. It then calculates the average account balance within each group using the AVG function. The result is a table displaying the average balance per account type.
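Note that the example `accounts` table has no date column, so the query above ignores the "last year" requirement. If the real table included, say, a hypothetical `balance_date` column, the filter might look like this sketch:

```sql
-- Sketch: assumes a hypothetical balance_date column on the accounts table
SELECT
  account_type,
  AVG(account_balance) AS avg_balance
FROM accounts
WHERE balance_date >= NOW() - INTERVAL '1 year'
GROUP BY account_type;
```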
To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for calculating averages or this Alibaba Compressed Mean Question which is similar for usage of financial data.
As a data analyst for Discover Financial Services, you have been given access to the database that contains information about users’ activities on their website. Your manager wants you to calculate the click-through-rate (CTR) for their digital ads. The CTR is the number of users who clicked on our ad divided by the number of times our ad is shown.
The database contains two tables: the `Ad_impressions` table, which records each time an ad is displayed, and the `Ad_clicks` table, which records each time a user clicks on an ad.
`Ad_impressions` Example Input:

| impression_id | user_id | ad_id | impression_date |
|---|---|---|---|
| 4401 | 234 | 3474 | 06/08/2022 00:00:00 |
| 3951 | 782 | 3952 | 06/10/2022 00:00:00 |
| 6709 | 653 | 2954 | 06/18/2022 00:00:00 |
| 4112 | 909 | 3474 | 07/26/2022 00:00:00 |
| 8862 | 470 | 2954 | 07/05/2022 00:00:00 |
`Ad_clicks` Example Input:

| click_id | user_id | ad_id | click_date |
|---|---|---|---|
| 7722 | 234 | 3474 | 06/08/2022 00:02:00 |
| 6631 | 782 | 3952 | 06/10/2022 00:04:00 |
| 2953 | 470 | 2954 | 07/05/2022 00:05:00 |
```sql
SELECT
  Ad_impressions.ad_id,
  COUNT(DISTINCT Ad_clicks.click_id) AS total_clicks,
  COUNT(DISTINCT Ad_impressions.impression_id) AS total_impressions,
  (COUNT(DISTINCT Ad_clicks.click_id)::decimal
    / NULLIF(COUNT(DISTINCT Ad_impressions.impression_id), 0)) * 100 AS click_through_rate
FROM Ad_impressions
LEFT JOIN Ad_clicks
  ON Ad_impressions.ad_id = Ad_clicks.ad_id
  AND Ad_impressions.user_id = Ad_clicks.user_id
GROUP BY Ad_impressions.ad_id;
```
The above query starts by joining the `Ad_impressions` table with the `Ad_clicks` table on `ad_id` and `user_id`. It counts the total number of unique clicks and impressions per ad and computes the click-through rate as a percentage (total unique clicks divided by total unique impressions). It also prevents division by zero using the `NULLIF` function.
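An alternative that some candidates find easier to reason about is to aggregate each table separately in CTEs and then join the per-ad totals. A sketch against the same two tables (note this version counts every click on an ad, not just clicks matched to an impression by the same user, which may or may not match the interviewer's intended definition):

```sql
-- Sketch: aggregate impressions and clicks separately, then join per ad
WITH impressions AS (
  SELECT ad_id, COUNT(*) AS total_impressions
  FROM Ad_impressions
  GROUP BY ad_id
),
clicks AS (
  SELECT ad_id, COUNT(*) AS total_clicks
  FROM Ad_clicks
  GROUP BY ad_id
)
SELECT
  i.ad_id,
  COALESCE(c.total_clicks, 0) AS total_clicks,
  i.total_impressions,
  COALESCE(c.total_clicks, 0)::decimal / i.total_impressions * 100 AS click_through_rate
FROM impressions i
LEFT JOIN clicks c ON i.ad_id = c.ad_id;
```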
To practice a similar problem about calculating rates, try this TikTok SQL Interview Question on DataLemur's interactive SQL code editor:
The key to acing a Discover Financial Services SQL interview is to practice, practice, and then practice some more!
Beyond just solving the above Discover Financial Services SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Facebook, Google and unicorn tech startups.
Each DataLemur SQL question has multiple hints, step-by-step solutions, and best of all, there's an interactive SQL code editor so you can write your query right in the browser and have it checked.
To prep for the Discover Financial Services SQL interview, it can also be helpful to practice interview questions from other payment & credit companies like:
But if your SQL skills are weak, don't worry about diving straight into solving questions – refresh your SQL knowledge with this SQL interview tutorial.
This tutorial covers SQL topics like UNION vs. UNION ALL and handling date/timestamp data – both of these come up frequently in SQL job interviews at Discover Financial Services.
Beyond writing SQL queries, the other topics tested in the Discover Financial Services Data Science Interview include:
The best way to prepare for Discover Financial Services Data Science interviews is by reading Ace the Data Science Interview. The book's got: