At American Express, SQL is used across the company for analyzing transaction data for fraud detection, and segmenting customers for targeted marketing campaigns. Because of this, American Express typically asks SQL problems in interviews for Data Analytics, Data Science, and Data Engineering jobs.
Thus, to help you ace the American Express SQL interview, we'll cover 11 American Express SQL interview questions in this blog.
For American Express, a key activity may be the use of an American Express Card for transactions. 'Whale' or VIP users could be defined as customers who make multiple transactions above a certain amount threshold (e.g. $5,000). Given a transaction table, write a SQL query to identify these 'Whale' users.
transaction_id | customer_id | transaction_date | transaction_amount |
---|---|---|---|
001 | C001 | 08/01/2022 00:00:00 | 4800 |
002 | C002 | 08/02/2022 00:00:00 | 6800 |
003 | C003 | 08/03/2022 00:00:00 | 5000 |
004 | C001 | 08/10/2022 00:00:00 | 5200 |
005 | C002 | 08/22/2022 00:00:00 | 7000 |
customer_id | transaction_count |
---|---|
C001 | 2 |
C002 | 2 |
This query counts the number of 'big' transactions (greater than or equal to $5,000) for each customer using COUNT() with GROUP BY. We then filter (using the HAVING clause) for customers who have more than one 'big' transaction. These are the 'Whale' customers for American Express, per the provided definition.
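A sketch of that query, assuming the table is named `transactions` with the columns shown in the sample above:

```sql
-- Count each customer's transactions of $5,000 or more,
-- then keep only customers with more than one such transaction.
SELECT customer_id,
       COUNT(*) AS transaction_count
FROM transactions
WHERE transaction_amount >= 5000
GROUP BY customer_id
HAVING COUNT(*) > 1;
```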
To work on a similar customer analytics SQL question where you can code right in the browser and have your SQL solution automatically checked, try this Walmart Labs SQL Interview Question:
Given a table of American Express employee salaries, write a SQL query to find all employees who make more money than their direct boss.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | NULL |
2 | Daniel Rodriguez | 2230 | 1 | 10 |
3 | Olivia Smith | 8000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 8 |
5 | Sophia Martinez | 1750 | 1 | 10 |
8 | William Davis | 7000 | 2 | NULL |
10 | James Anderson | 4000 | 1 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
This is the output because Olivia Smith earns $8,000, surpassing her manager William Davis, who earns $7,000.
Test your SQL query for this question and run your code right in DataLemur's online SQL environment:
First, we perform a self-join on the employee table, joining it to itself so that one copy acts as the employees and the other as their managers. Then we use a WHERE clause to filter the results, ensuring we only keep employees whose salaries are higher than their manager's salary.
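A minimal sketch of that self-join, assuming the table is named `employees` with the columns shown in the sample above:

```sql
-- Join each employee row to the row of their direct manager,
-- then keep employees who out-earn that manager.
SELECT emp.employee_id,
       emp.name AS employee_name
FROM employees AS emp
INNER JOIN employees AS mgr
    ON emp.manager_id = mgr.employee_id
WHERE emp.salary > mgr.salary;
```

Employees with no manager (a NULL `manager_id`) drop out of the inner join automatically, which matches the question's focus on direct bosses.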
If the solution above is hard to understand, you can find a detailed solution with hints here: Employee Salaries Higher Than Their Manager.
Check out the American Express career page and see what job listing suits your skills!
Database normalization is the process of breaking down a table into smaller and more specific tables and defining relationships between them via foreign keys. This minimizes redundancy, and creates a database that's more flexible, scalable, and easier to maintain. It also helps to ensure the integrity of the data by minimizing the risk of data inconsistencies and anomalies.
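As a quick sketch (with hypothetical table and column names), normalizing order data might mean splitting one wide table into a `customers` table and an `orders` table linked by a foreign key:

```sql
-- Customer details live in one place instead of being
-- repeated on every order row.
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_city TEXT
);

-- Each order references its customer via a foreign key.
CREATE TABLE orders (
    order_id     INT PRIMARY KEY,
    customer_id  INT NOT NULL REFERENCES customers (customer_id),
    order_amount NUMERIC
);
```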
You are an analyst at American Express working with a database table that contains users' transaction data. Its columns are transaction_id, user_id, transaction_date, and transaction_amount.
Your task is to write a SQL query to calculate the average transaction amount per year for each client, where the years are in the range of 2018 to 2022.
transaction_id | user_id | transaction_date | transaction_amount |
---|---|---|---|
1 | 269 | 08/15/2018 | 500 |
2 | 478 | 11/25/2018 | 400 |
3 | 269 | 01/05/2019 | 1000 |
4 | 123 | 10/20/2020 | 600 |
5 | 478 | 07/05/2021 | 700 |
6 | 123 | 03/05/2022 | 900 |
year | user_id | avg_transaction_amount |
---|---|---|
2018 | 269 | 500 |
2018 | 478 | 400 |
2019 | 269 | 1000 |
2020 | 123 | 600 |
2021 | 478 | 700 |
2022 | 123 | 900 |
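A PostgreSQL query along these lines (assuming the table is named `transactions` and that transaction_date is a DATE column) produces that output:

```sql
-- Average transaction amount per user per year, 2018-2022.
SELECT EXTRACT(YEAR FROM transaction_date) AS year,
       user_id,
       AVG(transaction_amount) AS avg_transaction_amount
FROM transactions
WHERE EXTRACT(YEAR FROM transaction_date) BETWEEN 2018 AND 2022
GROUP BY year, user_id
ORDER BY year, user_id;
```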
A window function is not necessary in this case because the query does not perform any calculation over a window of rows; a plain aggregate with GROUP BY suffices. The PostgreSQL EXTRACT function pulls the year out of the transaction date, and we group by year and user_id to get each user's average transaction amount for each year between 2018 and 2022.
To solve a related window function SQL problem on DataLemur's free interactive SQL code editor, solve this Google SQL Interview Question:
A DBMS (database management system), in order to ensure transactions are reliable and don't ruin the integrity of the data, tries to maintain the following ACID properties: Atomicity, Consistency, Isolation, and Durability.
To make this concept more concrete, here is what each of the ACID properties would mean in the context of banking transactions:
- Atomicity: a money transfer either fully completes (one account is debited and the other credited) or doesn't happen at all; there is no state where only half of it occurred.
- Consistency: every transaction moves the database from one valid state to another, so rules like "a balance can't drop below the overdraft limit" always hold.
- Isolation: concurrent transactions don't interfere with each other; two simultaneous withdrawals from the same account behave as if they ran one after the other.
- Durability: once a transaction is committed, it survives crashes and power failures.
American Express, a multinational financial services corporation, has a database storing customer data. For marketing purposes, they need to filter out customers who are based in New York, have a credit score above 700, and their total transaction amount in the last year is more than $5000.
The task is to write a SQL query that returns the IDs of the customers who meet these conditions. Return the result in ascending order.
customer_id | name | location | credit_score |
---|---|---|---|
101 | John Doe | New York | 720 |
102 | Jane Doe | California | 680 |
103 | Michael Smith | New York | 750 |
104 | Emily Johnson | New York | 690 |
105 | William Brown | Texas | 710 |
transaction_id | customer_id | transaction_amount | transaction_date |
---|---|---|---|
201 | 101 | 6500 | 2021-10-22 |
202 | 102 | 4000 | 2021-07-15 |
203 | 103 | 5500 | 2021-08-31 |
204 | 104 | 4800 | 2021-11-05 |
205 | 105 | 5100 | 2021-09-20 |
customer_id |
---|
101 |
103 |
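One way to write it (assuming tables named `customers` and `transactions`, and treating "the last year" as the trailing year before the current date):

```sql
SELECT c.customer_id
FROM customers AS c
INNER JOIN (
    -- Total spend per customer over the trailing year.
    SELECT customer_id,
           SUM(transaction_amount) AS total_amount
    FROM transactions
    WHERE transaction_date >= CURRENT_DATE - INTERVAL '1 year'
    GROUP BY customer_id
) AS t ON t.customer_id = c.customer_id
WHERE c.location = 'New York'
  AND c.credit_score > 700
  AND t.total_amount > 5000
ORDER BY c.customer_id;
```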
In this query, we first build a subquery that totals each customer's transaction amounts over the last year. We then join this derived table with the customers table and filter for customers who are based in New York, have a credit score above 700, and have a last-year total above $5,000. Finally, we order the results by customer_id ascending.
A cross join is a JOIN operation in SQL that creates a new table by pairing each row from the first table with every row from the second table. It is also referred to as a cartesian join. In contrast, a natural join combines rows from two or more tables based on their common columns, forming a new table. Natural joins are called "natural" because they rely on the natural relationship between the common columns in the joined tables.
Here's an example of a cross join:
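For instance, pairing every card design with every membership tier (hypothetical `card_designs` and `tiers` tables):

```sql
-- Every design is paired with every tier, so the result has
-- (rows in card_designs) x (rows in tiers) rows.
SELECT d.design_name, t.tier_name
FROM card_designs AS d
CROSS JOIN tiers AS t;
```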
Here's a natural join example using two tables, American Express employees and American Express managers:
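A sketch, assuming the `employees` and `managers` tables share a common column (say, `manager_id`):

```sql
-- Rows are matched automatically on every column name
-- the two tables have in common (here, manager_id).
SELECT *
FROM employees
NATURAL JOIN managers;
```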
This natural join returns the rows from American Express employees and managers where the tables' common column(s) contain equal values.
One significant difference between cross joins and natural joins is that the former do not require common columns between the tables being joined, while the latter do. Another distinction is that a cross join can generate a very large result set (its row count is the product of the two input tables' row counts), while a natural join produces only as many rows as there are matches on the common columns.
American Express runs several marketing campaigns over different channels. The data from these campaigns is stored in two tables:
campaigns: This table has details of each campaign such as campaign_id, channel (E.g., Email, Web, App), and date.
clicks: This table has data of the customer actions. Anytime a customer clicks on a campaign, it records the campaign_id, customer_id, and the action (i.e., whether the action was 'Viewed', 'Clicked', or 'Converted').
Given this data, write a SQL query to calculate the click-through-rate (CTR) of each campaign. The CTR is the ratio of customers who clicked on a campaign to the number of total customers who viewed the campaign: CTR = (number of 'Clicked' actions / number of 'Viewed' actions) x 100%.
campaign_id | channel | date |
---|---|---|
1 | Email | 07/01/2022 |
2 | Web | 07/02/2022 |
3 | App | 07/03/2022 |
campaign_id | customer_id | action |
---|---|---|
1 | 3452 | Viewed |
1 | 4563 | Clicked |
2 | 3765 | Viewed |
2 | 8765 | Clicked |
2 | 1234 | Viewed |
3 | 3452 | Clicked |
3 | 5632 | Viewed |
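A query along those lines (assuming the tables are named `campaigns` and `clicks`):

```sql
-- CTR per campaign, as a percentage. NULLIF guards against
-- dividing by zero when a campaign has no 'Viewed' rows.
SELECT cam.campaign_id,
       100.0 * SUM(CASE WHEN cl.action = 'Clicked' THEN 1 ELSE 0 END)
             / NULLIF(SUM(CASE WHEN cl.action = 'Viewed' THEN 1 ELSE 0 END), 0)
           AS ctr_percent
FROM campaigns AS cam
INNER JOIN clicks AS cl
    ON cl.campaign_id = cam.campaign_id
GROUP BY cam.campaign_id;
```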
This SQL query first joins the two tables on campaign_id. It then creates sums of 'Clicked' and 'Viewed' actions using CASE WHEN clauses in the SELECT statement. It calculates the click-through-rate for each campaign by dividing the sum of 'Clicked' actions by the sum of 'Viewed' actions.
To practice a similar SQL interview question on DataLemur's free online SQL coding environment, attempt this Facebook SQL Interview question:
As a SQL specialist at American Express, your task is to analyze customers' card usage patterns. Write a SQL query that finds the average cost of the transactions cardholders make each month.
For this question, we'll presume there's a table that stores the card transaction information. Here is a sample of the table:
transaction_id | card_holder_id | transaction_date | transaction_cost |
---|---|---|---|
1121 | 005 | 01/04/2022 00:00:00 | 100.50 |
5432 | 020 | 01/06/2022 00:00:00 | 250.73 |
3478 | 020 | 02/10/2022 00:00:00 | 305.52 |
9991 | 033 | 02/28/2022 00:00:00 | 80.34 |
5674 | 005 | 03/20/2022 00:00:00 | 120.76 |
You want to get the output in the following format:
month | card_holder | avg_transaction_cost |
---|---|---|
1 | 005 | 100.50 |
1 | 020 | 250.73 |
2 | 020 | 305.52 |
2 | 033 | 80.34 |
3 | 005 | 120.76 |
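A PostgreSQL query along these lines (the table name `card_transactions` is an assumption) produces that shape:

```sql
-- Average transaction cost per card holder per month.
SELECT EXTRACT(MONTH FROM transaction_date) AS month,
       card_holder_id AS card_holder,
       AVG(transaction_cost) AS avg_transaction_cost
FROM card_transactions
GROUP BY month, card_holder
ORDER BY month, card_holder;
```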
In the provided PostgreSQL query, we use the EXTRACT function to get the month from the transaction_date column. Then, we group the records by month and card_holder_id to calculate each card holder's average transaction cost per month. This information helps in understanding cardholders' spending patterns.
The CHECK constraint is used to set a rule for the data in a column. If a row is inserted or updated and the data in the column does not follow the rule specified by the CHECK constraint, the operation will fail. The CHECK constraint is often used in conjunction with other constraints, such as NOT NULL or UNIQUE.
You might consider implementing the CHECK constraint in your database if you want to ensure that certain data meets specific conditions. This can be helpful for maintaining the quality and reliability of your data.
For example, you might use a CHECK constraint to ensure that a column contains only positive numbers, or that a date is within a certain range.
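Both of those cases can be sketched in one table definition (hypothetical table and column names):

```sql
CREATE TABLE card_transactions (
    transaction_id     INT PRIMARY KEY,
    -- Reject zero or negative amounts.
    transaction_amount NUMERIC NOT NULL CHECK (transaction_amount > 0),
    -- Reject dates outside the supported range.
    transaction_date   DATE CHECK (
        transaction_date BETWEEN DATE '2000-01-01' AND DATE '2099-12-31'
    )
);
```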
As part of marketing efforts, American Express is trying to identify trends in what types of email providers its customers use. In the customer records database, the email of each customer is stored. Write a SQL query that will return a list of all customers who use a specific email provider for their account (e.g., gmail.com).
Assume there is a table as shown below:
customer_id | first_name | last_name | email_address |
---|---|---|---|
1 | John | Doe | john.doe@gmail.com |
2 | Jane | Smith | jane_smith@yahoo.com |
3 | James | Brown | james_brown@outlook.com |
4 | Emily | Davis | emily_davis@gmail.com |
5 | Robert | Johnson | robert_johnson@company.com |
The task is to find customers who use gmail.com as their email provider.
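Assuming the table is named `customers`, the query could look like:

```sql
-- Match any email address that ends with '@gmail.com'.
SELECT first_name, last_name, email_address
FROM customers
WHERE email_address LIKE '%@gmail.com';
```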
This query uses the LIKE operator together with the % wildcard to match any email address that ends in '@gmail.com'. The % symbol represents zero, one, or more characters, so the pattern only matches rows where the email address ends with '@gmail.com'.
first_name | last_name | email_address |
---|---|---|
John | Doe | john.doe@gmail.com |
Emily | Davis | emily_davis@gmail.com |
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. In addition to solving the above American Express SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Microsoft, Google, Amazon, and tech startups.
Each interview question has hints to guide you, detailed solutions, and best of all, an interactive SQL code editor so you can write your SQL query right in the browser and have it executed.
To prep for the American Express SQL interview it is also helpful to practice SQL problems from other payment & credit companies like:
In case your SQL foundations are weak, don't jump straight into solving questions; first strengthen your SQL foundations with this DataLemur SQL tutorial.
This tutorial covers things like LEAD/LAG window functions and creating summary stats with GROUP BY – both of these show up often during American Express SQL interviews.
In addition to SQL query questions, the other topics tested in the American Express Data Science Interview include:
I'm sort of biased, but I believe the best way to prep for American Express Data Science interviews is to read the book Ace the Data Science Interview.
The book covers 201 data interview questions sourced from Google, Microsoft, and tech startups. It also has a crash course covering SQL, Product-Sense, and ML. And finally, it's helped a TON of people, which is how it became an Amazon best-seller with over 1,000 5-star reviews.