9 American Airlines SQL Interview Questions (Updated 2025)

Updated on

March 28, 2025

At American Airlines data engineers use SQL to analyze passenger flight data and to optimize flight schedules, while juggling conditions like air traffic and weather conditions. That's why American Airlines usually asks SQL coding interview questions to see which candidates are prepared to take on this responsibility.

Thus, to help you prep for the American Airlines SQL interview, we've collected 9 American Airlines Group SQL interview questions in this blog.

American Airlines SQL Interview Questions

9 American Airlines Group SQL Interview Questions

SQL Question 1: Identifying VIP Customers for American Airlines

Based on the customer flight history for American Airlines, you are required to write a query that will identify the top 5 customers who have the highest average frequency of flights per month in the last year. A customer's flight frequency is counted by the number of flights they have taken in any given month.

Data for two tables are provided: customers and flights.

customers is structured as follows:
customer_idfull_namemembership_date
5437John Doe01/15/2015
9821Jane Smith07/29/2018
7373Emma Jones06/11/2019
8280Michael Brown10/26/2020
7353Olivia Taylor04/03/2016
flights is structured as follows:
flight_idcustomer_idflight_date
2547543708/03/2021
8753543708/15/2021
3672982108/05/2021
5246737308/23/2021
8362828008/01/2021

Answer:

A query that could solve this problem would look something like this:

SELECT c.customer_id, c.full_name, AVG(f.number_of_flights) AS avg_monthly_flights FROM customers c JOIN ( SELECT customer_id, COUNT(flight_id) AS number_of_flights, DATE_TRUNC('month', flight_date) AS month FROM flights WHERE flight_date >= (CURRENT_DATE - INTERVAL '1 year') GROUP BY customer_id, month ) f ON c.customer_id = f.customer_id GROUP BY c.customer_id, c.full_name ORDER BY avg_monthly_flights DESC LIMIT 5;

This query works by first creating a subquery (the portion of the query inside the round brackets) that groups the flights by customer and by month, and counts the number of flights for each group. It then joins this result back to the customers table to get the customer names, and calculates the average number of flights per month for each customer. The results are then ordered by this average in descending order and limited to the top 5 customers.

To work on a similar customer analytics SQL question where you can code right in the browser and have your SQL code instantly executed, try this Walmart SQL Interview Question: Walmart SQL Interview Question

SQL Question 2: Top 3 Department Salaries

Assume you had a table of American Airlines employee salary data. Write a SQL query to find the top three highest paid employees within each department.

American Airlines employees Example Input:

employee_idnamesalarydepartment_id
1Emma Thompson38001
2Daniel Rodriguez22301
3Olivia Smith20001
4Noah Johnson68002
5Sophia Martinez17501
8William Davis68002
10James Anderson40001

department Example Input:

department_iddepartment_name
1Data Analytics
2Data Science

Example Output:

department_namenamesalary
Data AnalyticsJames Anderson4000
Data AnalyticsEmma Thompson3800
Data AnalyticsDaniel Rodriguez2230
Data ScienceNoah Johnson6800
Data ScienceWilliam Davis6800

Write a SQL query for this interview question interactively on DataLemur:

Top 3 Department Salaries

Answer:

We use the DENSE_RANK() window function to generate unique ranks for each employee's salary within their department, with higher salaries receiving lower ranks. Then, we wrap this up in a CTE and filter the employees with a ranking of 3 or lower.

WITH ranked_salary AS ( SELECT name, salary, department_id, DENSE_RANK() OVER ( PARTITION BY department_id ORDER BY salary DESC) AS ranking FROM employee ) SELECT d.department_name, rs.name, rs.salary FROM ranked_salary AS rs INNER JOIN department AS d ON rs.department_id = d.department_id WHERE rs.ranking <= 3 ORDER BY d.department_id, rs.salary DESC, rs.name ASC;

If the code above is tough, you can find a step-by-step solution here: Top 3 Department Salaries.

SQL Question 3: Could you describe a self-join and provide a scenario in which it would be used?

A self-join is a type of JOIN where a table is joined to itself. To execute a self-join, you must include the table name twice in the FROM clause and assign a different alias to each instance. You can then join the two copies of the table using a JOIN clause, and use a WHERE clause to specify the relationship between the rows.

One common use case for self-joins is to compare data within a single table. For example, you might use a self-join to compare the salaries of employees within the same department, or to identify pairs of products that are frequently purchased together (like in this real Walmart SQL interview question)[https://datalemur.com/questions/frequently-purchased-pairs].

For a more concrete example, imagine you had website visitor data for American Airlines, exported from the company's Google Analytics account. To help the web-dev team, you needed to analyze pairs of pages for UX or navigational issues. As part of that analysis, you wanted to produce all pairs of URLs, but needed to exclude pairs where both URLs were the same since that is not a valid pair.

You could use the following self-join:

SELECT page1.url AS page_url, page2.url AS referred_from FROM google_analytics AS page1 JOIN google_analytics AS page2 ON page1.referrer_id = page2.id WHERE page1.id <> page2.id;

This query retrieves the url of each page (page1.url) along with the url of the page that referred to it (page2.url). The self-join is performed using the referrer_id field, which identifies the id of the page that referred the visitor to the current page, and excludes any pages that referred to themselves (i.e., data anomalies since pages shouldn't refer to themself).

American Airlines Group SQL Interview Questions

SQL Question 4: Analyze Flight Delays

You are given a table flights with information about flight departures and their delays. Each row represents a unique flight. For this scenario, a delay is considered anytime a flight departure is late by 15 minutes or more. American Airlines wants to analyze its flights' performance by evaluating the average delay from each airport for the past month.

The flights table has the below structure:

flights Example Input:
flight_idorigin_airportdestination_airportdeparture_datedelay_minutes
1LAXDFW2021-09-05 13:15:0020
2LAXJFK2021-09-05 15:23:000
3DFWORD2021-09-08 08:50:0035
4JFKLAX2021-09-08 10:00:0050
5LAXORD2021-09-10 16:10:0020

Calculate the monthly average delay in minutes for each origin airport.

Example Output:
monthorigin_airportavg_delay_minutes
9LAX13.33
9DFW35
9JFK50

Answer:

SELECT DATE_TRUNC('month', departure_date) as month, origin_airport, AVG(delay_minutes) as avg_delay_minutes FROM flights GROUP BY month, origin_airport ORDER BY month DESC, avg_delay_minutes DESC;

This SQL query uses the DATE_TRUNC function to convert the departure_date to the first day of the corresponding month, allowing us to group and compare flights by month. The AVG function is then used to calculate the average delay for each airport within each month.

To practice another window function question on DataLemur's free interactive SQL code editor, solve this Google SQL Interview Question: Google SQL Interview Question

SQL Question 5: Can you explain the distinction between an inner and a full outer join?

An inner join only includes rows from both tables that satisfy the join condition, whereas a full outer join includes all rows from both tables, regardless of whether they match the join condition or not.

For example, suppose you had a table of American Airlines orders and American Airlines customers.

INNER JOIN: When there is a match in the shared key or keys, rows from both tables are retrieved. In this example, an INNER JOIN between the Orders and Customers tables would retrieve rows where the customer_id in the Orders table matches the customer_id in the Customers table.

FULL OUTER JOIN: A FULL OUTER JOIN retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.

SQL Question 6: Filtering Customer Flight Data

You're a data analyst working for American Airlines. You have been tasked by your manager to filter out the customers who had travelled in the business class, who travelled from Dallas to New York during winter (December, January, February) of 2021, and who spent more than $1000 for the ticket.

A great example of this can be seen in American Airlines 2023 financial results!

Assuming that you have a flights table with the following records:

flights Example Input:
passenger_idflight_dateorigindestinationclassfare
47312021-01-12DallasNew YorkBusiness1250
57212021-01-20DallasNew YorkEconomy800
36292021-12-05DallasLos AngelesBusiness1000
81972021-02-18DallasNew YorkBusiness1500
98232021-02-25MiamiNew YorkBusiness1700
Example Output:
passenger_idflight_dateorigindestinationclassfare
47312021-01-12DallasNew YorkBusiness1250
81972021-02-18DallasNew YorkBusiness1500

Answer:

SELECT * FROM flights WHERE class = 'Business' AND origin = 'Dallas' AND destination = 'New York' AND DATE_PART('month', flight_date) IN (12, 1, 2) AND fare > 1000 AND DATE_PART('year', flight_date) = 2021;

This SQL query filters out the required details by using multiple WHERE conditions. DATE_PART function is used to filter out the winter months and year. These records are then displayed by the SELECT statement as the result of this query.

SQL Question 7: What's the purpose of a primary key?

A primary key is a special type of column or set of columns in a database table that is used to identify each row in the table. It helps to ensure that all the rows are distinct and prevents null values. Additionally, primary keys play a crucial role in maintaining the accuracy of the data by making sure that there are no duplicate rows. They are also used to establish relationships between tables.

To create a primary key in a SQL database, you can use the PRIMARY KEY constraint. For example, in the table american_airlines_employees, the EmployeeID column is the primary key.

CREATE TABLE american_airlines_employees ( EmployeeID INTEGER PRIMARY KEY, LastName VARCHAR(255), FirstName VARCHAR(255), Age INTEGER, Salary DECIMAL(8,2) );

SQL Question 8: Filtering Passenger Data

As a data analyst at American Airlines, you have been asked to extract customer records to gain insights into travel trends. The database contains a passengers table that ncludes columns such as id, name, address, email, and frequent_flyer. The email column contains email addresses of all customers and the frequent_flyer column indicates whether the customer is enrolled in the Frequent Flyer program.

Your task is to write a SQL query that filters out all customers whose email provider is 'gmail.com' and are enrolled in the Frequent Flyer program.

passengers Example Input:
idnameaddressemailfrequent_flyer
001John Doe123 Main Streetjohndoe@gmail.comyes
002Jane Smith456 Oak Avenuejanesmith@yahoo.comno
003Peter Parker789 Elm Drivepeterparker@gmail.comyes
004Miles Morales321 Pine Roadmilesmorales@hotmail.comno
005Tony Stark654 Maple Lanetonystark@gmail.comyes
Example Output:
idnameemail
001John Doejohndoe@gmail.com
003Peter Parkerpeterparker@gmail.com
005Tony Starktonystark@gmail.com

Answer:

SELECT id, name, email FROM passengers WHERE email LIKE '%gmail.com' AND frequent_flyer = 'yes';

This query selects the id, name, and email of passengers where the email ends with "gmail.com" and the frequent_flyer status is 'yes'. The '%' before 'gmail.com' is a wildcard character that matches any sequence of characters. The 'AND' operator ensures both conditions must be true. This way, we can identify frequent flyers who use Gmail.

SQL Question 9: Analyzing AA Flight Database

Imagine you are working for American Airlines as a Data Analyst. Your manager gave you two tables: flights and passengers tables. The flights table contains information about the various flights the company had over time and the passengers table contains information about the passengers who flew on those flights. {#Question-9}

Now your manager wants to know which flight had the highest number of passengers from New York (NY) in the year 2021. How would you write an SQL query to get this information?

flights Example Input:
flight_idflight_dateorigindestinationaircraft_type
12021-05-01NYLABoeing 737
22021-07-06LANYAirbus A380
32021-10-15NYMiamiAirbus A320
42021-11-02MiamiNYBoeing 747
passengers Example Input:
passenger_idflight_idname
1011John Doe
1021Jane Smith
1032Peter Parker
1043Tony Stark
1054Clark Kent
1061Bruce Wayne

Answer:

SELECT f.flight_id, f.origin, f.destination, COUNT(p.passenger_id) as total_passengers FROM flights f JOIN passengers p ON f.flight_id = p.flight_id WHERE f.origin = 'NY' AND EXTRACT(YEAR FROM f.flight_date) = 2021 GROUP BY f.flight_id ORDER BY total_passengers DESC LIMIT 1;

Writeup:

The SQL join operation is used to combine rows from two or more tables based on a related column between them. Here, we joined the flights and passengers tables on flight_id. We filtered the records by year of flight and origin using WHERE clause with EXTRACT function and match string. Afterwards we grouped the joined data by flight_id and counted the number of passengers per flight, ordering by the quantity. We only selected the top result with LIMIT 1 to return the flight with the most passengers from NY in 2021.

Because join questions come up frequently during SQL interviews, practice this Snapchat SQL Interview question using JOINS: Snapchat JOIN SQL interview question

How To Prepare for the American Airlines SQL Interview

The key to acing a American Airlines SQL interview is to practice, practice, and then practice some more! Beyond just solving the earlier American Airlines SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Microsoft, Google, Amazon, and tech startups. DataLemur SQL and Data Science Interview Questions

Each SQL question has multiple hints, step-by-step solutions and crucially, there is an online SQL code editor so you can right in the browser run your SQL query answer and have it checked.

To prep for the American Airlines SQL interview it is also a great idea to solve interview questions from other airlines like:

However, if your SQL skills are weak, don't worry about diving straight into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Scientists & Analysts.

SQL tutorial for Data Analytics

This tutorial covers SQL topics like filtering strings using LIKE and window functions like RANK() and ROW_NUMBER() – both of these come up frequently during SQL interviews at American Airlines.

American Airlines Group Data Science Interview Tips

What Do American Airlines Data Science Interviews Cover?

In addition to SQL interview questions, the other types of questions tested in the American Airlines Data Science Interview include:

American Airlines Data Scientist

How To Prepare for American Airlines Data Science Interviews?

The best way to prepare for American Airlines Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from FAANG, tech startups, and Wall Street
  • A Crash Course on Python, SQL & ML
  • Great Reviews (1000+ 5-star reviews on Amazon)

Acing Data Science Interview