At Xero, SQL does the heavy lifting for analyzing product analytics data generate from their suite of accounting and bookkeeping products. Unsurprisingly this is why Xero often tests SQL problems during interviews for Data Analyst, Data Science, and BI jobs.
To help you study for the Xero SQL interview, we've collected 8 Xero SQL interview questions – able to answer them all?
Xero, an online accounting software company, relies heavily on engagement as a key factor of its business model. The most valuable users for Xero are those who use the product very frequently. We'll call those users "whale" users. In this scenario, a clear indicator of frequent use is the number of transactions a user makes within a given month. For this exercise, we want to identify the top 5 users with the most number of transactions for each month in the year 2022.
transaction_id | user_id | transaction_date | amount |
---|---|---|---|
1250 | 567 | 01/02/2022 | 150 |
1300 | 234 | 01/03/2022 | 200 |
1352 | 567 | 01/05/2022 | 75 |
1400 | 234 | 01/08/2022 | 100 |
1450 | 879 | 01/10/2022 | 300 |
1500 | 123 | 02/05/2022 | 350 |
1550 | 234 | 02/06/2022 | 250 |
1600 | 567 | 02/07/2022 | 400 |
1650 | 123 | 02/10/2022 | 300 |
1700 | 679 | 02/15/2022 | 100 |
The output should include the month, user_id and count of transactions with the highest transactions count first.
This query first groups the transactions by month and user and counts the number of transactions for each every month, it then orders the output by month and transaction_count in descending order. The at the end ensures that only the top 5 users (in terms of transaction counts) for each month are included in the final output.
To solve a super-customer analysis question on DataLemur's free online SQL code editor, try this Microsoft Azure Cloud SQL Interview Question:
Since Xero is a software company providing cloud-based accounting software, a relevant data set could be information around subscription sales. Let's say every new subscription is associated with a sales rep who closed the deal.
You are asked to calculate the running total of monthly subscription sales and the monthly ranking of sales reps by the number of closed sales.
Here are two relevant tables. "subscriptions" have details of each subscription sale, including the date of sale and its associated sales rep. "sales_reps" contains the information about the sales reps.
"subscriptions" table:
subscriptions Example Input:
"sales_reps" table:
sales_reps Example Input:
This query first groups the sales by month and sales rep, calculating the total sales for each rep for each month. It then ranks these total sales within each month so we can see who were the top performers. The final query combines these rankings with the names of the reps from the "sales_rep" table, and orders the results by month and rank.
Example Output: (Assuming the table "subscriptions" have more data)
month | rep | total_sales | rank |
---|---|---|---|
June 2022 | Eve | 250 | 1 |
June 2022 | Charlie | 200 | 2 |
July 2022 | Alice | 200 | 1 |
July 2022 | David | 75 | 2 |
Pro Tip: Window functions are a popular SQL interview topic, so practice every window function problem on DataLemur
A join in SQL combines rows from two or more tables based on a shared column or set of columns.
Four types of JOINs exist in SQL. To demonstrate each one, say you had a table of Xero orders and Xero customers.
INNER JOIN: When there is a match in the shared key or keys, rows from both tables are retrieved. In this example, an between the Orders and Customers tables would retrieve rows where the in the Orders table matches the in the Customers table.
LEFT JOIN: A retrieves all rows from the left table (in this case, the Orders table) and any matching rows from the right table (the Customers table). If there is no match in the right table, NULL values will be returned for the right table's columns.
RIGHT JOIN: A retrieves all rows from the right table (in this case, the Customers table) and any matching rows from the left table (the Orders table). If there is no match in the left table, NULL values will be returned for the left table's columns.
FULL OUTER JOIN: A retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be returned for the columns of the non-matching table.
Xero is a cloud-based accounting software platform for small and medium-sized businesses. Its various products manage invoicing, bank reconciliation, bookkeeping and more. Assume you have a 'customers' table that includes details like the customer's name, the date they joined, and their total outstanding balance, plus an 'invoices' table that has invoices details including the invoice number, the customer they are linked to, the date of issue, and whether or not they are paid.
Your goal is to write a query that lists all customers from the 'customers' table who have joined after January 1, 2025, and have at least one unpaid invoice in the 'invoices' table.
customer_id | name | join_date | outstanding_balance |
---|---|---|---|
6542 | ABC Corp | 2023-06-25 | 2000 |
7125 | XYZ Ltd | 2025-02-18 | 5000 |
8193 | DEF Industries | 2025-07-13 | 1500 |
6310 | GHI LLC | 2026-01-04 | 3500 |
9051 | JKL Inc | 2025-11-29 | 2500 |
invoice_id | customer_id | issue_date | is_paid |
---|---|---|---|
1201 | 6542 | 2023-07-01 | true |
1282 | 7125 | 2025-02-27 | false |
1393 | 8193 | 2025-08-01 | true |
1404 | 6310 | 2026-01-11 | false |
1501 | 9051 | 2025-12-03 | false |
1601 | 7125 | 2025-11-16 | true |
customer_id | name | join_date | outstanding_balance |
---|---|---|---|
7125 | XYZ Ltd | 2025-02-18 | 5000 |
6310 | GHI LLC | 2026-01-04 | 3500 |
9051 | JKL Inc | 2025-11-29 | 2500 |
The query joins the 'customers' and 'invoices' tables on the customer_id, then filters for customers who joined after January 1, 2025, and have unpaid invoices. The resulting table displays the ID, name, join date, and outstanding balance of the selected customers.
A foreign key is a field in a database table that links to the primary key of another table, establishing a connection between the two tables. For example, let's analyze Xero's Google Ads campaigns data:
:
+------------+------------+------------+------------+ | ad_id | campaign_id| keyword | click_count| +------------+------------+------------+------------+ | 1 | 201 | Xero reviews | 120 | | 2 | 202 | Xero pricing | 150 | | 3 | 101 | buy Xero | 65 | | 4 | 101 | Xero alternatives | 135 | +------------+------------+------------+------------+
is a foreign key that connects to the of the corresponding Google Ads campaign. This establishes a relationship between the ads and their campaigns, enabling easy querying to find which ads belong to a specific campaign or which campaigns a specific ad belongs to.
The table may also have multiple foreign keys that reference primary keys in other tables. For example, and foreign keys could be used to link each ad to its ad group and the Google Ads account that the campaigns belong to, respectively.
At Xero, we cater to businesses of various industries by providing them with online accounting software. We provide various products including Accounting & Invoices, Payroll, Projects, and Expenses. We'd like to understand our monthly sales figures by different products to better track our sales performance.
Suppose we have a table with the following columns:
sale_id | user_id | sale_date | product_id | sale_amount |
---|---|---|---|---|
1001 | 1235 | 02/21/2022 | 9001 | $75 |
1002 | 4567 | 02/25/2022 | 9002 | $120 |
1003 | 2315 | 03/10/2022 | 9001 | $75 |
1004 | 8743 | 03/15/2022 | 9002 | $120 |
1005 | 1235 | 04/01/2022 | 9001 | $75 |
The question is to write a SQL query that provides the average monthly sales by each product.
This query groups rows by the month of and . For each of these groups, it calculates the average sale amount using the aggregate function. We use function to extract the month from . The result is ordered by and within that, by the month.
mth | product_id | avg_sales |
---|---|---|
02 | 9001 | $75.00 |
02 | 9002 | $120.00 |
03 | 9001 | $75.00 |
03 | 9002 | $120.00 |
04 | 9001 | $75.00 |
A cross-join, also known as a cartesian join, is a JOIN that produces the cross-product of two tables. In a cross-join, each row from the first table is matched with every row from the second table, resulting in a new table with a row for each possible combination of rows from the two input tables.
Suppose you were building a Neural Network ML model, that tried to score the probability of a customer buying a Xero product. Before you started working in Python and Tensorflow, you might want to do some Exploratory Data Analysis (EDA) in SQL, and generate all pairs of customers and Xero products.
Here's a cross-join query you could use to find all the combos:
Cross-joins are great for generating all possible combinations, but they can also create really big tables if you're not careful. For example, if you had 10,000 potential customers, and Xero had 500 different product SKUs, you'd get 5 million rows as a result!!
In this question, we are given two tables. The table provides personal data about each customer, such as their , , , and . The table logs all purchases made by customers, with columns for , , , , and .
The task is to write a SQL query that joins these two tables and determines the total quantity of each product purchased by each customer using only their and .
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | john.doe@example.com |
2 | Jane | Smith | jane.smith@example.com |
3 | Bob | Johnson | bob.johnson@example.com |
purchase_id | customer_id | product_id | quantity | purchase_date |
---|---|---|---|---|
9782 | 1 | 251 | 1 | 06/18/2022 12:35:00 |
9845 | 2 | 372 | 3 | 06/20/2022 16:22:00 |
9892 | 1 | 251 | 2 | 06/21/2022 10:15:00 |
9937 | 3 | 483 | 5 | 06/22/2022 14:40:00 |
9993 | 2 | 372 | 1 | 06/23/2022 15:00:00 |
first_name | last_name | product_id | total_quantity |
---|---|---|---|
John | Doe | 251 | 3 |
Jane | Smith | 372 | 4 |
Bob | Johnson | 483 | 5 |
Here is the PostgreSQL query that would yield the desired result:
This query first performs an inner join between the and tables, using as the join condition. This links each purchase to the customer who made it.
Then, it groups the joined data by , , and . For each group, it calculates the total quantity of purchases made.
Finally, the results are sorted in descending order by . This means the customers who bought the most of a certain product will appear first in the results.
Since joins come up so often during SQL interviews, try this interactive Snapchat SQL Interview question using JOINS:
Assuming that you've already got basic SQL skills, the next best tip we have to prepare for the Xero SQL interview is to solve as many practice SQL interview questions as you can! Besides solving the above Xero SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Microsoft, Google, Amazon, and tech startups.
Each DataLemur SQL question has hints to guide you, step-by-step solutions and crucially, there is an online SQL code editor so you can instantly run your query and have it executed.
To prep for the Xero SQL interview it is also useful to practice SQL questions from other tech companies like:
However, if your SQL skills are weak, forget about going right into solving questions – strengthen your SQL foundations with this SQL tutorial for Data Analytics.
This tutorial covers SQL concepts such as Union vs. UNION ALL and transforming strings with CONCAT()/LOWER()/TRIM() – both of which pop up frequently in Xero SQL interviews.
Besides SQL interview questions, the other topics tested in the Xero Data Science Interview are:
To prepare for Xero Data Science interviews read the book Ace the Data Science Interview because it's got: