9 Atlassian SQL Interview Questions (Updated 2024)

At Atlassian, SQL crucial for extracting and analyzing user behavior data to enhance software tools, and managing databases for efficient retrieval and update of product information. Unsurprisingly this is why Atlassian asks SQL problems during interviews for Data Science, Data Engineering and Data Analytics jobs.

So, if you're studying for a SQL Interview, we've curated 9 Atlassian SQL interview questions to practice, which are similar to recently asked questions at Atlassian – how many can you solve?

9 Atlassian SQL Interview Questions

SQL Question 1: Average Bug Resolutions Time by Teams

Atlassian provides products like JIRA that helps keep track of tasks, bugs and issues among other features. Assume you have two tables, teams and bugs. The teams table contains information about each team and its team_id, and the bugs table contains information about each bug, its bug_id, the team who resolved it and the start and resolution date.

The objective is to write a SQL query to calculate the average resolution time for bugs for each team.

teams Example Input:
team_idteam_name
1Team Alpha
2Team Bravo
3Team Charlie
bugs Example Input:
bug_idteam_idcreated_atresolved_at
112018-06-09 00:00:002018-06-12 00:00:00
212018-06-10 00:00:002018-06-15 00:00:00
322018-06-11 00:00:002018-06-13 00:00:00
422018-06-12 00:00:002018-06-16 00:00:00
532018-06-13 00:00:002018-06-15 00:00:00

Answer:

The SQL query will perform a JOIN to connect the two tables on team_id, calculate the difference between the resolved_at and created_at for each bug, then group by team_id to calculate average resolution time.

Here's a possible PostgreSQL query to solve this:

SELECT t.team_name, AVG(EXTRACT(EPOCH FROM (b.resolved_at - b.created_at)))/3600 AS average_resolution_time_hours FROM bugs b JOIN teams t ON b.team_id = t.team_id GROUP BY t.team_name ORDER BY average_resolution_time_hours;

This query first measures resolution time in seconds for each bug by subtracting created_at from resolved_at and converting the interval to seconds using the EXTRACT(EPOCH FROM interval) function. It then averages these resolution times by team. The output will be a list of teams and their average resolution time in hours. Please note that this query calculates the time in hours. To show the time in days or weeks, adjust the divisor accordingly.

To practice a similar problem about calculating rates, solve this TikTok SQL question within DataLemur's online SQL code editor: Signup Activation Rate SQL Question

SQL Question 2: Determining Most Frequently Used Atlassian Products

Atlassian provides a range of software development tools. Suppose we have a database that tracks the usage of each tool by different users. We are interested in finding out which tool was most frequently used over the last month.

To track this usage, we will use a simple structure: a 'User' table that stores the user data and a 'Usage' table that stores each occurrence of a user interacting with a software tool.

users Sample Input:
user_iduser_name
1John
2Paul
3George
4Ringo
usage Sample Input:
usage_iduser_idproduct_namedate
11Jira06/22/2022
22Jira06/15/2022
31Confluence06/02/2022
43Bitbucket06/15/2022
54Trello06/25/2022
63Bitbucket06/30/2022
71Jira06/19/2022
82Jira06/15/2022
93Bitbucket06/15/2022
101Confluence06/23/2022

Your task is to write a query that will return the name of the software tool that was most frequently used over the last month and the total number of usages.

Answer:

SELECT product_name, count(*) AS usage_count FROM usage WHERE EXTRACT(MONTH FROM date) = EXTRACT(MONTH FROM DATE_TRUNC('month', NOW())) GROUP BY product_name ORDER BY usage_count DESC LIMIT 1;

This query groups the records in the 'usage' table by 'product_name', counts the number of occurrences for each group, and limits the result to the one with the highest count. The WHERE clause ensures that only the records from the current month are considered. The EXTRACT function is used to get the month part of the date in PostgreSQL. The DATE_TRUNC function returns the first day of the current month. If there are ties for the most frequent usage, this query will return one of them. If all usages should be returned in case of ties, the LIMIT clause can be removed.

To solve a similar problem on DataLemur's free online SQL code editor, try this SQL interview question asked by Facebook: SQL interview question asked by Facebook

SQL Question 3: What is the purpose of the SQL constraint UNIQUE?

{#Question-3}

The UNIQUE constraint makes sure that all values in a column are distinct. It is often paired with other constraints, like NOT NULL, to ensure that the data follows certain rules.

For example, say you were an analyst on the marketing team at Atlassian, and had access to a database on marketing campaigns:

CREATE TABLE atlassian_campaigns ( campaign_id INTEGER PRIMARY KEY, campaign_name VARCHAR(255) NOT NULL UNIQUE, start_date DATE NOT NULL, end_date DATE NOT NULL, budget DECIMAL(10,2) NOT NULL );

In this example, the UNIQUE constraint is applied to the "campaign_name" field to ensure that each campaign has a unique name. This helps to ensure the integrity of the data in the atlassian_campaigns table and prevents errors that could occur if two campaigns had the same name.

Atlassian SQL Interview Questions

SQL Question 4: Average Duration of Resolved Issues

Atlassian, being a software development company, often uses issue tracking systems like Jira. Let's assume your role would involve analyzing patterns across resolved issues. The question is: "What is the average duration in days it takes for each project to resolve an issue?"

issues Example Input:
issue_idproject_idstatusreported_dateresolved_date
151789resolved01/02/202203/04/2022
262456resolved01/05/202202/06/2022
734789open02/03/2022null
456123resolved06/01/202206/20/2022
789456resolved08/01/202208/30/2022
102123resolved05/23/202207/14/2022
Example Output:
project_idavg_duration
78960.00
45664.00
12378.50

Answer:

SELECT project_id, AVG(resolved_date - reported_date) AS avg_duration FROM issues WHERE status = 'resolved' GROUP BY project_id;

This query first filters by the status of the issues, excluding any issues that aren't resolved. It then subtracts the reported date from resolved date to get the duration it took to resolve each issue. It groups by project_id and finds the average duration for each group (i.e., each project). The AVG function takes into account that the dates are treated as numbers in this context, providing the average duration in days.

To practice a related SQL problem on DataLemur's free online SQL coding environment, solve this Facebook SQL Interview question: SQL interview question asked by Facebook

SQL Question 5: Can you define what a database index is, and give some examples of different types of indexes?

{#Question-5}

A database index is a way to optimize the performance of a database by reducing the amount of data that needs to be searched to retrieve a record.

There are several types of indexes that can be used in a database:

  1. Primary index: a unique identifier is used to access the row directly.
  2. Unique index: used to enforce the uniqueness of the indexed columns in a table.
  3. Composite index: created on multiple columns of a table, is used to speed up the search process for multiple columns
  4. Clustered index: determines the physical order of the data in a table

For a concrete example, say you had a table of Atlassian customer payments with the following columns: payment_id, customer_id, payment_amount, and payment_date.

Here's what a clustered index on the payment_date column would look like:

CREATE CLUSTERED INDEX payment_date_index ON atlassian_customer_payments (payment_date)

A clustered index on the payment_date column would determine the physical order of the records in the table based on the payment_date. This means that the records with the earliest payment_date values would be stored together physically in the table, followed by records with later payment_date values.

Having a clustered index on the payment_date column can speed up queries that filter or sort the data based on the payment_date, as the records are already physically sorted in the table. For example, if you want to retrieve all the payments made in the month of January, the database can use the clustered index to quickly locate and retrieve the desired records without having to search through the entire table.

SQL Question 6: Average Monthly Sales Per Product

Atlassian sells a variety of software development products. As a data analyst at Atlassian, you are asked to find out the average monthly sales per product. To answer this, you'll need to group the data by month and product, and then calculate the average sales using the AVG function.

sales Example Input:
sale_idproduct_idsale_dateunit_pricequantity
1Jira06/08/2022 00:00:001005
2Confluence06/10/2022 00:00:002003
3Bitbucket06/18/2022 00:00:005010
4Jira07/12/2022 00:00:001008
5Bitbucket07/22/2022 00:00:00504
Example Output:
monthproductavg_sales
6Jira500
6Confluence600
6Bitbucket500
7Jira800
7Bitbucket200

Answer:

SELECT EXTRACT(MONTH FROM sale_date) as month, product_id as product, AVG(unit_price * quantity) as avg_sales FROM sales GROUP BY month, product ORDER BY month, product;

This query functions by first extracting the month from each sale date using the EXTRACT function. It then groups the data by month and product id, and finally calculates the average sales for each group using PostgreSQL's AVG function. The ORDER BY clause is used to order the results first by month, then by product for easier readability.

To practice another question about calculating rates, try this SQL interview question from TikTok within DataLemur's online SQL code editor: Signup Activation Rate SQL Question

SQL Question 7: What sets a cross join apart from a natural join?

A cross join is a type of JOIN operation in SQL that creates a new table by combining each row from the first table with every row from the second table. It is also known as a cartesian join.

For example, say you worked on the Marketing Analytics team at <comapany_name>, and needed to understand what advertising copy (text) goes best with what advertising creative (the photo that gets posted with the advertisement copy).

Here is an example of a cross-join between a table of ad_copy and a table of ad_creative:

SELECT ad_copy.copy AS ad_copy, ad_creative.image_url AS ad_creative FROM ad_copy CROSS JOIN ad_creative;

A natural join, on the other hand, is a type of JOIN that combines rows from two or more tables based on their common columns. It is called a "natural" join because it is based on the natural relationship that exists between the common columns in the tables being joined.

For an example of each one, say you had sales data exported from Atlassian's Salesforce CRM stored in a datawarehouse which had two tables: sales and atlassian_customers.

An INNER JOIN (which is a type of natural join) combines the two tables on the common customer_id field

SELECT * FROM sales INNER JOIN atlassian_customers ON sales.customer_id = atlassian_customers.id

This query will return rows from the sales and atlassian_customers tables that have matching customer id values. Only rows with matching customer_id values will be included in the results.

One main difference between cross joins and natural joins is that cross joins do not require any common columns between the tables being joined, while natural joins do. Another difference is that cross joins can create very large tables if the input tables have a large number of rows, while natural joins will only return a table with the number of rows equal to the number of matching rows in the input tables.

SQL Question 8: Customer and Product Analysis

You are given two tables, customers and products. The customers table holds customer_id, region and signup_time. The products table contains product_id, customer_id and purchase_time.

Write a SQL query to find out the number of customers from each region who made their first purchase within one month of signing up.

customers Example Input:
customer_idregionsignup_time
1APAC2022-01-01 00:00:00
2AMER2022-02-01 00:00:00
3EMEA2022-03-10 00:00:00
4AMER2022-04-15 00:00:00
5APAC2022-05-20 00:00:00
products Example Input:
product_idcustomer_idpurchase_time
100112022-01-20 00:00:00
200222022-03-05 00:00:00
300332022-03-15 00:00:00
400442022-04-02 00:00:00
500522022-04-01 00:00:00

Answer:

SELECT c.region, count(DISTINCT p.customer_id) as num_customers FROM customers c JOIN products p ON c.customer_id = p.customer_id WHERE p.purchase_time BETWEEN c.signup_time AND c.signup_time + INTERVAL '1 month' GROUP BY c.region;

This query joins the customers and products table on the customer_id. It then filters the rows where the purchase_time is within one month of the signup_time. It finally groups the data by region and counts the distinct number of customer_ids for each group. This gives us the number of customers from each region who made their first purchase within one month of signing up.

To solve another question about calculating rates, solve this SQL interview question from TikTok on DataLemur's online SQL code editor: TikTok SQL Interview Question

SQL Question 9: Monthly Average User Activity

Atlassian is a software company that offers several popular products like Jira and Confluence which are used by development teams to track issues and manage documents respectively. They would be interested in understanding how users are utilizing their tools over time.

Suppose you are given a table called activity_logs containing log entries each time a user creates an issue in Jira or edits a document in Confluence. The table includes columns for log_id (a unique identifier for the log entry), user_id (identifying the user who made the entry), product (the name of the product that was used, either 'Jira' or 'Confluence'), timestamp (the date and time of the activity), and duration (the time in minutes the user spent on the activity).

Write a SQL query to find the average duration of user activity for each product for each month of the year 2022.

activity_logs Example Input:
log_iduser_idproducttimestampduration
400156Jira01/15/2022 10:00:0015
789678Confluence02/05/2022 11:30:0030
644598Jira03/10/2022 16:00:0045
210956Confluence04/12/2022 09:00:0060
520767Jira05/17/2022 14:00:00120
372882Confluence06/26/2022 15:00:0010
124698Jira07/02/2022 13:30:0015
589467Confluence08/31/2022 10:00:0020
Example Output:
monthproductavg_duration
1Jira15.00
2Confluence30.00
3Jira45.00
4Confluence60.00
5Jira120.00
6Confluence10.00
7Jira15.00
8Confluence20.00

Answer:

SELECT EXTRACT(MONTH FROM timestamp) as month, product, AVG(duration) as avg_duration FROM activity_logs WHERE EXTRACT(YEAR FROM timestamp) = 2022 GROUP BY month, product ORDER BY month, product;

This query first filters out activities from 2022. It then groups the data by the month and product, and calculates the average duration for each group. The result is ordered by month and then product.

To solve a related problem on DataLemur's free online SQL coding environment, attempt this Facebook SQL Interview question: Meta SQL interview question

How To Prepare for the Atlassian SQL Interview

The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the earlier Atlassian SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Microsoft, Google, and Facebook. DataLemur Questions

Each SQL question has hints to guide you, full answers and best of all, there's an interactive coding environment so you can right in the browser run your query and have it graded.

To prep for the Atlassian SQL interview you can also be helpful to practice SQL problems from other tech companies like:

In case your SQL coding skills are weak, forget about diving straight into solving questions – improve your SQL foundations with this free SQL for Data Analytics course.

DataLemur SQL Course

This tutorial covers topics including CASE/WHEN statements and different types of joins – both of these show up frequently in Atlassian SQL interviews.

Atlassian Data Science Interview Tips

What Do Atlassian Data Science Interviews Cover?

For the Atlassian Data Science Interview, in addition to SQL query questions, the other types of questions to prepare for are:

  • Probability & Stats Questions
  • Coding Questions in Python or R
  • Data Case Study Questions
  • Machine Learning and Predictive Modeling Questions
  • Behavioral Interview Questions

Atlassian Data Scientist

How To Prepare for Atlassian Data Science Interviews?

The best way to prepare for Atlassian Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from Facebook, Google & startups
  • A Crash Course covering Python, SQL & ML
  • Great Reviews (900+ 5-star reviews on Amazon)

Ace the Data Science Interview by Nick Singh Kevin Huo