At Asana, SQL is crucial for analyzing user behavior data to improve product features and generating business intelligence reports to drive strategic decision-making. That's why Asana almost always evaluates jobseekers on SQL query questions in interviews for Data Science, Analytics, and Data Engineering jobs.
As such, to help you study for the Asana SQL interview, this blog covers 9 Asana SQL interview questions – can you answer each one?
As a part of the data analysis team in Asana, you are tasked with analyzing user behavior. To help improve the platform, you are asked to calculate the average time (in days) it takes for tasks to be completed by users on a monthly basis. You need to consider the tasks created and completed within the window of each calendar month.
task_id | user_id | create_date | complete_date |
---|---|---|---|
3601 | 21 | 08/01/2022 00:00:00 | 08/05/2022 00:00:00 |
5231 | 15 | 08/03/2022 00:00:00 | 08/15/2022 00:00:00 |
1029 | 34 | 08/15/2022 00:00:00 | 08/25/2022 00:00:00 |
4452 | 21 | 09/02/2022 00:00:00 | 09/08/2022 00:00:00 |
3067 | 15 | 09/05/2022 00:00:00 | 09/07/2022 00:00:00 |
mth | avg_completion_time |
---|---|
8 | 8.67 days |
9 | 4.00 days |
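Here's one way to write this in PostgreSQL. The table name tasks below is an assumption, since the question doesn't name the table:

```sql
SELECT
  EXTRACT(MONTH FROM create_date) AS mth,
  AVG(complete_date - create_date) AS avg_completion_time
FROM tasks  -- assumed table name
-- only keep tasks completed in the same calendar month they were created
WHERE DATE_TRUNC('month', create_date) = DATE_TRUNC('month', complete_date)
GROUP BY mth
ORDER BY mth;
```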
This SQL query extracts the month part from the create_date and groups by it, taking the average of the difference between the complete_date and create_date. It only includes the tasks completed within the same month they were created. The result is ordered by month to see the trend of average completion time over the months. The AVG function will give the average completion time in days since we are handling dates.
p.s. Window functions show up pretty frequently during SQL interviews, so practice the 27+ window function questions on DataLemur
Consider you are designing a project tracking system for Asana. We have two tables called projects and tasks.
The projects table has three columns: project_id (identifier for the project), project_manager_id (identifier for the project manager), and project_name (name of the project).
The tasks table has four columns: task_id (identifier for the task), project_id (identifier for the project the task is associated with), assignee_id (identifier for the person assigned to the task), and task_status (status of the task).
Write a SQL query that calculates the percentage of tasks completed for each project.
project_id | project_manager_id | project_name |
---|---|---|
1 | 123 | "Project Alpha" |
2 | 265 | "Project Beta" |
3 | 362 | "Project Gamma" |
task_id | project_id | assignee_id | task_status |
---|---|---|---|
104 | 1 | 065 | "Done" |
567 | 1 | 123 | "Not Started" |
689 | 2 | 362 | "In Progress" |
609 | 2 | 192 | "Done" |
003 | 3 | 981 | "Done" |
300 | 3 | 192 | "In Progress" |
Here's how you can write the SQL query using PostgreSQL:
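This is a sketch of such a query, assuming the tables are named projects and tasks as described above:

```sql
SELECT
  p.project_name,
  ROUND(
    -- count "Done" tasks, divide by all tasks, express as a percentage
    100.0 * SUM(CASE WHEN t.task_status = 'Done' THEN 1 ELSE 0 END) / COUNT(*),
    2
  ) AS pct_tasks_completed
FROM projects p
JOIN tasks t
  ON p.project_id = t.project_id
GROUP BY p.project_name;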
This query first joins the projects and tasks tables on project_id. Then, it calculates the percentage of tasks completed in each project. The CASE statement is used to count the tasks with "Done" status, and then this count is divided by the total count of tasks. The SUM and COUNT functions and the GROUP BY clause are used to group these calculations by project name. The final result is the percentage of tasks completed for each project.
UNION is used to combine the output of multiple SELECT statements into one big result!
Suppose you were a Data Analyst at Asana working on a Sales Analytics project, and had data on sales leads from both the company's Salesforce CRM and its legacy HubSpot CRM. To write a query that analyzes leads created before 2023 started, across both CRMs, you could use UNION in the following way:
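For example (the table and column names salesforce_leads, hubspot_leads, and created_date are assumptions for illustration; only the email, job title, and company fields are described in the question):

```sql
SELECT email, job_title, company
FROM salesforce_leads   -- hypothetical table name
WHERE created_date < '2023-01-01'

UNION

SELECT email, job_title, company
FROM hubspot_leads      -- hypothetical table name
WHERE created_date < '2023-01-01';
```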
UNION filters out duplicates, so each email/job title/company only shows up once (even if it's in both the Salesforce and HubSpot CRMs).
On the other hand, the UNION ALL operator does NOT filter out duplicates, so if a lead shows up in both CRMs, it'll be listed twice in the output of UNION ALL.
Given the database records of users and their tasks, filter the records to find all active users who have completed more than 50% of their tasks in the last month.
user_id | registration_date | is_active |
---|---|---|
001 | 2020-01-01 | True |
002 | 2019-12-31 | True |
003 | 2021-01-10 | False |
task_id | user_id | task_completion_date | is_completed |
---|---|---|---|
0001 | 001 | 2022-06-29 | True |
0002 | 001 | 2022-06-30 | True |
0003 | 001 | 2022-07-01 | False |
0004 | 002 | 2022-06-30 | False |
0005 | 002 | 2022-07-01 | False |
0006 | 003 | 2022-06-30 | True |
user_id |
---|
001 |
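One possible PostgreSQL query, assuming the tables are named users and tasks, and interpreting "the last month" as the month preceding the current date:

```sql
SELECT u.user_id
FROM users u
JOIN (
  -- per-user totals of tasks and completed tasks over the last month
  SELECT
    user_id,
    COUNT(*) AS total_tasks,
    SUM(CASE WHEN is_completed THEN 1 ELSE 0 END) AS completed_tasks
  FROM tasks
  WHERE task_completion_date >= CURRENT_DATE - INTERVAL '1 month'
  GROUP BY user_id
) t ON u.user_id = t.user_id
WHERE u.is_active
  AND t.completed_tasks > 0.5 * t.total_tasks;
```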
The main part of the query is a subquery that sums up each user's total and completed tasks over the last month with the help of GROUP BY and a CASE WHEN expression. This subquery is then joined with the users table to include only active users. Finally, the WHERE clause keeps just the users whose proportion of completed tasks out of total tasks is more than 50%.
Normalization involves breaking up your tables into smaller, more specialized ones and using primary and foreign keys to define relationships between them. Not only does this make your database more flexible and scalable, it also makes it easier to maintain. Plus, normalization helps to keep your data accurate by reducing the chance of inconsistencies and errors.
The only downside is that your queries will now involve more joins, which are slow AF and often a DB performance bottleneck.
As a data analyst at Asana, you have been tasked to filter through the customer records database to find customers whose emails contain 'asana'. Include only customers who registered in the year 2021.
customer_id | name | email | registration_date |
---|---|---|---|
122 | John | john.asana@gmail.com | 01/20/2021 |
345 | Emily | emily.asana@yahoo.com | 12/01/2021 |
567 | David | david@gmail.com | 06/15/2021 |
654 | Laura | laura@yahoo.com | 05/10/2021 |
899 | Michael | asana.michael@hotmail.com | 03/05/2020 |
customer_id | name | email | registration_date |
---|---|---|---|
122 | John | john.asana@gmail.com | 01/20/2021 |
345 | Emily | emily.asana@yahoo.com | 12/01/2021 |
Our goal is to fetch customer records in PostgreSQL with 'asana' in their email and who registered in 2021.
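Assuming the table is named customers, the query could look like:

```sql
SELECT *
FROM customers  -- assumed table name
WHERE email LIKE '%asana%'                        -- 'asana' anywhere in the email
  AND EXTRACT(YEAR FROM registration_date) = 2021; -- registered in 2021
```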
In the given query, we use the LIKE keyword with the % wildcard to filter email addresses that include the string 'asana' anywhere. We also use the EXTRACT function to get the year part of the registration date and compare it with 2021 to make sure we are only considering customers who registered in that year.
In SQL, a join retrieves rows from multiple tables and combines them into a single result set.
Four JOIN types can be found in SQL. For an example of each one, suppose you had a table of Asana orders and Asana customers.
INNER JOIN: Rows from both tables are retrieved when there is a match in the shared key or keys. An INNER JOIN between the Orders and Customers tables would return only rows where the join key in the Orders table matches the corresponding key in the Customers table.
LEFT JOIN: A LEFT JOIN retrieves all rows from the left table (in this case, the Orders table) and any matching rows from the right table (the Customers table). If there is no match in the right table, NULL values will be returned for the right table's columns.
RIGHT JOIN: A RIGHT JOIN combines all rows from the right table (in this case, the Customers table) and any matching rows from the left table (the Orders table). If there is no match in the left table, NULL values will be displayed for the left table's columns.
FULL OUTER JOIN: A FULL OUTER JOIN combines all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, NULL values will be displayed for the columns of the non-matching table.
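As a quick sketch, here's what a LEFT JOIN between hypothetical orders and customers tables (sharing a customer_id key) might look like:

```sql
-- every order is returned; name is NULL when no matching customer exists
SELECT o.order_id, c.name
FROM orders o
LEFT JOIN customers c
  ON o.customer_id = c.customer_id;
```

Swapping LEFT JOIN for INNER JOIN, RIGHT JOIN, or FULL OUTER JOIN changes only which unmatched rows survive into the result.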
Imagine that Asana wants to analyze user spending behavior on different projects. You are asked to gather data where each row represents a unique user, and the columns represent their average spending on different project types.
For this analysis, you have 2 tables: projects and purchases. The projects table has columns project_id and project_type, and the purchases table has columns user_id, project_id, and purchase_amount.
project_id | project_type |
---|---|
1 | Marketing |
2 | Development |
3 | Finance |
4 | Marketing |
5 | Finance |
user_id | project_id | purchase_amount |
---|---|---|
215 | 1 | 300 |
182 | 2 | 670 |
215 | 1 | 450 |
182 | 5 | 980 |
215 | 3 | 220 |
182 | 4 | 560 |
215 | 2 | 530 |
Your task is to write a SQL query that joins these tables and calculates the average amount spent per user on each project_type.
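Using the projects and purchases tables above, the query can be written as:

```sql
SELECT
  pu.user_id,
  pr.project_type,
  AVG(pu.purchase_amount) AS avg_spent
FROM purchases pu
JOIN projects pr
  ON pu.project_id = pr.project_id
-- one row per (user, project type) pair
GROUP BY pu.user_id, pr.project_type;
```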
This SQL query first joins the 'purchases' and 'projects' tables on the 'project_id' column to combine purchase and project information into one table. Then it groups the data by 'user_id' and 'project_type', and for each of these groups, it calculates the average purchase amount with the AVG function. The result is a list showing the average spending of each user on each project type.
Because join questions come up frequently during SQL interviews, practice this Snapchat JOIN SQL interview question:
As a project management company, Asana would like to understand its most active users in terms of task updates on a weekly basis. Develop a query that will identify the top 3 users with the most task updates for each week.
Assume you are given a table that records any task updates, comments, or other activities a user performs on a task. Each record represents one activity from a user at a certain timestamp.
update_id | user_id | update_time | task_id |
---|---|---|---|
1542 | 346 | 2022-06-02 14:14:20 | 4856 |
9843 | 589 | 2022-06-03 09:28:48 | 8952 |
7496 | 346 | 2022-06-04 10:12:50 | 2567 |
7255 | 879 | 2022-06-08 22:05:44 | 8795 |
6164 | 589 | 2022-06-09 16:26:14 | 6857 |
1942 | 879 | 2022-06-10 10:55:26 | 3214 |
5324 | 346 | 2022-06-11 17:42:32 | 1524 |
9546 | 879 | 2022-06-14 21:16:40 | 1236 |
7382 | 346 | 2022-06-15 19:32:56 | 5124 |
1746 | 589 | 2022-06-16 08:12:00 | 6874 |
week | user_id | activity_count |
---|---|---|
22 | 346 | 3 |
22 | 589 | 2 |
23 | 879 | 2 |
23 | 346 | 2 |
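Here's a PostgreSQL sketch (the table name task_updates is an assumption) that returns the top 3 users per week, using a window function so the cutoff applies within each week rather than across the whole result set:

```sql
SELECT week, user_id, activity_count
FROM (
  SELECT
    DATE_PART('week', update_time) AS week,
    user_id,
    COUNT(*) AS activity_count,
    -- rank users within each week by their update count
    DENSE_RANK() OVER (
      PARTITION BY DATE_PART('week', update_time)
      ORDER BY COUNT(*) DESC
    ) AS rnk
  FROM task_updates  -- assumed table name
  GROUP BY week, user_id
) ranked
WHERE rnk <= 3
ORDER BY week, activity_count DESC;
```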
The query first extracts the week number from update_time using the DATE_PART function. It then groups these weekly data by user_id and computes the frequency of updates using the COUNT function. Note that a plain ORDER BY with LIMIT 3 would return only the top 3 rows overall, not per week; to get the top 3 users within each week, rank users with a window function such as DENSE_RANK(), partitioned by week and ordered by the update count in descending order, and keep only the rows with a rank of 3 or less.
The key to acing an Asana SQL interview is to practice, practice, and then practice some more! Besides solving the above Asana SQL interview questions, you should also solve the 200+ SQL questions on DataLemur which come from companies like Netflix, Google, and Amazon.
Each SQL question has hints to guide you, detailed solutions, and most importantly, there's an interactive coding environment so you can write and run your SQL query right online and have it graded.
To prep for the Asana SQL interview, it can also be useful to solve SQL questions from other tech companies like:
In case your SQL foundations are weak, forget about diving straight into solving questions – go learn SQL with this free SQL tutorial.
This tutorial covers topics including CASE/WHEN statements and creating summary stats with GROUP BY – both of which come up often in SQL interviews at Asana.
In addition to SQL interview questions, the other topics to practice for the Asana Data Science Interview are:
To prepare for Asana Data Science interviews read the book Ace the Data Science Interview because it's got: