At Intapp, SQL is used across the company for analyzing client databases in the professional services sector and manipulating data for product optimization and client-specific solutions. So, it shouldn't surprise you that Intapp asks SQL query questions during interviews for Data Analyst, Data Science, and BI jobs.
So, to help you practice for the Intapp SQL interview, we'll cover 10 Intapp SQL interview questions – able to solve them?
Intapp is a company that offers professional services firms, such as law, financial services, and consultancies, software solutions that streamline and enhance their business operations. These businesses subscribe to services on a monthly basis, and the VIP users are considered those who have a high usage of the platform, often measured by the quantity of services subscribed to and the frequency at which certain tasks are performed on the platform.
The "users" table captures all the users' data, while the "activity" table captures all the activity of these users on the platform. The activity table contains a timestamp indicating when the activity occurred.
A VIP user in the Intapp's context is defined as a user with more than 50 activities per month over the last three months. Write a SQL query to identify those users, assuming today's date is '2022-08-31'.
user_id | user_name | registration_date |
---|---|---|
102 | JohnDoe | 2022-04-29 |
203 | JaneSmith | 2021-09-15 |
301 | TomJohnson | 2022-01-12 |
468 | MaryBaker | 2022-05-22 |
581 | FredWilson | 2021-11-17 |
activity_id | user_id | activity_date |
---|---|---|
1187 | 102 | 2022-08-17 |
2258 | 203 | 2022-08-08 |
2214 | 203 | 2022-08-07 |
3126 | 301 | 2022-08-25 |
3715 | 468 | 2022-08-29 |
4256 | 581 | 2022-08-15 |
3437 | 203 | 2022-08-11 |
2745 | 301 | 2022-07-22 |
2278 | 102 | 2022-07-10 |
3264 | 581 | 2022-06-07 |
In this query, we first create a CTE () to count the number of activities per user per month. We then select users from the table that correspond to entries in the table with more than 50 activities. We use to ensure that a user has more than 50 activities for at least three separate months.
To practice a similar VIP customer analysis question on DataLemur's free interactive SQL code editor, try this Microsoft SQL Interview problem:
You are given a dataset that keeps track of all the interactions between Intapp and its clients. Each row in the dataset represents a client interaction with various information such as the date of interaction, Client_ID and various other parameters.
An interaction with a client is considered a new client interaction if there was no previous interaction with that specific client. Your task is to calculate the ratio of new clients to total clients per month.
interaction_id | client_id | interaction_date |
---|---|---|
1 | 101 | 2021-01-01 |
2 | 102 | 2021-01-02 |
3 | 101 | 2021-01-03 |
4 | 103 | 2021-01-03 |
5 | 104 | 2021-02-01 |
6 | 101 | 2021-02-02 |
7 | 105 | 2021-02-02 |
8 | 104 | 2021-02-03 |
month | year | new_client_ratio |
---|---|---|
January | 2021 | 1.00 |
February | 2021 | 0.50 |
To answer this question, we first identify the new clients by partitioning window function along with , then we use conditional to count new clients per month, and calculate the ratio.
In this solution, we first create a common table expression (CTE) that adds an is_new_client column for each row. This column uses window function to mark whether the row is the first interaction of a new client. Rows with is_new_client equal to TRUE are the interactions with the new clients.
We then count and calculate the ratio in the main query. The SUM(...) / COUNT(*) computes the ratio of newly acquired clients to total clients in each month.
To solve a similar window function question on DataLemur's free interactive coding environment, solve this Google SQL Interview Question:
hile a correlated subquery relies on columns in the main query's FROM clause and cannot function independently, a non-correlated subquery operates as a standalone query and its results are integrated into the main query.
An example correlated sub-query:
This correlated subquery retrieves the names and salaries of Intapp employees who make more than the average salary for their department. The subquery references the department column in the main query's FROM clause (e1.department) and uses it to filter the rows of the subquery's FROM clause (e2.department).
An example non-correlated sub-query:
This non-correlated subquery retrieves the names and salaries of Intapp employees who make more than the average salary for the Data Science department (which honestly should be very few people since Data Scientists are awesome and deserve to be paid well).The subquery is considered independent of the main query can stand alone. Its output (the average salary for the Data Science department) is then used in the main query to filter the rows of the Intapp employees table.
Intapp provides legal tech solutions to law firms. These law firms perform multiple case-related tasks each day. For this question, assume you have two tables: a table and a table. The table has information about each law firm with columns (primary key) and . The table has columns (primary key), (foreign key related to ), and .
The question is: How would you design a SQL query to find the average number of hours law firms are spending on tasks per week in the year 2022?
firm_id | firm_name |
---|---|
135 | Davis Law |
218 | Eagle Legal Services |
855 | Honest Lawyers |
task_id | firm_id | completed_date | hours_spent |
---|---|---|---|
21 | 135 | 01/03/2022 | 5 |
34 | 218 | 01/10/2022 | 7 |
50 | 855 | 01/17/2022 | 6 |
64 | 135 | 01/24/2022 | 4 |
85 | 218 | 01/31/2022 | 8 |
This query joins the and tables on and extracts only those tasks that were completed in the year 2022. It then groups the tasks by week and law firm, calculating the average () number of on tasks by each law firm per week. The results are ordered by and for easy reading. It is well-indexed and should be fairly performant even on large datasets.
To find records in one table that aren't in another, you can use a and check for values in the right-side table.
Here is an example using two tables, Intapp employees and Intapp managers:
This will return all rows from Intapp employees where there is no matching row in managers based on the column.
You can also use the operator in PostgreSQL and Microsoft SQL Server to return the records that are in the first table but not in the second. Here is an example:
This will retrieve all rows from employees that do not appear in managers. The operator works by retreivingthe rows that are returned by the first query, but not by the second.
Please note that is not supported by all DBMS systems, such as MySQL and Oracle (however, you can use the operator to achieve a similar outcome).
As a data analyst at Intapp, a global leader in legal software, you have been asked to find out the average time spent on tasks per project to better understand workflow efficiency. The firm uses a project management system that tracks the start time and end time of each task.
task_id | project_id | start_time | end_time |
---|---|---|---|
10001 | 501 | 2022-08-01 09:00:00 | 2022-08-01 11:00:00 |
10002 | 501 | 2022-08-02 14:00:00 | 2022-08-02 16:00:00 |
10003 | 502 | 2022-08-01 10:00:00 | 2022-08-01 12:00:00 |
10004 | 502 | 2022-08-04 13:00:00 | 2022-08-04 14:00:00 |
project_id | avg_hours |
---|---|
501 | 2.0 |
502 | 1.5 |
This query groups the tasks by then calculates the difference between and for each task and extracts the hours from it using the function. This will give us the time spent on each task in hours. It then calculates the average of these hours using the AVG function for each project. It finally outputs the project_id and the average time in hours spent on tasks for each project.
To practice a very similar question try this interactive Amazon Server Utilization Time Question which is similar for calculating total time from start and end times or this Snapchat Sending vs. Opening Snaps Question which is similar for deriving averages based on time spent.
Normalization involves breaking up your tables into smaller, more specialized ones and using primary and foreign keys to define relationships between them. Not only does this make your database more flexible and scalable, it also makes it easier to maintain. Plus, normalization helps to keep your data accurate by reducing the chance of inconsistencies and errors.
The only downside is now is that your queries will involve more joins, which are slow AF and often a DB performance botteleneck.
At Intapp, a technology company that provides cloud-based software solutions for professional services firms, we would like to know the average revenue generated per client based on successful transactions. We will be pulling data from two tables, and .
client_id | client_name | industry |
---|---|---|
1001 | ABC Law Firm | Legal |
1002 | XYZ Consultants | Consulting |
1003 | MNO Accountants | Accounting |
transaction_id | client_id | amount | status | transaction_date |
---|---|---|---|---|
2001 | 1001 | 1500 | Successful | 06/08/2022 |
2002 | 1001 | 2500 | Successful | 06/10/2022 |
2003 | 1002 | 3500 | Failed | 06/18/2022 |
2004 | 1003 | 4500 | Successful | 07/26/2022 |
2005 | 1003 | 3000 | Successful | 07/05/2022 |
We want to know the average revenue generated per client for the month of June, 2022.
The solution involves joining the and tables on , filtering for successful transactions that occurred in June 2022, and then grouping by to find the average amount of revenue per client.
As a data analyst at Intapp, you are given access to the company's client database and a project database. The client database table is named with columns for , , and . The project database table is named that includes columns for , , , , and .
Write a SQL query to calculate the total cost for each client per project and display the , and the .
client_id | client_name | industry_sector |
---|---|---|
101 | ACME Corp. | Manufacturing |
102 | Initech | Technology |
103 | Globex Corp. | Conglomerate |
104 | Umbrella Corp. | Biotech |
105 | Wonka Industries | Food |
project_id | client_id | project_name | start_date | end_date | cost |
---|---|---|---|---|---|
501 | 102 | Database Migration | 01/02/2022 | 01/30/2022 | 50000 |
502 | 101 | Process Automation | 02/01/2022 | 06/15/2022 | 150000 |
503 | 104 | Viral Research | 03/01/2022 | 08/31/2022 | 350000 |
504 | 101 | Inventory Management | 04/15/2022 | 09/30/2022 | 85000 |
505 | 105 | New Product Development | 05/01/2022 | 11/15/2022 | 200000 |
client_name | project_name | total_cost |
---|---|---|
ACME Corp. | Process Automation | 150000 |
ACME Corp. | Inventory Management | 85000 |
Initech | Database Migration | 50000 |
Umbrella Corp. | Viral Research | 350000 |
Wonka Industries | New Product Development | 200000 |
Here is the SQL query to solve this question:
In this query, we use a simple SQL JOIN statement to combine the and tables, where we join on (the common column in both tables). The query then selects the , , and (renamed to for clarity) from the combined table. Finally, the results are ordered by for easy viewing.
Because join questions come up routinely during SQL interviews, take a stab at this interactive Snapchat JOIN SQL interview question:
A UNIQUE constraint ensures that all values in a column are different. It is often used in conjunction with other constraints, such as NOT NULL, to ensure that the data meets certain conditions.
For example, if you had Intapp sales leads data stored in a database, here's some constraints you'd use:
In this example, the UNIQUE constraint is applied to the "email" and "phone" fields to ensure that each Intapp lead has a unique email address and phone number. This helps to ensure the integrity of the data in the database and prevents errors that could occur if two leads had the same email address or phone number.
The key to acing a Intapp SQL interview is to practice, practice, and then practice some more! Besides solving the earlier Intapp SQL interview questions, you should also solve the 200+ FAANG SQL Questions on DataLemur which come from companies like Google, Facebook, Microsoft and Amazon.
Each DataLemur SQL question has hints to guide you, step-by-step solutions and best of all, there is an interactive SQL code editor so you can right online code up your SQL query answer and have it executed.
To prep for the Intapp SQL interview you can also be useful to solve interview questions from other tech companies like:
But if your SQL skills are weak, forget about going right into solving questions – improve your SQL foundations with this free SQL tutorial.
This tutorial covers SQL concepts such as RANK vs. DENSE RANK and aggregate window functions – both of which come up frequently during SQL job interviews at Intapp.
Beyond writing SQL queries, the other question categories covered in the Intapp Data Science Interview are:
The best way to prepare for Intapp Data Science interviews is by reading Ace the Data Science Interview. The book's got: