At ZoomInfo, SQL is often used to analyze and clean its company and contact information databases. Unsurprisingly, this is why ZoomInfo asks SQL questions during interviews for Data Science, Analytics, and Data Engineering jobs.
As such, to help you practice for the ZoomInfo SQL interview, we've collected 9 ZoomInfo SQL interview questions – how many can you solve?
As part of ZoomInfo's data analytics team, you are asked to analyze user engagement data. Specifically, you are tasked with computing the monthly active users (MAU) using SQL. A monthly active user is defined as a unique user who has logged in or used the service at least once in a calendar month.
Given the following table of user activities:
activity_id | user_id | activity_date |
---|---|---|
1 | 1001 | 2022-01-01 |
2 | 1002 | 2022-01-05 |
3 | 1001 | 2022-01-15 |
4 | 1003 | 2022-02-01 |
5 | 1002 | 2022-02-20 |
6 | 1001 | 2022-02-25 |
7 | 1004 | 2022-03-01 |
8 | 1002 | 2022-03-15 |
9 | 1001 | 2022-03-25 |
10 | 1003 | 2022-03-30 |
Can you write a SQL query to generate a table which shows the number of unique active users for each month? The output table should be sorted in ascending order by month-year.
month_year | mau |
---|---|
2022-01 | 2 |
2022-02 | 3 |
2022-03 | 4 |
Here is a PostgreSQL query which solves this:
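A minimal sketch of one solution, assuming the activity log is stored in a table called `user_activity` (the table name isn't given in the prompt):

```sql
-- user_activity is an assumed table name; columns match the sample data above
SELECT
  TO_CHAR(DATE_TRUNC('month', activity_date), 'YYYY-MM') AS month_year,
  COUNT(DISTINCT user_id) AS mau
FROM user_activity
GROUP BY 1
ORDER BY 1;
```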
This query first truncates the `activity_date` to the month level, groups by this new field, and counts distinct `user_id`s in each group to get the monthly active users. The `ORDER BY` clause at the end sorts the resulting table in ascending order by `month_year`.
For more window function practice, solve this Uber SQL problem on DataLemur's online SQL code editor:
ZoomInfo is a SaaS company that sells access to its database of business professionals for lead generation, focusing on providing accurate and detailed information about business professionals and companies.
Let’s assume that ZoomInfo needs to better manage its raw contacts data by creating two main tables: `contacts` and `companies`. Here's the information they want to keep:
For each contact: ID, first name, last name, email, the company he/she works for, the contact's position, and phone number.
For each company: ID, name, industry, size, and location.
The relationship between contacts and companies is that a contact works for only one company, but companies can have multiple contacts.
Provide SQL database tables which can handle this situation in PostgreSQL.
contact_id | first_name | last_name | email | company_id | position | phone |
---|---|---|---|---|---|---|
1201 | John | Doe | john.doe@example.com | 40001 | Manager | +111111111 |
1202 | Jane | Smith | jane.smith@example.com | 40002 | Director | +222222222 |
company_id | name | industry | size | location |
---|---|---|---|---|
40001 | Acme Corp | Manufacturing | 100-500 | New York |
40002 | Globex Corp | Technology | 500-1000 | San Francisco |
In the PostgreSQL database, we can create these two tables using the following commands:
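The exact column types aren't specified, so the following is a reasonable sketch (company size stored as a text label like '100-500', phone numbers as text):

```sql
-- companies must be created first because contacts references it
CREATE TABLE companies (
    company_id INTEGER PRIMARY KEY,
    name       VARCHAR(255) NOT NULL,
    industry   VARCHAR(100),
    size       VARCHAR(50),   -- stored as a range label such as '100-500'
    location   VARCHAR(100)
);

CREATE TABLE contacts (
    contact_id INTEGER PRIMARY KEY,
    first_name VARCHAR(100),
    last_name  VARCHAR(100),
    email      VARCHAR(255),
    company_id INTEGER REFERENCES companies (company_id),  -- foreign key to companies
    position   VARCHAR(100),
    phone      VARCHAR(30)
);
```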
The `company_id` in the `contacts` table is a foreign key referencing `company_id` in the `companies` table. This enables us to link each contact to the company they work for and ensures database integrity.
The `WHERE` clause is used to filter rows from the result set of a `SELECT`, `UPDATE`, or `DELETE` statement. It allows you to specify a condition that must be met for a row to be included in the result set.
The `HAVING` clause is used to filter groups created by the `GROUP BY` clause. It is similar to the `WHERE` clause, but it specifies conditions on the groups created by `GROUP BY` rather than on the individual rows of the table.
Say you were working on a social media analytics project for ZoomInfo.
Here is an example of a SQL query that you might write which uses both the WHERE and HAVING clauses:
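The following is a sketch using a hypothetical `social_media_posts` table with `platform`, `impressions`, `conversion_rate`, and `campaign_date` columns (none of these names are given in the prompt):

```sql
-- social_media_posts and its columns are hypothetical
SELECT
  platform,
  SUM(impressions)     AS total_impressions,
  AVG(conversion_rate) AS avg_conversion_rate
FROM social_media_posts
WHERE campaign_date BETWEEN '2023-01-01' AND '2023-01-31'   -- row-level filter
GROUP BY platform
HAVING SUM(impressions) > 5000                              -- group-level filters
   AND AVG(conversion_rate) > 0.2;
```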
This query retrieves the total impressions and average conversion rate for each platform, restricted to campaigns whose date falls in January 2023. The rows are grouped by platform, and the `HAVING` clause filters the groups to include only those with more than 5000 total impressions and an average conversion rate above 0.2.
Given a table with fields `customer_id`, `subscription_status` (with values 'Active' or 'Inactive'), and `last_login_date`, write a query to find all customers who have an 'Active' subscription status and have not logged into their account for at least 30 days. Assume today's date is '2022-07-30'.
customer_id | subscription_status | last_login_date | account_created_date |
---|---|---|---|
1001 | Active | 2022-06-29 | 2021-12-08 |
1012 | Inactive | 2022-07-29 | 2022-01-15 |
2021 | Active | 2022-07-01 | 2022-01-01 |
3031 | Active | 2022-05-01 | 2019-07-01 |
4012 | Inactive | 2022-03-01 | 2020-03-01 |
The PostgreSQL query for this would be:
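A sketch, assuming the data sits in a table called `customers` (the table name isn't stated in the question):

```sql
-- customers is an assumed table name
SELECT *
FROM customers
WHERE subscription_status = 'Active'
  AND last_login_date <= DATE '2022-07-30' - INTERVAL '30 days';
```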
This query filters the table to only include records where `subscription_status` is 'Active' and `last_login_date` is on or before '2022-06-30' (i.e., at least 30 days before '2022-07-30'). The resulting set of records represents customers who are currently subscribed but have not logged in within the last 30 days.
customer_id | subscription_status | last_login_date | account_created_date |
---|---|---|---|
1001 | Active | 2022-06-29 | 2021-12-08 |
3031 | Active | 2022-05-01 | 2019-07-01 |
Both the `LEAD` and `LAG` window functions are used to access a row at a specific offset from the current row.
However, the `LEAD` function retrieves a value from a row that follows the current row, whereas the `LAG` function retrieves a value from a row that precedes the current row.
Often, the offset for both functions is 1, which gives access to the immediately following/preceding row. Here's a SQL query example:
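The sketch below uses a hypothetical `daily_sales` table with `sale_date` and `revenue` columns to pull each day's previous and next revenue:

```sql
-- daily_sales(sale_date, revenue) is a hypothetical table
SELECT
  sale_date,
  revenue,
  LAG(revenue, 1)  OVER (ORDER BY sale_date) AS previous_day_revenue,
  LEAD(revenue, 1) OVER (ORDER BY sale_date) AS next_day_revenue
FROM daily_sales;
```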
You have a database of all transactions made by clients of ZoomInfo's enterprise product. You would like to find out the average transaction value for each client.
Please consider the following table:
transaction_id | client_id | product_id | transaction_date | value |
---|---|---|---|---|
1 | 200 | ZI01 | 2022-08-14 | 500 |
2 | 200 | ZI02 | 2022-08-15 | 700 |
3 | 301 | ZI01 | 2022-08-15 | 1000 |
4 | 301 | ZI02 | 2022-09-15 | 500 |
5 | 203 | ZI01 | 2022-08-16 | 900 |
client_id | avg_transaction_value |
---|---|
200 | 600.00 |
301 | 750.00 |
203 | 900.00 |
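One way to write this in PostgreSQL, assuming the data shown above lives in a table called `transactions` (the table name isn't given in the prompt):

```sql
-- transactions is an assumed table name; columns match the sample data above
SELECT
  client_id,
  ROUND(AVG(value), 2) AS avg_transaction_value
FROM transactions
GROUP BY client_id;
```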
This PostgreSQL query groups the table by the `client_id` column and calculates the average transaction `value` for each client. The output table shows each unique `client_id` from the transactions table along with the average transaction value per client. This allows ZoomInfo to identify the clients spending the most on average and potentially spot opportunities for upselling or providing additional customer support.
Database normalization is a good idea to implement because it can reduce redundancy which can improve performance and database flexibility.
By dividing larger tables into smaller, more modular and specific tables which are linked via foreign keys, this can usually improve the speed of some queries because it minimizes the amount of random data the query has to sift through. However, it's not always 100% true that there is a query speed-up, because joins are an expensive operation.
Nonetheless, by reducing redundancy, besides improved performance you also get more database flexibility. By making tables more modular and reducing overlap, you're able to make schema changes more easily, since one table can be changed without affecting the others. This makes it easier to adapt the database schema to ZoomInfo's evolving business needs.
We have two relational databases: one for customers and another for transactions. In the `customers` database, each row represents a unique customer identified by a `customer_id`. This database also provides basic customer information such as `full_name`, `signup_date`, `plan_type`, and `state`.
Meanwhile, the `transactions` database keeps track of all purchases made by customers. It lists the `transaction_id`, the `customer_id` of the buyer, the `product_id` of the item, and the `purchase_date`.
Your task is to write a SQL query that lists all the customers from New York on a 'Pro' plan who made purchases in 2020, along with the total number of products they bought.
The query should return a new table that includes `customer_id`, `full_name`, `plan_type`, `state`, and the total number of purchases.
customer_id | full_name | signup_date | plan_type | state |
---|---|---|---|---|
1 | John Doe | 2017-05-16 | Free | NY |
2 | Jane Smith | 2018-11-22 | Pro | NY |
3 | Mike Johnson | 2019-04-30 | Pro | CA |
4 | Emma Wilson | 2020-03-05 | Free | TX |
5 | Jack Davis | 2020-01-01 | Pro | NY |
transaction_id | customer_id | product_id | purchase_date |
---|---|---|---|
1001 | 1 | 500 | 2020-02-15 |
1002 | 2 | 500 | 2020-03-01 |
1003 | 2 | 510 | 2020-04-01 |
1004 | 5 | 510 | 2020-05-01 |
1005 | 5 | 520 | 2020-06-01 |
Here is the PostgreSQL query that would achieve this:
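A sketch of one way to write it (column names taken from the sample tables above):

```sql
SELECT
  c.customer_id,
  c.full_name,
  c.plan_type,
  c.state,
  COUNT(t.transaction_id) AS total_purchases
FROM customers AS c
INNER JOIN transactions AS t
  ON c.customer_id = t.customer_id
WHERE c.state = 'NY'
  AND c.plan_type = 'Pro'
  AND t.purchase_date BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY c.customer_id, c.full_name, c.plan_type, c.state
ORDER BY total_purchases DESC;
```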
The query joins the `customers` and `transactions` tables using an `INNER JOIN` on the common field, `customer_id`. It then filters the results for customers from New York who have a 'Pro' plan and made purchases in 2020. The `GROUP BY` clause groups the results by customer, counting the total number of purchases each customer made during 2020. The result is ordered in descending order by the total number of purchases, which helps identify the top Pro customers from New York in 2020.
Since joins come up so often during SQL interviews, try this interactive Snapchat JOIN SQL interview question:
Imagine you are a data analyst at ZoomInfo with access to data relating to the webinars conducted by the company. The task is to find out the number of attendees per webinar per month for the year 2021. This will give the company insight into which months had the most interactions and whether any specific webinar attracted more attendees.
Assume that you have access to a table which logs the details of every attendee of every webinar.
webinar_id | attendee_id | attend_date |
---|---|---|
001 | 1001 | 2021-12-01 00:00:00 |
001 | 1002 | 2021-12-01 00:00:00 |
002 | 1003 | 2021-12-02 00:00:00 |
003 | 1004 | 2021-12-02 00:00:00 |
003 | 1005 | 2021-12-02 00:00:00 |
mth | webinar_id | total_attendees |
---|---|---|
12 | 001 | 2 |
12 | 002 | 1 |
12 | 003 | 2 |
The PostgreSQL query is as follows:
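A sketch, assuming the attendance log is stored in a table called `webinar_attendance` (no table name is given above):

```sql
-- webinar_attendance is an assumed table name
SELECT
  EXTRACT(MONTH FROM attend_date) AS mth,
  webinar_id,
  COUNT(attendee_id) AS total_attendees
FROM webinar_attendance
WHERE EXTRACT(YEAR FROM attend_date) = 2021
GROUP BY EXTRACT(MONTH FROM attend_date), webinar_id
ORDER BY mth, total_attendees DESC;
```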
This query uses the EXTRACT function to get the month from the attend_date field and the COUNT function to count the number of attendees for each webinar. It groups the results by month and webinar_id, and it orders the results by month and the total number of attendees in descending order. The WHERE clause is used to filter for only data from 2021.
The key to acing a ZoomInfo SQL interview is to practice, practice, and then practice some more! Beyond just solving the above ZoomInfo SQL interview questions, you should also solve the 200+ tricky SQL questions on DataLemur which come from companies like Google, Microsoft and Silicon Valley startups.
Each DataLemur SQL question has multiple hints, full answers and, best of all, an online SQL code editor so you can run your SQL query right in the browser and have it executed.
To prep for the ZoomInfo SQL interview, it can also be useful to practice SQL questions from other tech companies like:
In case your SQL skills are weak, don't jump right into solving questions – first go learn SQL with this free SQL tutorial.
This tutorial covers topics including 4 types of JOINS and filtering on multiple conditions using AND/OR/NOT – both of these come up often in SQL interviews at ZoomInfo.
In addition to SQL query questions, the other types of questions to practice for the ZoomInfo Data Science Interview are:
The best way to prepare for ZoomInfo Data Science interviews is by reading Ace the Data Science Interview. The book's got: