At ZipRecruiter, SQL is typically used for extracting pertinent candidate and job data, and analyzing recruitment trends for optimizing the job-matching algorithm. Because of this, ZipRecruiter almost always asks SQL coding questions in interviews for Data Science, Analytics, and & Data Engineering jobs.
Thus, to help you prep, here’s 11 ZipRecruiter SQL interview questions – can you solve them?
In the context of ZipRecruiter, a 'power user' might be a company or recruiter that posts jobs quite frequently or a job seeker who applies to jobs very frequently. For this case, let's focus on identifying companies that post the most number of jobs.
Imagine you are given a table indicating the jobs posted by various companies. The table has the following fields:
The question is: Write a SQL query to identify the top 5 companies that have posted the most number of jobs in the last 30 days.
job_id | company_id | posted_date |
---|---|---|
501 | 6 | 2022-09-01 |
502 | 3 | 2022-09-03 |
503 | 6 | 2022-09-03 |
504 | 2 | 2022-09-04 |
505 | 1 | 2022-09-05 |
506 | 1 | 2022-09-06 |
507 | 3 | 2022-09-07 |
508 | 2 | 2022-09-08 |
509 | 6 | 2022-09-09 |
In this query, we first filter out jobs that were posted in the last 30 days. Then, we count the number of jobs posted by each company using . Finally, we order the companies in descending order by the number of jobs posted and select the top 5. This gives us the 5 'power users' of ZipRecruiter's job posting service.
To practice a related customer analytics question on DataLemur's free interactive SQL code editor, try this recently asked Microsoft SQL interview question:
Imagine there was a table of ZipRecruiter employee salaries, along with which department they were in. Write a query to compare the average salary of employees in each department to the company's average salary for March 2024. Return the comparison result as 'higher', 'lower', or 'same' for each department. Display the department ID, payment month (in MM-YYYY format), and the comparison.
Write a SQL query for this problem interactively on DataLemur:
The answer is LONG – 30+ lines of SQL. You can find a step-by-step solution with hints here: Department vs. Company Salary.
The clause in SQL allows you to select records that are unique, eliminating duplicates.
For example, if you had a table of ZipRecruiter employees, and wanted to see what are all the unique job titles that currently worked at the company, you could write the following query:
ZipRecruiter has a system where both employers and job applicants can rate each other after an interaction. You are given a dataset where each row represents a rating given by an employer to a job applicant after an interaction.
rating_id | employer_id | date_submitted | applicant_id | rating |
---|---|---|---|---|
9354 | 459 | 01/14/2022 00:00:00 | 35261 | 5 |
5019 | 122 | 01/18/2022 00:00:00 | 36298 | 3 |
7287 | 154 | 02/02/2022 00:00:00 | 35261 | 4 |
3462 | 752 | 02/26/2022 00:00:00 | 32351 | 3 |
7456 | 1098 | 03/03/2022 00:00:00 | 36298 | 2 |
In this table, is a unique identifer for each rating, is the identifier for employers, is when the rating was given, is the identifier for job applicants, is the score out of 5 given by the employer.
Write a SQL query to find out the average rating given by employers to each applicant for each month of the year.
month | applicant_id | avg_rating |
---|---|---|
1 | 35261 | 5.00 |
1 | 36298 | 3.00 |
2 | 35261 | 4.00 |
2 | 32351 | 3.00 |
3 | 36298 | 2.00 |
In this SQL query, we first extract the month from the date using the function. Then, we calculate the average rating for each applicant for each month using the function with a window defined by the clause. This gives us the average rating given by employers to each applicant for each month.
Pro Tip: Window functions are a frequent SQL interview topic, so practice all the window function problems on DataLemur
A correlated sub-query is a sub-query that depends on the outer query and cannot be executed separately. It uses the outer query to filter or transform data by referencing a column from it, and the outer query uses the results of the inner query. In contrast, a non-correlated sub-query is independent of the outer query and can be executed on its own. It does not reference any columns from the outer query and is used to retrieve additional data needed by the outer query.
Here is an example of a correlated sub-query:
This query selects the and total sales of all ZipRecruiter customers in the sales table whose are greater than the average of their own customer group. The sub-query in this case is correlated with the outer query, as it references the column from the outer query ().
Here is an example of a non-correlated sub-query:
The sub-query in this case is non-correlated, as it does not reference any columns from the outer query.
Performance-wise, correlated sub-queries are generally slower to execute, as they have to be re-evaluated for each row of the outer query, while non-correlated sub-queries are faster, as they only have to be executed once.
Given the table customer, your task is to write a SQL query to return all records of customers who live in "Los Angeles" or "New York", are younger than 30 years, and have a subscription status of "active". Note that all dates are in the format YYYY-MM-DD.
customer_id | name | birthday | city | subscription_status |
---|---|---|---|---|
01 | Mike | 1996-03-12 | Los Angeles | active |
02 | John | 1992-07-14 | San Francisco | active |
03 | Emma | 1994-04-08 | Los Angeles | inactive |
04 | Isabella | 1995-10-28 | New York | active |
05 | Jacob | 1985-05-06 | Los Angeles | active |
06 | Olivia | 1988-12-12 | New York | inactive |
07 | Ethan | 1993-01-22 | New York | active |
08 | Ava | 2002-11-11 | Los Angeles | active |
You can use the WHERE clause along with AND and OR operators to filter data. We also use the AGE function to calculate the age from the birthdate.
This query selects all records from "customers" where the city is either 'Los Angeles' or 'New York', the age calculated between the current date and their birthday is less than 30 years, and the subscription_status is 'active'. The keyword IN is used here to check both cities, which is more convenient than using multiple OR conditions. The AGE function calculates the age from the given birthday to the current timestamp ('now'), discarding any time parts.
An inner join only includes rows from both tables that satisfy the join condition, whereas a full outer join includes all rows from both tables, regardless of whether they match the join condition or not.
To demonstrate each kind, Imagine you were working on a Advertising Analytics project at ZipRecruiter and had two database tables: an table that contains data on Google Ads keywords and their bid amounts, and a table with information on product sales and the Google Ads keywords that drove those sales.
An retrieves rows from both tables where there is a match in the shared key or keys. For example, an INNER JOIN between the table and the table could be performed using the keyword column as the shared key. This would retrieve only the rows where the keyword in the table matches the keyword in the table.
A retrieves all rows from both tables, regardless of whether there is a match in the shared key or keys. If there is no match, values will be returned for the columns of the non-matching table.
Suppose you are a data analyst working at ZipRecruiter. Your team is interested in studying the behavior of users who view job listings and then proceed to apply. You are given two tables: , which records every time a user views a job listing, and , which records every time a user applies for a job. The question is to calculate the click-through conversion rate, which is the number of users who apply for a job after viewing it divided by the total number of users who viewed the job.
Here is an example of the and tables:
view_id | user_id | view_date | job_id |
---|---|---|---|
101 | 1 | 06/08/2022 00:00:00 | 50001 |
102 | 2 | 06/10/2022 00:00:00 | 50001 |
103 | 3 | 06/18/2022 00:00:00 | 50002 |
104 | 4 | 07/26/2022 00:00:00 | 50003 |
105 | 5 | 07/05/2022 00:00:00 | 50001 |
application_id | user_id | application_date | job_id |
---|---|---|---|
201 | 1 | 06/08/2022 00:00:00 | 50001 |
202 | 3 | 06/18/2022 00:00:00 | 50002 |
203 | 6 | 07/26/2022 00:00:00 | 50003 |
204 | 7 | 07/05/2022 00:00:00 | 50001 |
205 | 8 | 07/05/2022 00:00:00 | 50001 |
In PostgreSQL, you can solve this problem by joining the two tables on the and fields, and then calculating the click-through conversion rate. Here is a possible solution:
This query first joins the and tables so that each line represents a view and its corresponding application (if it exists). It then groups the data by , and calculates the click-through conversion rate by dividing the number of distinct users who applied for each job by the number of distinct users who viewed each job.
To practice a related problem on DataLemur's free online SQL code editor, attempt this Facebook SQL Interview question:
As a data analyst at ZipRecruiter, you're asked to find out the average number of job applications each company receives for their job postings in a month. This information will help our client companies gauge the competition and adjust their hiring strategies accordingly.
application_id | application_date | company_id | job_id |
---|---|---|---|
1532 | 01/10/2022 | 20001 | 10001 |
2721 | 01/12/2022 | 20002 | 20002 |
3182 | 01/09/2022 | 20001 | 10001 |
4019 | 01/15/2022 | 20002 | 20003 |
5431 | 01/31/2022 | 20003 | 20004 |
month | company_id | avg_applications |
---|---|---|
1 | 20001 | 2.00 |
1 | 20002 | 2.00 |
1 | 20003 | 1.00 |
This PostgreSQL query first extracts the month part of and groups by and . It then calculates average job applications for each company in each month using the AVG aggregate function and partitioning by company id. This gives us the average number of applications per month for each company.
One way to find duplicatesis to use a clause and then use to find groups
You could also use the operator:
You have been given a task to filter the customer records from . You need to find all the customers whose first name starts with 'J' and the last name ends with 'son'.
customer_id | first_name | last_name | sign_up_date | job_posted |
---|---|---|---|---|
001 | John | Jackson | 01/13/2020 | 3 |
002 | Mary | Johnson | 04/02/2022 | 12 |
003 | Jerry | Mason | 05/16/2019 | 7 |
004 | Sarah | Watson | 12/07/2021 | 8 |
005 | Jane | Edison | 09/30/2020 | 5 |
customer_id | first_name | last_name | sign_up_date | job_posted |
---|---|---|---|---|
001 | John | Jackson | 01/13/2020 | 3 |
003 | Jerry | Mason | 05/16/2019 | 7 |
005 | Jane | Edison | 09/30/2020 | 5 |
This SQL query selects all records from the table where the first name starts with 'J' and the last name ends with 'son'. A percentage (%) sign is used as a wildcard in SQL, 'J%' will match any string that begins with 'J', and '%son' will match any string that ends with 'son'.
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the above ZipRecruiter SQL interview questions, you should also solve the 200+ SQL coding questions which come from companies like Facebook, Google and unicorn tech startups.
Each DataLemur SQL question has hints to guide you, step-by-step solutions and crucially, there is an online SQL coding environment so you can right online code up your SQL query and have it checked.
To prep for the ZipRecruiter SQL interview you can also be a great idea to practice SQL problems from other tech companies like:
However, if your SQL query skills are weak, don't worry about jumping right into solving questions – improve your SQL foundations with this free SQL tutorial.
This tutorial covers SQL concepts such as grouping by multiple columns and aggreage functions like MIN()/MAX() – both of these pop up routinely in SQL job interviews at ZipRecruiter.
In addition to SQL interview questions, the other types of problems to practice for the ZipRecruiter Data Science Interview include:
First, learn more about the domain – I like LinkedIn's Data Science engineering blog for that! Then, go read the book Ace the Data Science Interview.
It covers 201 interview questions sourced from companies like Microsoft, Google & Amazon. It also has a crash course covering SQL, Product-Sense & ML. And finally it's vouched for by the data community, which is why it's earned 1000+ reviews on Amazon and a 4.5-star rating.