logo

11 Zillow SQL Interview Questions (Updated 2024)

Updated on

March 1, 2024

At Zillow, MySQL is used across the company for analyzing housing data from databases and generating product analytics insights. For this reason, Zillow almost always evaluates jobseekers on SQL coding questions during interviews for Data Science, Analytics, and & Data Engineering jobs.

As such, to help you study for the Zillow SQL interview, we've curated 11 Zillow SQL interview questions – able to answer them all?

SQL Interview Questions

11 Zillow SQL Interview Questions

SQL Question 1: Analyzing Monthly Average Listing Price for Properties in Zillow

Given a Zillow dataset containing historical listing information, write a SQL query to calculate and compare the monthly average listing prices for properties based in Seattle and San Francisco over last 6 months.

Consider the data in the table to be formatted as below:

Example Input:
listing_idcitylist_dateprice
1001Seattle2022-06-011500000
1002Seattle2022-07-021350000
1003Seattle2022-08-031250000
2001San Francisco2022-07-012250000
2002San Francisco2022-08-022350000
2003San Francisco2022-09-032300000

Answer:


In the query above, the function is used to truncate the list_date to a monthly precision. This is important because we are calculating monthly averages.

The function is used to calculate the average price per city per month.

The clause is a window function which is partitioning the data by city and by each month to calculate the average price.

The clause restricts the data to only include listings from Seattle and San Francisco and only for the last 6 months.

The clause sorts the result by the month and city for easy interpretation of results.

Pro Tip: Window functions are a popular SQL interview topic, so practice every window function problem on DataLemur

DataLemur SQL Questions

SQL Question 2: Housing Data Analysis

As an employee for Zillow, the online real estate marketplace, it's crucial to understand the data around housing listed on the platform from different sellers. This helps in having transparency, accountability and decision making.

Given a database that has two tables and , with each listing being linked to a seller:

The table has the following fields:

  • : The primary key for each listing entry
  • : The foreign key correlated to the seller's id in the sellers table
  • : The price for the particular housing listing
  • : The date the house was listed on Zillow
  • : Indicates if the property is sold or not

The table has:

  • : The id of the seller, matching the from the listings table
  • : The name of the person or entity that is selling the property on Zillow
Example Input:
listing_idseller_idsubmission_datelisting_priceis_sold
100120001/10/2022500000No
100220101/15/2022450000Yes
100320002/17/2022550000No
100420202/25/2022600000Yes
100520303/05/2022700000Yes
Example Input:
seller_idseller_name
200John Doe
201Jane Doe
202Bill Smith
203Sarah Johnson

Write a SQL query to list the name of every seller and the sum of the prices for all of their unsold listings.

Answer:

The SQL query would be framed as:


This query joins the and tables using the field. It then filters out all listings which are not sold and finally, groups the remaining entries by seller name and calculates the total unsold listing price for each seller. The output would provide us with each seller's name and their total unsold housing property prices on Zillow.

SQL Question 3: In the context of databases, what does atomicity, consistency, isolation, and durability mean?

ACID refers to the four key properties that are essential to the reliable and correct execution of database transactions. These properties are:

Atomicity: ensures that a transaction is treated as a single operation, and either all of the changes are made or none of them are! Basically, the database version of a "-FULL SEND-"

Consistency: ensures that the data is in a consistent state before and after a transaction is completed. For example, if wiring money to a friendly Nigerian prince whose fallen on hard times, consistency ensures that the total value of funds lost in my account is the same amount that's gained in the prince's account!

Isolation: ensures that the intermediate state of a transaction is invisible to other transactions. Back to the wiring-the-prince-some-money example, isolation ensures that another transaction sees the transferred funds in my account OR the princes, but not in both accounts at the same time

Durability: ensures that once a transaction has been completed successfully, the changes made by the transaction are permanent and cannot be undone, even in the event of a system failure. Basically, no taksies backsies (even if your system has a meltdown!).

Zillow SQL Interview Questions

SQL Question 4: Filtering Zillow Customer Data

Suppose you are a data analyst at Zillow. Your task is to filter the customer data for customers who have listed properties in Boston and Seattle in the past month, and their listing price was over $500,000. Write a SQL query to find the customer_id, name, city, listing_date, and property_price of such customers. Ignore listings that are currently inactive.

Let's create some sample data for the problem:

Example Input:
customer_idname
6171John Doe
7802Jane Smith
5293Emma Johnson
Example Input:
listing_idcustomer_idcitylisting_dateproperty_priceis_active
500016171Boston06/08/2022 00:00:005500001
698527802Seattle06/10/2022 00:00:006000001
500015293Boston06/18/2022 00:00:004000000
698525293Seattle07/26/2022 00:00:005000011
698527802Seattle07/05/2022 00:00:006500001

Answer:


With this query, we're joining the and table on . We are then filtering the records where the city is either 'Boston' or 'Seattle', the listing price is over $500,000, the listing date is in the past month, and the listing is currently active.

SQL Question 5: Can you explain the difference between a foreign and primary key in a database?

A primary key is a column (or set of columns) in a table that uniquely identifies each row in the table. It cannot contain null values and must be unique across all rows in the table.

A foreign key is a column (or set of columns) in a table that references the primary key of another table. It is used to establish a relationship between the two tables. A foreign key can contain null values, and multiple rows in the referencing table can reference the same row in the referenced table.

For example, consider a database with two tables: and . The Zillow customers table might have a primary key column called , while the Zillow orders table might have a foreign key column called that references the column in the table. This establishes a relationship between the two tables, such that each row in the orders table corresponds to a specific Zillow customer.

SQL Question 6: Compute the Average Listing Price Per City.

Given the table which contains data about each real estate listing, write a SQL query to find the average listing price per city.


Answer:


This query groups the listings by city (city_id) and then computes the average (AVG) price for each group. The AS keyword is used to rename the columns in the output.

To practice a very similar question try this interactive Amazon Highest-Grossing Items Question which is similar for requiring group by and aggregate functions or this LinkedIn Duplicate Job Listings Question which is similar for dealing with count of unique items in a category.

SQL Question 7: What does the constraint do?

A is a field in a table that references the of another table. It creates a link between the two tables and ensures that the data in the field is valid.

For example, if you have a table of Zillow customers and an orders table, the customer_id column in the orders table could be a that references the id column (which is the primary key) in the Zillow customers table.

The constraint helps maintain the integrity of the data in the database by preventing the insertion of rows in the table that do not have corresponding entries in the table. It also enforces the relationship between the two tables and prevents data from being deleted from the table if it is still being referenced in the table.

SQL Question 8: Calculate Click-Through Conversion Rates for Zillow Property Listings

Zillow.com often evaluates the effectiveness of their property listings by measuring the click-through conversion rates. These rates illustrate the progression from the initial stage of property viewing to the subsequent stage of property inquiry. Given property viewing log () and property inquiry log (), can you write a SQL query that computes the overall click-through conversion rate (from viewing a property to inquiring about it) for each property listed on the website?

Please consider the 'viewed_date' and 'inquiry_date' for each log. The click-through conversion rate can be calculated as the number of property inquiries per property view (in percentage form).

Here is an example of the logs.

Example Input:
viewed_iduser_idviewed_dateproperty_id
10014302022-07-151203
10024442022-07-151203
10034752022-07-161203
10044352022-07-151298
10054702022-07-161298
Example Input:
inquiry_iduser_idinquiry_dateproperty_id
40014302022-07-151203
40024762022-07-171203
40034442022-07-181203
40044702022-07-161298
40054402022-07-191298

Answer:


In this query, we join the and tables on the matching property and user details. We also ensure that the is earlier than or equal to the , which reflects a logical user activity sequence. Finally, we cluster the data by listing () and calculate the desired click-through conversion rate.

To practice a related SQL problem on DataLemur's free online SQL code editor, try this Facebook SQL Interview question: Facebook App CTR SQL Interview question

SQL Question 9: Find Customers Interested in Two-Bedroom Houses

Zillow is a famous real estate platform where customers can search for properties to buy, sell, or rent. Let's say Zillow wants to track customers who have interests in two-bedroom houses for its marketing campaign. Zillow wants a list of customers whose search queries include the term '2-bedroom' during the month of August.

You are given a table called 'customer_searches'. Columns in the table include: (which is unique), , , and . Note that can include extra spaces, be in upper case, lower case, or a combination.

Example Input:
search_idcustomer_idsearch_querysearch_date
1145"2-bedroom apartment san francisco"2022-07-01
2535"1-bedroom apartment NEW YORK"2022-08-05
3145" 2-bedroom Homes in los angeles "2022-08-09
4275"3 BHK apartments SAN JOSE"2022-08-16
5535"2-Bedroom apartments CHICAGO"2022-08-25
Example Output:
customer_id
145
535

Answer:


This SQL query selects distinct customers who had '2-bedroom' in their search query during the month of August. This is done by using the operator in PostgreSQL, which ignores case, i.e., '2-Bedroom' and '2-bedroom' would both match. Furthermore, it uses the function to extract the month of the search date.

SQL Question 10: What is the difference between the and window functions?

While both and are used to rank rows, the key difference is in how they deal with ties.

RANK(): When there's a tie, leaves a gap in the ranking. For example, if three rows are tied for 2nd place, the RANK() function will assign a rank of 2 to the first of these rows, a rank of 3 to the 2nd row in the tie, and a rank of 4 to the the 3rd tie.

DENSE_RANK(): For ties, does not leave a gap in the ranking. Instead, it assigns the same rank to all tied rows, and then makes the next row 1 bigger. Confusing, I know, but here's an example to make it more clear: if three rows are tied for 3rd place, the function will assign a rank of 3 to all three rows, and then assign a rank of 4 to the next row.

Suppose we had data on how many deals different salespeople at Zillow:


To rank these salespeople, we could execute the following query:


The result of this query would be:

namedeals_closedrankdense_rank
Akash5011
Brittany5021
Carlos4032
Dave4043
Eve3053
Farhad1064

As you can see, the function assigns a rank of 1 to the first row (Akash), a rank of 2 to the second row (Brittany), and so on. In contrast, the , gives both Akash and Brittany a rank of 1, since they are tied, then moves on to rank 2 for Carlos.

SQL Question 11: Analyze Customer and Property Data

A manager at Zillow asks you to analyze customer data. Zillow has two tables - the table and the table. The table contains detailed information about each customer, including their ID and the state they live in, while the table contains detailed information about every property listed for each state. Assume there's a one-to-many relationship between customers and properties, i.e. one customer can list multiple properties.

Your task is to write an SQL query to join the table with the table to get a list of all customers who have listed properties, with corresponding details from both tables. Information from the table will include property_id, price, and listing_date.

Example Input:

Example Input:

Answer:


This PostgreSQL query joins the table (aliased as c) with the table (aliased as p) on the . This will result in a tabular output that includes specific columns from both tables. In this case, we ordered the resulting data by to make it easier to see all properties listed by each customer. The INNER JOIN makes sure that only customers who have at least one property listed are included in the result table.

Since joins come up frequently during SQL interviews, try an interactive Spotify JOIN SQL question: SQL join question from Spotify

Preparing For The Zillow SQL Interview

The key to acing a Zillow SQL interview is to practice, practice, and then practice some more! Besides solving the above Zillow SQL interview questions, you should also solve the 200+ SQL exercises on DataLemur which come from companies like Amazon, Microsoft, Meta, and smaller tech companies. DataLemur SQL Interview Questions

Each DataLemur SQL question has multiple hints, fully explained answers along with a discussion board to see how others solved it and crucially, there is an interactive SQL code editor so you can instantly run your SQL query and have it executed.

To prep for the Zillow SQL interview it is also wise to solve interview questions from other tech companies like:

But if your SQL query skills are weak, don't worry about going right into solving questions – strengthen your SQL foundations with this DataLemur SQL Tutorial.

SQL interview tutorial

This tutorial covers topics including filtering strings based on patterns and advantages of CTEs vs. subqueries – both of which show up routinely during SQL job interviews at Zillow.

Zillow Data Science Interview Tips

What Do Zillow Data Science Interviews Cover?

Beyond writing SQL queries, the other topics to prepare for the Zillow Data Science Interview are:

Zillow Data Scientist

How To Prepare for Zillow Data Science Interviews?

The best way to prepare for Zillow Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from tech companies like Netflix, Google, & Airbnb
  • A Refresher covering SQL, Product-Sense & ML
  • Great Reviews (900+ reviews, 4.5-star rating)

Acing Data Science Interview

You should also familiarize yourself with different Zillow Data Science & ML use cases: ML use cases at Zillow