logo

11 Procore SQL Interview Questions (Updated 2024)

Updated on

March 2, 2024

At Procore, SQL is used quite frequently for analyzing construction project data for insights, and for BI reporting within the company. Because of this, Procore often tests SQL coding questions in interviews for Data Analytics, Data Science, and Data Engineering jobs.

To help you study for the Procore SQL interview, we've collected 11 Procore SQL interview questions – can you solve them?

11 Procore SQL Interview Questions

SQL Question 1: Identify Procore Power Users

For a construction management software company like Procore, a VIP user is likely one who not only logs into the app frequently, but who also actively creates and manages projects. We might define a "power user" as someone who logs into Procore at least 5 days a week and creates at least 2 projects per week.

The task would be to write a SQL query to identify these power users from Procore's user activity and project creation logs table and return their user ID, the total number of logins, and the total number of projects created.

Given the following tables:

Example Input:
log_iduser_idlogin_date
1001106/10/2022
1002106/11/2022
1003206/10/2022
1004106/12/2022
1005206/13/2022
Example Input:
project_iduser_idcreation_date
501106/10/2022
502206/10/2022
503106/11/2022
504206/13/2022
505106/13/2022
506106/12/2022
507106/13/2022

Answer:

For this problem, we will join two subqueries: One counting the unique login days per user and one counting the project creations per user.


This query fetches the users who logged in at least 5 unique days in the past week and those who created at least 2 projects in the past week. Only users who match both criteria (power users) are returned in the final result. Note: refers to the current date and time.

To solve a related customer analytics question on DataLemur's free interactive coding environment, try this Microsoft Azure Cloud SQL Interview Question: Microsoft SQL Interview Question: Super Cloud Customer

SQL Question 2: Analyze Monthly Average Project Ratings

Suppose you're given a dataset containing project feedback for the company Procore. The data are stored in a table titled "project_reviews" where each record indicates a review left by a user for a specific project. Each project review is identified by an unique review id and contains the user id, the date the review was submitted, the specific project id, and a rating out of 5 stars. Procore would like to analyze how its projects are performing monthly, so they need a SQL query that can calculate the average rating of each project per month.

Here's some example input data and expected output:

Example Input:

review_iduser_idsubmit_dateproject_idstars
17391232022-04-191113
18205202022-04-052225
19342452022-05-033334
20553402022-05-172222
21614562022-06-251114

Expected Output:

monthproject_idavg_rating
41113
42225
52222
53334
61114

SQL Answer:

You could use PostgreSQL's function to extract the month from the submit_date. Then, group by month and project_id, and calculate the average rating:


This query breaks down the average rating by project for each month. It also orders the result set by month and project ID so that it's easy to follow how each project is performing over time.

To solve a related window function SQL problem on DataLemur's free online SQL code editor, solve this Google SQL Interview Question: Google SQL Interview Question

SQL Question 3: What is denormalization, and in what situations might it be a useful?

Database denormalization is when you add redundancy to a database, and break typical normalization rules (specified by 1st, 2nd, 3rd normal forms). There's a few reasons to denormalize a database:

  • Improved performance: Denormalization often reduces the the number of costly join operations that are needed to retrieve data. This is helpful when the database is being used for OLAP (Online Analytical Processing) use cases, as joins can be expensive and slow.

  • Simplification: Denormalization can also be used to simplify the design of a database by reducing the number of tables and relationships that need to be managed. This can make it easier to understand and maintain the database.

  • Ease of use: Denormalization can also make it easier for users to work with a database by providing them with a more intuitive and straightforward data model.

While denormalization can be a useful tool for improving performance and scalability, it's important to keep in mind that it can make update and delete operations more complex. This is because denormalization can create data duplicates, which can make it harder to maintain data integrity. So, to avoid any headaches, it's a good idea to start with a well-normalized database design and then consider denormalization only if it's absolutely necessary for your specific performance and scalability needs. In other words, denormalization can be a bit of a wild card, so it's best to handle it with care!

Procore SQL Interview Questions

SQL Question 4: Project Resource Usage

Procore is a company that provides construction project management software. One of the challenging tasks they face is to track the resource usage in their ongoing projects. Let's assume that for each project, there are different types of resources like personnel, materials, and machines. The question is how would you design a relational database to efficiently keep track of usage of these various resources in each project? Furthermore, can you write a PostgreSQL query to find out the total number of hours each type of resource is utilized on a given project?

Sample Tables:

Example Input:
project_idproject_namestart_dateend_date
1Project Alpha06/01/202209/01/2022
2Project Beta07/01/202211/01/2022
3Project Gamma08/01/202212/01/2022
Example Input:
resource_type_idresource_Type
1Personnel
2Material
3Machine
Example Input:
resource_usage_idproject_idresource_type_idhours_used
111500
212300
313200
421800
522400
631600
733500
Example Output:
project_nameresource_typetotal_hours_used
Project AlphaPersonnel500
Project AlphaMaterial300
Project AlphaMachine200
Project BetaPersonnel800
Project BetaMaterial400
Project GammaPersonnel600
Project GammaMachine500

Answer:


This query joins the , , and tables on their respective ID fields. The result is grouped by project name and resource type, and the function is used to add up all the hours used for each combination. This gives us a comprehensive view of resource usage per project for each type of resource.

SQL Question 5: What distinguishes a left join from a right join?

A join in SQL allows you to retrieve data from multiple tables and combine it into a single set of results.

To demonstrate the difference between left vs. right join, imagine you had two database tables, an advertising campaigns table which had information on Google Ads keywords and how much was bid for each keyword, and a sales table, which has data on how many products were sold and which Google Ads keyword drove that sale.

: A LEFT JOIN retrieves all rows from the left table (in this case, the table) and any matching rows from the right table (the Sales table). If there is no match in the right table, values will be returned for the right table's columns.

: A RIGHT JOIN retrieves all rows from the right table (in this case, the Sales table) and any matching rows from the left table (the table). If there is no match in the left table, values will be returned for the left table's columns.

SQL Question 6: Average Project Duration at Procore

Given the table below, write a SQL query to calculate the average duration of projects in days for each project manager.

Example Input:
project_idstart_dateend_dateproject_manager_id
12012020-01-152020-05-22555
12022020-02-062020-06-12555
12032020-03-012020-06-25777
12042020-01-202020-05-30555
12052020-02-012020-06-10777

Answer:


This query calculates the duration of each project by subtracting from and then averages these durations for each project manager. This average gives the mean project duration for each manager in days.

To practice a very similar question try this interactive Facebook Average Post Hiatus (Part 1) Question which is similar for calculating duration between dates or this Microsoft Teams Power Users Question which is similar for grouping and ranking data.

SQL Question 7: When it comes to database normalization, what's the difference between 1NF, 2NF, and 3NF?

There are several normal forms that define the rules for normalizing a database:

A database is in first normal form (1NF) if it meets the following criteria:

  • Each column in a table contains a single value (no lists or containers of data)
  • Each column should contain the same type of data (no mixing strings vs. integers)
  • Each row in the table is unique

A database is in second normal form (2NF) if it meets the following criteria:

  • It is in first normal form.
  • All non-key attributes in a table are fully dependent on the primary key.

Said another way, to achieve 2NF, besides following all the rules from 1NF all the columns in a given table should be dependent only on that table's primary key.

A database is in third normal form (3NF) if it meets the following criteria:

  • It is in second normal form.
  • There are no transitive dependencies in the table.

A transitive dependency means that a piece of data in one column is derived from another column. For example, it wouldn't make sense to keep a column called "user's age" and "user's birthdate" (because age can be derived from birthdate.

While there's also a 4th and 5th normal form, it's too pedantic and hence out-of-scope to know for the Procore SQL interview.

SQL Question 8: Finding Customers Based on Email Domains

Procore is a company that deals with customers worldwide and has a huge customer records database. As a data analyst at Procore, your duty includes filtering out the customers based on their email domains. For this task, your responsibility is to write a SQL query that filters out the customers having an email domain of .

Example Input:
customer_idfirst_namelast_nameemailsign_up_date
6543JohnDoejohn.doe@gmail.com01/14/2022
4312MarySmithmary.smith@outlook.com02/25/2022
5523AliceJohnsonalice.johnson@hotmail.com03/14/2022
7123JamesWilsonjames.wilson@gmail.com04/05/2022
5432EmilyTayloremily.taylor@yahoo.com05/21/2022
Example Output:
customer_idfirst_namelast_nameemail
6543JohnDoejohn.doe@gmail.com
7123JamesWilsonjames.wilson@gmail.com

Answer:


This query searches in the 'email' field of the 'customers' table for any entries ending with '@gmail.com'. The '%' sign is a wildcard character that matches any sequence of characters. Thus, it filters out the customers who have a Gmail account.

SQL Question 9: Average Project Spending Per Customer

As a data analyst at Procore, your manager asks you to analyze the company's sales database. The sales database comprises two main tables and . The table contains the customer 'id', 'name', and 'company', while the 'project_spending' table contains 'customer_id', 'project_id', and 'spending'. Write a SQL query to find average project spending per customer and arrange the result by company name.

The tables are structured as follows:

Example Input:
idnamecompany
1John DoeABC Corp
2Jane SmithXYZ Inc
3Robert JohnsonABC Corp
4Emily DavisPQR Partners
Example Input:
customer_idproject_idspending
1100050000
2200030000
3300040000
1400060000
45000100000

Answer:


This query first performs an inner join on and tables. The join is performed on 'customer_id' from 'project_spending' table and 'id' from the 'customer_info' table. After joining the tables, it groups by 'company' and calculates the average 'spending' for each company. Finally, it arranges the results in order of 'company' name.

Example Output:
companyavg_spending
ABC Corp50000
PQR Partners100000
XYZ Inc30000

Because joins come up so often during SQL interviews, try an interactive Spotify JOIN SQL question: SQL join question from Spotify

SQL Question 10: What does the function do, and when would you use it?

The COALESCE() function can be used to replace NULL values with a specified value. For instance, if you are a Data Analyst at a company and are working on a customer analytics project to identify the most engaged customers, you may have access to the following data in the table:

customer_idemail_engagementsms_engagement
101moderately_activenot_opted_in
201un-subscribedNULL
301NULLnot_opted_in
401not_activevery_active
501very_activevery_active
601NULLNULL

Before you could procede with your analysis, you would need to remove the NULLs, and replace them with the default value for email engagement (not_active), and the default sms_engagement level (not_opted_in).

To do this, you'd run the following query:


This would get you the following output:

customer_idemail_engagementsms_engagement
101moderately_activenot_opted_in
102un-subscribednot_opted_in
103not_activenot_opted_in
104not_activevery_active
105very_activevery_active
106not_activenot_opted_in

SQL Question 11: Calculate The Total Costs Of Projects

Procore is a software company that provides cloud-based construction management software. Suppose the company is tracking details of projects in a table including data on the estimated and actual costs and wishes to know the cost deviation (percentage) of each project. The cost deviation is (actual_cost-estimated_cost)/estimated_cost*100. Write a SQL query to return the project name, the absolute value of the round-off of the cost deviation (rounded to two decimal places) and the square root of the actual cost. Assume the estimated cost and actual cost are in millions.

Example Input:
project_idproject_nameestimated_costactual_cost
1Hotel Construction109.2
2Mall Construction78.2
3Bridge Repair56.2
4Restaurant Interior32.9
5Highway Expansion1211.9
Example Output:
project_namecost_deviationsqrt_actual_cost
Hotel Construction8.003.03
Mall Construction17.142.86
Bridge Repair24.002.49
Restaurant Interior3.331.70
Highway Expansion0.833.45

Answer:


This SQL query first calculates the cost deviation by subtracting the estimated cost from the actual one and dividing it by the estimated cost. It then multiplies the result by 100 to get a percentage. It uses the absolute function to ensure the output is always positive. This represents the percentage of deviation from the estimated cost, regardless of whether it's over or under. The query also fetches the square root of the actual cost. Both the cost deviation and the square root of the actual cost are rounded off to two decimal places for optimal precision and readability.

To practice a very similar question try this interactive Wayfair Y-on-Y Growth Rate Question which is similar for requiring numerical calculation on datasets or this McKinsey 3-Topping Pizzas Question which is similar for needing to perform operations on costs.

Procore SQL Interview Tips

The key to acing a Procore SQL interview is to practice, practice, and then practice some more! Beyond just solving the above Procore SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Google, Facebook, Microsoft and Amazon. DataLemur SQL and Data Science Interview Questions

Each interview question has multiple hints, fully explained answers along with a discussion board to see how others solved it and most importantly, there's an online SQL coding environment so you can easily right in the browser your query and have it graded.

To prep for the Procore SQL interview you can also be useful to practice SQL questions from other tech companies like:

However, if your SQL skills are weak, forget about diving straight into solving questions – go learn SQL with this SQL tutorial for Data Scientists & Analysts.

DataLemur SQL Tutorial for Data Science

This tutorial covers SQL topics like handling timestamps and aggreage functions like MIN()/MAX() – both of which pop up often in Procore SQL assessments.

Procore Data Science Interview Tips

What Do Procore Data Science Interviews Cover?

In addition to SQL interview questions, the other topics covered in the Procore Data Science Interview are:

Procore Data Scientist

How To Prepare for Procore Data Science Interviews?

The best way to prepare for Procore Data Science interviews is by reading Ace the Data Science Interview. The book's got:

  • 201 Interview Questions from tech companies like Google & Microsoft
  • A Crash Course covering Python, SQL & ML
  • Amazing Reviews (900+ reviews, 4.5-star rating)

Acing Data Science Interview