Data Science, Data Engineering, and Data Analytics employees at Graybar use SQL to analyze sales trends accross different electrical products, helping them see which items are selling well and which ones need more attention. They also rely on SQL to manage their large inventory database, ensuring that products are distributed smoothly and efficiently, this is why Graybar includes SQL questions during interviews.
So, to help prep you for the Graybar SQL interview, we've collected 10 Graybar Electric SQL interview questions in this blog.
Graybar, being a Fortune 500 wholesale distributor, carries a vast number of products. Each product has a unique product id, and numerous customers review products every day. The reviews are stored in a table named "reviews".
Can you write a SQL query that calculates the average reviews per product for each month?
The table has these columns: (integer), (integer), (timestamp), (integer), and (integer - given stars by the user, between 1 and 5).
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
1 | 123 | 2021-05-20 00:00:00 | 50001 | 4 |
2 | 265 | 2021-05-25 00:00:00 | 69852 | 4 |
3 | 362 | 2021-05-25 00:00:00 | 50001 | 3 |
4 | 192 | 2021-06-20 00:00:00 | 69852 | 3 |
5 | 981 | 2021-06-25 00:00:00 | 69852 | 2 |
The SQL query leverages the function to truncate to month. This results in grouping the data on a per-month, per-product basis. The function is used to calculate the average number of stars (reviews) for each product, per month. To maintain precision while displaying the average, function's output is type-casted to DECIMAL with precision 3 (allowing a maximum of 999 average stars, although in practical use stars should be between 1 and 5), and scale 2 (allowing two decimal places for more detailed averages).
For more window function practice, try this Uber SQL problem on DataLemur's interactive SQL code editor:
Assume you had a table of Graybar employee salary data. Write a SQL query to find the top 3 highest paid employees in each department.
employee_id | name | salary | department_id |
---|---|---|---|
1 | Emma Thompson | 3800 | 1 |
2 | Daniel Rodriguez | 2230 | 1 |
3 | Olivia Smith | 2000 | 1 |
4 | Noah Johnson | 6800 | 2 |
5 | Sophia Martinez | 1750 | 1 |
8 | William Davis | 6800 | 2 |
10 | James Anderson | 4000 | 1 |
department_id | department_name |
---|---|
1 | Data Analytics |
2 | Data Science |
department_name | name | salary |
---|---|---|
Data Analytics | James Anderson | 4000 |
Data Analytics | Emma Thompson | 3800 |
Data Analytics | Daniel Rodriguez | 2230 |
Data Science | Noah Johnson | 6800 |
Data Science | William Davis | 6800 |
You can solve this interview question and run your code right in DataLemur's online SQL environment:
We use the DENSE_RANK() window function to generate unique ranks for each employee's salary within their department, with higher salaries receiving lower ranks. Then, we wrap this up in a CTE and filter the employees with a ranking of 3 or lower.
If the solution above is tough, you can find a step-by-step solution with hints here: Top 3 Department Salaries.
If you want to return records with no duplicates, you can use the keyword in your statement.
For example, if you had a table of Graybar employees, and wanted to see what are all the unique job titles that currently worked at the company, you could write the following query:
If had the following data:
f_name | job_title |
---|---|
Akash | Data Analyst |
Brittany | Data Scientist |
Carlos | Data Engineer |
Diego | Data Engineer |
Eva | Data Analyst |
Then the output from the query would be:
job_title |
---|
Data Analyst |
Data Scientist |
Data Engineer |
Graybar, a Fortune 500 corporation, is a leader in the distribution of high quality electrical, communications, and data networking products and a provider of related supply chain management services.
Assume that you have been given two tables, and .
The table includes information about each product such as Product ID, Product Name and Product Category.
The table includes information about each sale transaction such as the Sale ID, Product ID (which product was sold), Customer ID (who purchased the product) and the Sale Date.
You need to analyze the sales data for Graybar's product categories. Your task is to write a SQL query that can help Graybar identify the top 5 product categories based on the total sales count in the year 2022.
product_id | product_name | product_category |
---|---|---|
1 | Product_1 | Category_1 |
2 | Product_2 | Category_2 |
3 | Product_3 | Category_3 |
4 | Product_4 | Category_1 |
5 | Product_5 | Category_2 |
sale_id | product_id | customer_id | sale_date |
---|---|---|---|
1 | 1 | 10 | 01/05/2022 |
2 | 2 | 20 | 02/08/2022 |
3 | 3 | 30 | 03/20/2022 |
4 | 1 | 40 | 04/12/2022 |
5 | 5 | 50 | 05/30/2022 |
The above PostgreSQL command would join the and table using the as a join key. It then groups the results by the and counts the total sales for each category for the year 2022. At last, it orders the result by in descending order and limits the result to the top 5 product categories.
A constraint ensures that all values in a column are different. It is often used in conjunction with other constraints, such as , to ensure that the data meets certain conditions.
For example, if you had Graybar employee data stored in a database, here's some constraints you'd use:
In the Graybar employee example, the constraint is applied to the "email" field to ensure that each employee has a unique email address. This helps to ensure the integrity of the data in the database and prevents errors that could occur if two employees had the same email address.
Given a customer records database, write a query to filter customers who made a purchase in the last 30 days and spent more than $500. These customers should be either from the 'Construction' industry or the 'Industrial' industry. Additionally, their preferred mode of contact should not be 'Email'.
Here's some sample data for this question:
customer_id | last_purchase_date | total_spend | industry | preferred_contact |
---|---|---|---|---|
0010 | 06/20/2022 | 600 | Construction | |
0082 | 07/10/2022 | 700 | Industrial | Phone |
0512 | 06/27/2022 | 450 | Construction | Phone |
0345 | 07/15/2022 | 550 | Construction | |
1290 | 07/05/2022 | 670 | Industrial | Phone |
Here is a PostgreSQL query that can be used to filter the customers based on the conditions mentioned.
This solution uses the clause to filter the customer records. It uses the 'AND' and 'IN' SQL keywords to combine multiple conditions. The condition checks if the last purchase was made in the last 30 days. The condition checks if the total spend is more than $500. The condition checks if the customer is from the 'Construction' or 'Industrial' industry. The condition excludes customers whose preferred contact method is email. The '!=' operator is used to denote 'not equal to'.
Here is an example of a clustered index on the column of a table of Graybar customer transactions:
This creates a clustered index on the column, which will determine the physical order of the data rows in the table.
Here is an example of a non-clustered index on the column of the same table:
This will create a non-clustered index on the column, which will not affect the physical order of the data rows in the table.
In terms of query performance, a clustered index is usually faster for searches that return a large number of records, while a non-clustered index is faster for searches that return a small number of records. However, updates to a clustered index are slower, as they require the data rows to be physically rearranged, while updates to a non-clustered index are faster, as they only require the index data structure to be updated.
As an assistant at Graybar, a leading distributor of electrical, communications, and data networking products, you're asked to analyze the sales data. For every month, find the product with the maximum total sales amount.
sales_id | product_id | sale_date | sale_amount |
---|---|---|---|
2312 | 765 | 01/12/2022 | $100 |
1743 | 432 | 01/15/2022 | $200 |
2791 | 765 | 01/20/2022 | $150 |
9273 | 396 | 01/25/2022 | $140 |
6352 | 765 | 02/03/2022 | $170 |
8345 | 432 | 02/08/2022 | $230 |
9422 | 396 | 02/11/2022 | $210 |
month | year | max_sale_product_id | max_sale_sum |
---|---|---|---|
1 | 2022 | 765 | $250 |
2 | 2022 | 432 | $230 |
This query first groups the sales by , , and and calculates the total sale amount for each group. Then, for each month and year, it finds the maximum total sale amount. The final statement joins these two intermediate results on , , and . The result is the with the maximum total sales for each month and year.
Graybar is a Fortune 500 company, specializing in supply chain management services, and is one of the largest North American commercial lighting suppliers. Your task is to filter down their customer records database and find records of customers whose first names start with 'J' and live in 'New York'. Use the SQL keyword LIKE for this problem.
customer_id | first_name | last_name | city | state |
---|---|---|---|---|
121 | John | Smith | San Francisco | California |
342 | Jane | Doe | New York | New York |
465 | James | Brown | New York | New York |
592 | Joe | Davis | Austin | Texas |
620 | Jack | White | Chicago | Illinois |
765 | Julie | Moon | New York | New York |
856 | Jasmine | Blue | Los Angeles | California |
customer_id | first_name | last_name | city | state |
---|---|---|---|---|
342 | Jane | Doe | New York | New York |
465 | James | Brown | New York | New York |
765 | Julie | Moon | New York | New York |
The SQL query to solve this task would look as follows:
This query uses the SQL operator in conjunction with the '%' wildcard to filter for customers whose first names start with 'J'. It also uses the AND operator to further filter for customers who live in New York. This query will return a table with columns for , , , , and for any customer who matches both conditions.
The / operator is used to remove to return all rows from the first statement that are not returned by the second statement.
Note that is available in PostgreSQL and SQL Server, while is available in MySQL and Oracle (but don't stress about knowing which DBMS supports what exact commands since the interviewers at Graybar should be lenient!).
Here's a PostgreSQL example of using to find all of Graybar's Facebook video ads with more than 10k views that aren't also being run on YouTube:
If you want to retain duplicates, you can use the operator instead of . The operator will return all rows, including duplicates.
The best way to prepare for a Graybar SQL interview is to practice, practice, practice. Besides solving the above Graybar SQL interview questions, you should also solve the 200+ DataLemur SQL Interview Questions which come from companies like Microsoft, Google, Amazon, and tech startups.
Each problem on DataLemur has multiple hints, fully explained answers along with a discussion board to see how others solved it and crucially, there is an interactive coding environment so you can instantly run your SQL query answer and have it graded.
To prep for the Graybar SQL interview it is also useful to practice SQL problems from other industrial and electrical distribution companies like:
In case your SQL foundations are weak, don't worry about jumping right into solving questions – strengthen your SQL foundations with this free SQL for Data Analytics course.
This tutorial covers SQL topics like Union vs. UNION ALL and filtering data with WHERE – both of these pop up routinely during Graybar SQL interviews.
Beyond writing SQL queries, the other types of problems tested in the Graybar Data Science Interview are:
To prepare for Graybar Data Science interviews read the book Ace the Data Science Interview because it's got:
Don't ignore the behavioral interview – prep for it with this guide on acing behavioral interviews.