# 8 Chegg SQL Interview Questions (Updated 2024)

Updated on June 23, 2024

At Chegg, SQL is used to analyze user behavior for insights that improve its services, and to manage the vast educational content database. Because of this, Chegg LOVES to ask SQL questions during interviews for Data Science and Data Engineering positions.

To help you study, here are 8 Chegg SQL interview questions. Can you answer each one?

## 8 Chegg SQL Interview Questions

### SQL Question 1: Analyze Monthly Average Ratings of Products

Chegg is interested in tracking how well their educational products are faring on a monthly basis. They're interested in looking at the performance of each product based on the average rating that it received in each month.

You have been given a "reviews" table with the following columns:

• `review_id` (an ID uniquely identifying each review),
• `user_id` (an ID uniquely identifying users),
• `submit_date` (the date when the user submitted the review),
• `product_id` (the product's ID),
• `stars` (the rating the user gave the product, on a 1-5 scale)

Write a SQL query to find the monthly average rating for each product. Round any averages to two digits after the decimal point.

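##### Example Input:

| review_id | user_id | submit_date | product_id | stars |
|---|---|---|---|---|
| 6171 | 123 | 2022-06-08 | 50001 | 4 |
| 7802 | 265 | 2022-06-10 | 69852 | 4 |
| 5293 | 362 | 2022-06-18 | 50001 | 3 |
| 6352 | 192 | 2022-07-26 | 69852 | 3 |
| 4517 | 981 | 2022-07-05 | 69852 | 2 |

##### Example Output:

| month | product | avg_stars |
|---|---|---|
| 6 | 50001 | 3.50 |
| 6 | 69852 | 4.00 |
| 7 | 69852 | 2.50 |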

The SQL query uses a date function such as `EXTRACT` to get the month from `submit_date`, and the `AVG` function to find the average stars given by users. The `ROUND` function rounds the average to two decimal places. The result is then grouped by month and `product_id`.
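The query described above can be sketched as follows. This is a minimal sketch, not the official solution: it runs the SQL against an in-memory SQLite database through Python's `sqlite3` module so that it is self-contained. Because SQLite has no `EXTRACT`, `strftime('%m', ...)` stands in for PostgreSQL's `EXTRACT(MONTH FROM submit_date)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (review_id INTEGER, user_id INTEGER, "
    "submit_date TEXT, product_id INTEGER, stars INTEGER)"
)
# Rows copied from the example input above.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?, ?)",
    [
        (6171, 123, "2022-06-08", 50001, 4),
        (7802, 265, "2022-06-10", 69852, 4),
        (5293, 362, "2022-06-18", 50001, 3),
        (6352, 192, "2022-07-26", 69852, 3),
        (4517, 981, "2022-07-05", 69852, 2),
    ],
)

# strftime('%m', ...) is the SQLite stand-in for EXTRACT(MONTH FROM ...).
query = """
SELECT CAST(strftime('%m', submit_date) AS INTEGER) AS month,
       product_id AS product,
       ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
GROUP BY month, product
ORDER BY month, product
"""
monthly_avg = conn.execute(query).fetchall()
for row in monthly_avg:
    print(row)
```

Note that Python prints the averages as floats (`3.5` rather than `3.50`); the values match the example output.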

### SQL Question 2: Well Paid Employees

Assume you had a table of Chegg employee salaries. Write a SQL query to find the employees who earn more than their own manager.

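#### Chegg Example Input:

| employee_id | name | salary | department_id | manager_id |
|---|---|---|---|---|
| 1 | Emma Thompson | 3800 | 1 | |
| 2 | Daniel Rodriguez | 2230 | 1 | 10 |
| 3 | Olivia Smith | 8000 | 1 | 8 |
| 4 | Noah Johnson | 6800 | 2 | 8 |
| 5 | Sophia Martinez | 1750 | 1 | 10 |
| 8 | William Davis | 7000 | 2 | NULL |
| 10 | James Anderson | 4000 | 1 | NULL |

#### Example Output:

| employee_id | employee_name |
|---|---|
| 3 | Olivia Smith |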

This is the output because Olivia Smith earns \$8,000, surpassing her manager, William Davis, who earns \$7,000.

The solution performs a self-join on the employee table, treating one copy as the managers' table and the other as the employees' table. A `WHERE` clause then filters the results so that only employees whose salaries are higher than their manager's salary are returned.
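A minimal sketch of that self-join, run against SQLite through Python's `sqlite3` module. The table name `employee` and the aliases `mgr` and `emp` are assumptions made for illustration, and Emma Thompson's `manager_id` is stored as `NULL` because the sample data leaves it blank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER, name TEXT, salary INTEGER, "
    "department_id INTEGER, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Emma Thompson", 3800, 1, None),  # manager_id blank in the sample
        (2, "Daniel Rodriguez", 2230, 1, 10),
        (3, "Olivia Smith", 8000, 1, 8),
        (4, "Noah Johnson", 6800, 2, 8),
        (5, "Sophia Martinez", 1750, 1, 10),
        (8, "William Davis", 7000, 2, None),
        (10, "James Anderson", 4000, 1, None),
    ],
)

# mgr is the manager copy of the table, emp is the employee copy.
query = """
SELECT emp.employee_id, emp.name AS employee_name
FROM employee AS mgr
INNER JOIN employee AS emp
        ON mgr.employee_id = emp.manager_id
WHERE emp.salary > mgr.salary
"""
well_paid = conn.execute(query).fetchall()
print(well_paid)
```

Employees without a manager never match the join condition, so they drop out of the result without any extra filtering.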

If the solution above is confusing, you can find a detailed solution with hints here: Employees Earning More Than Their Boss.

### SQL Question 3: Can you explain what a cross-join is and the purpose of using them?

A cross-join, also known as a Cartesian join, is like a mad scientist's laboratory experiment gone wild. It takes two tables and mixes them together to create a crazy new table with every possible combination of rows from the original tables.

For example, if you have 20 products and 10 colors, a cross-join of the two tables gives you 200 rows right there! Cross-joins are great for generating all possible combinations, but they can also create really big tables if you're not careful. Just like a mad scientist, use your powers wisely!
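The 20 x 10 arithmetic can be checked with a small sketch using SQLite through Python's `sqlite3` module. The `products` and `colors` tables, their column names, and the generated row values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT)")
conn.execute("CREATE TABLE colors (color_name TEXT)")

# 20 hypothetical products and 10 hypothetical colors.
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [(f"product_{i}",) for i in range(1, 21)],
)
conn.executemany(
    "INSERT INTO colors VALUES (?)",
    [(f"color_{i}",) for i in range(1, 11)],
)

# CROSS JOIN has no ON clause: every product is paired with every color.
combos = conn.execute(
    "SELECT p.product_name, c.color_name "
    "FROM products AS p CROSS JOIN colors AS c"
).fetchall()
print(len(combos))
```

The row count is always the product of the two table sizes, which is why cross-joins on large tables grow so quickly.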

### SQL Question 4: Find the Average Monthly Ratings for Each Textbook

For the online textbook rental company Chegg, an important part of analyzing its products (the textbooks) is understanding how they are reviewed by users. For this question, you are given a table with the following columns: `review_id`, `user_id`, `submit_date`, `book_id`, `stars`.

Write a SQL query to find each textbook's average star rating for each month. The output should include the month, the book (represented by the `book_id`), and the average star rating for that book for the given month.

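##### Example Input:

| review_id | user_id | submit_date | book_id | stars |
|---|---|---|---|---|
| 1274 | 125 | 2019-01-05 | 70001 | 4 |
| 7832 | 230 | 2019-01-07 | 70001 | 5 |
| 9501 | 354 | 2019-01-15 | 70002 | 2 |
| 1250 | 195 | 2019-01-20 | 70002 | 3 |
| 3591 | 981 | 2019-02-04 | 70001 | 4 |
| 6503 | 137 | 2019-02-08 | 70002 | 3 |
| 9235 | 502 | 2019-02-14 | 70001 | 5 |
| 7200 | 824 | 2019-02-20 | 70002 | 4 |

##### Example Output:

| mth | book_id | avg_stars |
|---|---|---|
| 1 | 70001 | 4.50 |
| 1 | 70002 | 2.50 |
| 2 | 70001 | 4.50 |
| 2 | 70002 | 3.50 |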

The query extracts the month from `submit_date` using a date function such as `EXTRACT` and groups the result by month and book ID. For each group (that is, each month and book), it calculates the average rating with the `AVG` function, rounded to 2 decimal places. The result is sorted by month in ascending order and by average stars in descending order, so the textbooks with higher ratings appear first within each month.
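Here is a minimal sketch of such a query, with the emphasis on the two-key `ORDER BY`. It runs on SQLite through Python's `sqlite3` module. The table name `reviews` is an assumption, and `strftime('%m', ...)` stands in for PostgreSQL's `EXTRACT(MONTH FROM submit_date)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (review_id INTEGER, user_id INTEGER, "
    "submit_date TEXT, book_id INTEGER, stars INTEGER)"
)
# Rows copied from the example input above.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?, ?)",
    [
        (1274, 125, "2019-01-05", 70001, 4),
        (7832, 230, "2019-01-07", 70001, 5),
        (9501, 354, "2019-01-15", 70002, 2),
        (1250, 195, "2019-01-20", 70002, 3),
        (3591, 981, "2019-02-04", 70001, 4),
        (6503, 137, "2019-02-08", 70002, 3),
        (9235, 502, "2019-02-14", 70001, 5),
        (7200, 824, "2019-02-20", 70002, 4),
    ],
)

# Sort by month first, then put the best-rated book first within each month.
query = """
SELECT CAST(strftime('%m', submit_date) AS INTEGER) AS mth,
       book_id,
       ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
GROUP BY mth, book_id
ORDER BY mth ASC, avg_stars DESC
"""
book_ratings = conn.execute(query).fetchall()
for row in book_ratings:
    print(row)
```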

### SQL Question 5: What's an index, and what are the different types?

A database index is a data structure that improves the speed of data retrieval operations on a database table.

There are a few different types of indexes that can be used in a database:

• Primary index: a unique identifier is used to access the row directly.
• Unique index: used to enforce the uniqueness of the indexed columns in a table.
• Composite index: created on multiple columns of a table, and used to speed up searches that involve those columns.
• Clustered index: determines the physical order of the data in a table.

For a concrete example, say you had a table of Chegg customer payments that includes a `payment_date` column.

A clustered index on the `payment_date` column would determine the physical order of the records in the table based on `payment_date`. This means that the records with the earliest `payment_date` values would be stored together physically in the table, followed by records with later `payment_date` values. This speeds up queries that filter or sort the data by `payment_date`, as the records are already physically sorted in the table. For example, if you want to retrieve all the payments made in the month of June, the database can use the clustered index to quickly locate and retrieve the desired records without having to search through the entire table.
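To see an index being used for the June lookup, here is a rough sketch with SQLite through Python's `sqlite3` module. SQLite has no `CREATE CLUSTERED INDEX` statement, so this sketch creates an ordinary (non-clustered) index on `payment_date` and inspects the query plan. The `payments` table, its other columns, the index name, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    "payment_id INTEGER PRIMARY KEY, amount REAL, payment_date TEXT)"
)
# Hypothetical sample rows, with dates stored as ISO-8601 text.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        (1, 19.99, "2022-05-28"),
        (2, 14.95, "2022-06-03"),
        (3, 9.99, "2022-06-21"),
        (4, 19.99, "2022-07-02"),
    ],
)
conn.execute(
    "CREATE INDEX idx_payments_payment_date ON payments (payment_date)"
)

june_filter = (
    "SELECT payment_id, amount FROM payments "
    "WHERE payment_date >= '2022-06-01' AND payment_date < '2022-07-01' "
    "ORDER BY payment_date"
)
june_payments = conn.execute(june_filter).fetchall()

# EXPLAIN QUERY PLAN reports how SQLite intends to run the query;
# the fourth column of each plan row is a human-readable description.
plan = conn.execute("EXPLAIN QUERY PLAN " + june_filter).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(june_payments)
print(plan_text)
```

In engines that do support clustered indexes, the syntax differs. SQL Server, for instance, uses `CREATE CLUSTERED INDEX`.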

### SQL Question 6: Filter Users Using LIKE

As a Data Analyst at Chegg, you are asked to gather data on users who have signed up to the website. Specifically, the Product Team wants to know how many users have an email from a specific domain, like 'gmail.com'. The users table has a column 'user_email' containing the user's email address.

You may assume the table structure is as follows:

##### Example Input:

| | | | user_email |
|---|---|---|---|
| 101 | JohnD23 | 04/03/2020 | john.doe@gmail.com |
| 102 | JaneD33 | 05/10/2021 | jane_doe@yahoo.com |
| 103 | TomW12 | 02/23/2020 | twilliams@hotmail.com |
| 104 | SaraJ21 | 06/14/2022 | sjohnson@gmail.com |
| 105 | MikeG45 | 11/18/2021 | gmichael@yahoo.com |

Write a SQL query to find out the total number of users signed up with the email domain 'gmail.com'.

The query filters the `users` table to rows where `user_email` matches 'gmail.com' and counts the records that satisfy this criterion. The '%' sign is a wildcard character that matches any sequence of characters. Placed before 'gmail.com', it ensures that an email is selected regardless of what comes before the domain.
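A minimal sketch of such a query, run on SQLite through Python's `sqlite3` module. Only `user_email` comes from the problem statement. The `user_id` column name and the exact pattern `'%gmail.com'` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the columns needed for the count are modeled here.
conn.execute("CREATE TABLE users (user_id INTEGER, user_email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (101, "john.doe@gmail.com"),
        (102, "jane_doe@yahoo.com"),
        (103, "twilliams@hotmail.com"),
        (104, "sjohnson@gmail.com"),
        (105, "gmichael@yahoo.com"),
    ],
)

# The leading % matches whatever precedes the domain.
query = "SELECT COUNT(*) FROM users WHERE user_email LIKE ?"
gmail_count = conn.execute(query, ("%gmail.com",)).fetchone()[0]
print(gmail_count)
```

A stricter pattern such as `'%@gmail.com'` would avoid matching other domains that merely end in "gmail.com".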

### SQL Question 7: Give a few ways in SQL that you can identify duplicate records in a table?

One creative way is to use the window function `ROW_NUMBER()` and partition by whichever columns you are looking for duplicates in. If any row has a `ROW_NUMBER()` greater than 1, then it's a duplicate!

You could also compare `COUNT(DISTINCT col1)` with `COUNT(*)` and see if they return different numbers, indicating the presence of duplicates!
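Both approaches can be sketched on a small hypothetical table, run on SQLite through Python's `sqlite3` module. The `signups` table, its columns, and its rows are made up for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),
        (4, "c@example.com"),
        (5, "a@example.com"),
    ],
)

# Approach 1: number the rows within each email; anything past 1 is a repeat.
duplicates = conn.execute(
    """
    SELECT id, email
    FROM (
        SELECT id, email,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
        FROM signups
    ) AS numbered
    WHERE rn > 1
    ORDER BY id
    """
).fetchall()

# Approach 2: differing counts mean at least one value repeats.
total_rows, distinct_emails = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT email) FROM signups"
).fetchone()

print(duplicates)
print(total_rows, distinct_emails)
```

The first approach tells you which rows are duplicates, while the second only tells you that duplicates exist.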

### SQL Question 8: Compute the average monthly expenditure of customers

Consider two tables in Chegg's database, `Customers` and `Orders`. The `Customers` table contains information about the customers, including their IDs, names, and locations. The `Orders` table contains all the information related to customer orders, including OrderID, CustomerID, ProductName, Quantity, Price, and OrderDate.

Write a SQL query to find the average monthly expenditure of the customers by joining the Customers and Orders tables. The output should give the Customer's name, Month, and their average monthly expenditure.

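##### Customers Example Input:

| CustomerID | Name | Location |
|---|---|---|
| 1 | John Doe | California |
| 2 | Jane Smith | Texas |
| 3 | Mike Davis | New York |

##### Orders Example Input:

| OrderID | CustomerID | ProductName | Quantity | Price | OrderDate |
|---|---|---|---|---|---|
| 111 | 1 | TestBook1 | 2 | 10.00 | 06/10/2022 |
| 222 | 2 | TestBook2 | 1 | 15.00 | 06/12/2022 |
| 333 | 1 | TestBook3 | 1 | 25.00 | 07/15/2022 |
| 444 | 3 | TestBook4 | 3 | 20.00 | 07/25/2022 |
| 555 | 2 | TestBook3 | 2 | 25.00 | 08/05/2022 |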

The SQL query joins the Orders and Customers tables on CustomerID. It then groups the data by customer name and month (extracted from the order date). Finally, it calculates the average monthly expenditure by multiplying the quantity of each order by its price, and then taking the average of these order totals.
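A minimal sketch of that join and aggregation, run on SQLite through Python's `sqlite3` module. It stores `OrderDate` as text in the MM/DD/YYYY form shown in the sample, so `substr` pulls out the month. With a real `DATE` column in PostgreSQL you would use `EXTRACT(MONTH FROM OrderDate)` instead. The output column name `AvgMonthlyExpenditure` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER, Name TEXT, Location TEXT)"
)
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER, "
    "ProductName TEXT, Quantity INTEGER, Price REAL, OrderDate TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        (1, "John Doe", "California"),
        (2, "Jane Smith", "Texas"),
        (3, "Mike Davis", "New York"),
    ],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (111, 1, "TestBook1", 2, 10.00, "06/10/2022"),
        (222, 2, "TestBook2", 1, 15.00, "06/12/2022"),
        (333, 1, "TestBook3", 1, 25.00, "07/15/2022"),
        (444, 3, "TestBook4", 3, 20.00, "07/25/2022"),
        (555, 2, "TestBook3", 2, 25.00, "08/05/2022"),
    ],
)

# The first two characters of MM/DD/YYYY are the month.
query = """
SELECT c.Name,
       CAST(substr(o.OrderDate, 1, 2) AS INTEGER) AS Month,
       ROUND(AVG(o.Quantity * o.Price), 2) AS AvgMonthlyExpenditure
FROM Orders AS o
JOIN Customers AS c
  ON o.CustomerID = c.CustomerID
GROUP BY c.Name, Month
ORDER BY c.Name, Month
"""
monthly_spend = conn.execute(query).fetchall()
for row in monthly_spend:
    print(row)
```

In the sample data each customer has at most one order per month, so each average equals that single order's total.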

### Chegg SQL Interview Tips

The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Beyond just solving the above Chegg SQL interview questions, you should also solve the 200+ DataLemur interview questions which come from companies like Microsoft, Google, and Facebook.

Each DataLemur SQL question has multiple hints, detailed solutions and crucially, there's an interactive coding environment so you can instantly run your SQL query answer and have it executed.

To prep for the Chegg SQL interview it is also a great idea to solve SQL questions from other tech companies like:

But if your SQL query skills are weak, forget about going right into solving questions – go learn SQL with this SQL tutorial for Data Scientists & Analysts.

This tutorial covers SQL concepts such as window functions and handling missing data (NULLs) – both of which show up often during Chegg SQL assessments.

### Chegg Data Science Interview Tips

#### What Do Chegg Data Science Interviews Cover?

Besides SQL interview questions, the other topics to prepare for the Chegg Data Science Interview include:

#### How To Prepare for Chegg Data Science Interviews?

I'm a tad biased, but I think the best way to study for Chegg Data Science interviews is to read the book I wrote: Ace the Data Science Interview.

The book solves 201 interview questions sourced from FAANG, tech startups, and Wall Street. It also has a refresher on SQL, Product-Sense & ML. Finally, it has helped thousands of people land their dream job in data, which is why it has over 1,000 5-star reviews on Amazon.