At Coursera, SQL is crucial for extracting data for course usage analytics and managing student data in the Learning Management System. That's the reason behind why Coursera often tests SQL query questions during interviews for Data Science, Data Engineering and Data Analytics jobs.
Thus, to help you prepare for the Coursera SQL interview, we've collected 8 Coursera SQL interview questions – able to answer them all?
Imagine we have a dataset of reviews for different courses on Coursera. We'd like to query this dataset in order to find the average score per course per month. Will you be able to create an SQL query that would yield the average (mean) review score (stars), per course (), per month, from this dataset?
We have the following example table:
review_id | user_id | submit_date | product_id | stars |
---|---|---|---|---|
1 | 123 | 2022-06-01 00:00:00 | 101 | 4 |
2 | 234 | 2022-06-20 00:00:00 | 102 | 3 |
3 | 345 | 2022-06-20 00:00:00 | 101 | 5 |
4 | 456 | 2022-07-10 00:00:00 | 102 | 2 |
5 | 567 | 2022-07-25 00:00:00 | 101 | 3 |
6 | 678 | 2022-07-28 00:00:00 | 102 | 5 |
To solve this, we will use the function to calculate the mean 'stars' and function to extract month from date. We will then group the rows by 'product_id' and month using .
This SQL statement would yield the average review score per course per month. The function formats the date to a string only displaying month and year. The clause groups the rows that have the same values in specified columns into aggregated data. In this case, it groups by 'product_id' and 'month'.
To practice another window function question on DataLemur's free interactive coding environment, try this Google SQL Interview Question:
As a Data analyst at Coursera, you are required to analyse the user activity of their platform. For each country, your task is to find out how many courses are taken by the users in the year 2022 until the end of September.
Use the following two tables for this task:
user_id | name | country |
---|---|---|
685 | Andrew | USA |
347 | Maria | Canada |
932 | Liam | USA |
487 | Rutvik | India |
291 | Emma | Canada |
course_id | user_id | course_name | start_date |
---|---|---|---|
731 | 685 | Data Science | 04/08/2022 |
894 | 347 | Machine Learning | 08/01/2022 |
662 | 932 | Computer Science | 06/10/2022 |
248 | 487 | Software Engineering | 03/19/2022 |
923 | 291 | Artificial Intelligence | 05/25/2022 |
This query will join the two tables on , and then filter for the courses that started during the year 2022 until September. It groups the remaining rows by country and counts the number of courses in each group. The final result will show how many courses were taken by users from each country during the specified timeframe.
In SQL, zero's are numerical values which can be used in calculations and comparisons just like any other number. A blank space, also known as an empty string, is a character value and can be used in character manipulation functions and comparisons.
NULLs aren't the same as zero's or blank spaces. NULLs represent unkonwn, missing, or not applicable values. They are not included in calculations and comparisons involving NULL values always result in NULL.
For Coursera, an online learning platform, the review system allows students to rate the courses they have taken. The question is: What is the average rating for each course on Coursera?
To answer this, we'll utilize the function in SQL, and consider two tables: and .
course_id | course_name | subject |
---|---|---|
101 | Python for Everybody | Programming |
102 | Data Science in R | R-Language |
103 | Machine Learning | Artificial Intelligence |
review_id | course_id | user_id | rating |
---|---|---|---|
1 | 101 | 200 | 5 |
2 | 101 | 201 | 4 |
3 | 102 | 202 | 3 |
4 | 102 | 203 | 2 |
5 | 103 | 204 | 3 |
6 | 103 | 205 | 4 |
The SQL query first joins the and tables via the . It then groups the data by and calculates the average for each group (course). The result will be a table having and their corresponding .
To practice a very similar question try this interactive Amazon Average Review Ratings Question which is similar for calculating average values using SQL or this Alibaba Compressed Mean Question which is similar for focusing on mean calculations.
hile a correlated subquery relies on columns in the main query's FROM clause and cannot function independently, a non-correlated subquery operates as a standalone query and its results are integrated into the main query.
An example correlated sub-query:
This correlated subquery retrieves the names and salaries of Coursera employees who make more than the average salary for their department. The subquery references the department column in the main query's FROM clause (e1.department) and uses it to filter the rows of the subquery's FROM clause (e2.department).
An example non-correlated sub-query:
This non-correlated subquery retrieves the names and salaries of Coursera employees who make more than the average salary for the Data Science department (which honestly should be very few people since Data Scientists are awesome and deserve to be paid well).The subquery is considered independent of the main query can stand alone. Its output (the average salary for the Data Science department) is then used in the main query to filter the rows of the Coursera employees table.
You've been given two tables: and . The table contains all reviews submitted, and the table contains all courses provided by Coursera.
For each course, find the average rating for the month of August 2022.
review_id | user_id | submit_date | course_id | stars |
---|---|---|---|---|
1211 | 768 | 08/06/2022 | 1001 | 4 |
2917 | 256 | 08/11/2022 | 2002 | 4 |
5012 | 963 | 08/18/2022 | 1001 | 3 |
3242 | 192 | 08/26/2022 | 3003 | 5 |
1417 | 191 | 08/05/2022 | 2002 | 2 |
course_id | course_name |
---|---|
1001 | Physics I |
2002 | Math 101 |
3003 | Intro to Programming |
month | course_name | avg_rating |
---|---|---|
8 | Physics I | 3.50 |
8 | Math 101 | 3.00 |
8 | Intro to Programming | 5.00 |
This query gets the month from the column in the table and the from the table. It then calculates the average rating for each course. The WHERE clause restricts the date range to August 2022. The results are grouped by both the and to give the average rating per course for August 2022.
In database schema design, a one-to-one relationship is when each entity is associated with only one instance of the other. For instance, a US citizen's relationship with their social-security number (SSN) is one-to-one because each citizen can only have one SSN, and each SSN belongs to one person.
A one-to-many relationship, on the other hand, is when one entity can be associated with multiple instances of the other entity. An example of this is the relationship between a person and their email addresses - one person can have multiple email addresses, but each email address only belongs to one person.
Query the Coursera database to obtain a monthly breakdown of the average rating for every course, by joining the table with the table. The results should be present for each month and each course.
review_id | user_id | submit_date | course_id | stars |
---|---|---|---|---|
1001 | 1 | 01/03/2022 | 101 | 5 |
1002 | 2 | 01/10/2022 | 101 | 4 |
1003 | 3 | 02/15/2022 | 102 | 5 |
1004 | 4 | 02/20/2022 | 102 | 4 |
1005 | 5 | 03/03/2022 | 101 | 5 |
course_id | course_name |
---|---|
101 | SQL for Beginners |
102 | Advanced SQL |
This query first extracts the month part from the submit_date column in the reviews table to get the month of every review. It then joins the table with the table based on the , ensuring every review gets associated with its respective course. The final step is to group the data by month and course name and calculate the average rating (stars) for every group. The results are then ordered by month and course name for better readability.
Since joins come up so often during SQL interviews, try this SQL join question from Spotify:
The best way to prepare for a SQL interview, besides making sure you have strong SQL fundamentals, is to practice a ton of real SQL questions that were asked in recent job interviews. Besides solving the earlier Coursera SQL interview questions, you should also solve the 200+ SQL questions from real Data Science & Analytics interviews which come from companies like Google, Facebook, Microsoft and Amazon.
Each exercise has multiple hints, step-by-step solutions and best of all, there is an interactive coding environment so you can instantly run your query and have it executed.
To prep for the Coursera SQL interview you can also be helpful to practice SQL problems from other tech companies like:
But if your SQL query skills are weak, forget about diving straight into solving questions – improve your SQL foundations with this interactive SQL tutorial.
This tutorial covers SQL topics like aggregate functions like SUM()/COUNT()/AVG() and cleaning text data – both of these come up frequently in Coursera SQL assessments.
In addition to SQL interview questions, the other types of questions tested in the Coursera Data Science Interview are:
The best way to prepare for Coursera Data Science interviews is by reading Ace the Data Science Interview. The book's got: