SQL JOINS Tutorial With Practice Exercises

So far in the SQL tutorial we've only been analyzing data from one table at a time. However, in the real world, companies have databases containing thousands of tables. To combine multiple tables, and analyze their data simultaneously, we can write a SQL query – the focus of this tutorial.

Spotify SQL JOIN Case Study

For example, let's pretend we work at Spotify where we have a table for albums, a table for artists, a table to represent users, and the list 🎡 goes on and on and on🎡 (sung to the tune of Don't Stop Believin').

Spotify Database Tables

As a Data Analyst or Data Scientist, you'll almost always be pulling data from multiple tables, which is why the SQL JOIN is crucial.

For example, at Spotify you might analyze Drake 🐐 by writing queries that combine two tables like:

  • the and table to analyze upcoming Drake concerts.
  • the and table to count the # of likes on the This Is Drake playlist.
  • the and table to rank Drake's top songs by play count.

Spotify Real-World Database JOINS Example

Now that you understand the real-world use case of a , let's write a specific SQL join query using Spotify data.

SQL JOIN Example

Suppose you work at Spotify and want to join information in the table with information in the table. Here's the schema for the two tables:

Example Input:

artist_idartist_namelabel_owner
101Ed SheeranWarner Music Group
120DrakeWarner Music Group
125Bad BunnyRimas Entertainment

Example Input:

song_idartist_idname
55511101Perfect
45202101Shape of You
22222120One Dance
19960120Hotline Bling

Notice how the two tables have the column in common:

  • In the table, you have information about each , like their artist name and what music label they work with.
  • in the table, you have information about each song, which includes which made that particular song.

Now, let's actually build our SQL JOIN query to combine the and table.

How To Write A SQL JOIN

There are three key parts to get right when writing a in SQL. First, you need to specify what columns to show. For our x example, we'll just use for that.

Secondly, you need to specify the names of the two tables we are joining. Since we are joining the with table, our query so far is as follows:


The third and final in a SQL JOIN query is the clause, which explains to the RDBMS (Relational Database Management System) how the two tables relate to each other.

We need to explicitly write in the SQL query that the column in the table matches up against the column in the table with this clause:


Putting these three parts together, we get the following SQL JOIN query:


The above query combines info from the and table into the result below:

artist_idartist_namelabel_ownersong_idartist_idname
101Ed SheeranWarner Music Group55511101Perfect
101Ed SheeranWarner Music Group45202101Shape of You
120DrakeWarner Music Group22222120One Dance
120DrakeWarner Music Group19960120Hotline Bling
125Bad BunnyRimas Entertainment12636125Mia

Now, let's practice writing a on a different dataset.

Easy SQL JOIN Practice Exercise

Suppose you work as a Data Scientist at the stock-trading app Robinhood. Assume you're given access to a table called which contains information about trades placed on the platform, and a table called which has information about a specific user.

Here's what the data looks like in both tables:

Example Input:

order_iduser_idpricequantitystatustimestamp
1001011119.8010Cancelled08/17/2022 12:00:00
10010211110.0010Completed08/17/2022 12:00:00
1002591485.1035Completed08/25/2022 12:00:00
1002641484.8040Completed08/26/2022 12:00:00
10030530010.0015Completed09/05/2022 12:00:00
1004001789.9015Completed09/09/2022 12:00:00
10056526525.605Completed12/19/2022 12:00:00

Example Input:

user_idcityemailsignup_date
111San Franciscorrok10@gmail.com08/03/2021 12:00:00
148Bostonsailor9820@gmail.com08/20/2021 12:00:00
178San Franciscoharrypotterfan182@gmail.com01/05/2022 12:00:00
265Denvershadower_@hotmail.com02/26/2022 12:00:00
300San Franciscohoustoncowboy1122@hotmail.com06/30/2022 12:00:00

Write a SQL query to join the and table.

The output should look something like this:

order_iduser_idquantitystatusdatepriceuser_idcityemailsignup_date
10010211110Completed08/17/2022 12:00:0010.00111San Franciscorrok10@gmail.com08/03/2021 12:00:00
10010111110Cancelled08/17/2022 12:00:009.80111San Franciscorrok10@gmail.com08/03/2021 12:00:00
10090014850Completed07/14/2022 12:00:009.78148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10025914835Completed08/25/2022 12:00:005.10148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10026414840Completed08/26/2022 12:00:004.80148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10077717860Completed07/25/2022 17:47:003.56178San Franciscoharrypotterfan182@gmail.com01/05/2022 12:00:00
..............................

Be sure to complete this SQL JOIN practice exercise, as it's meant to warm you up before you solve a real SQL interview question using the same data in the next section.

Harder SQL Join Interview Question

A Data Scientist interviewing at Robinhood was asked this SQL interview question where you need to write a SQL JOIN query to find the top three cities that have the highest number of completed trade orders.

Your output should look like this:

citytotal_orders
San Francisco3
Boston2
Denver1

Besides needing a , you'll also need and commands to solve this interview question.


4 Types of SQL Joins

So far in this tutorial, we've simply use the keyword, which is short for an . There are actually 4 different types of JOINS in SQL which we'll now cover in detail.

Type of JoinsDescription
INNER JOINReturns only the rows with matching values from both tables.
LEFT JOINReturns all the rows from the left table and the matching rows from the right table.
RIGHT JOINReturns all the rows from the right table and the matching rows from the left table.
FULL OUTER JOINReturns all rows when there is a match in either the left or the right table. If there is no match, values are returned for columns from the table without a match.

We'll be using the Goodreads Books dataset to demonstrate these different types of JOINs.

SQL INNER JOIN

An returns only the rows with matching values from both tables.

Picture this: you want to find out how many books priced at $20 and above have been ordered, and who the buyers are.


The returns only the rows where there is a matching book ID in both the and tables, focusing on books priced $20 and above.

SQL LEFT JOIN

Next, let's explore how the works using our and tables.

Suppose we want to retrieve all the orders along with their corresponding deliveries information. Here's the query:


With , all rows from the left table () are fetched, along with matching rows from the right table (). If there is no matching data in the right table, the result will still include the left table's data with values in the columns from the right table.

In our example, we're displaying a subset of 10 rows out of 20:

But what if we flip the tables? By swapping the positions of the tables in the , the same result can be achieved:


Notice how order IDs 2005, 2008, and 2010 are missing from the results? Since these order IDs are absent in the table, there are no corresponding matches in the table. This results in the values you see.

SQL RIGHT JOIN

A is the opposite of a . It returns all rows from the right table and pairs them with matching rows from the left table. If there is no match, the columns from the left table have values.

Run the following query:


You can also swap the positions of the tables, and you'll get the exact same results as a .

πŸ’‘A little tip: In the real world, is rarely used. Most people naturally think from left to right, making easier to understand and implement. So, if you find yourself in a situation where you might consider a , simply switch the positions of the tables and use a instead!

SQL FULL OUTER JOIN

Now, let's explore the which allows you to bring unmatched rows into the results.

Ever wondered what happens when both tables have something to offer, but they're not a perfect match?

A returns all rows when there is a match in either the left or the right table. If there is no match, values are returned for columns from the table without a match.


Running this query is like uniting two tables: and . Each row represents an attempt at a match. If there's no match, you'll see values in the columns from the table without a match.

The first 20 rows are expected (we're just showing you the first 5 rows):

Scroll down, and you'll encounter rows with order IDs (marked in the red box). The ensures that even if there isn't an to match these rows, they're still include in the results:

πŸ’‘ A little tip: Be careful with as it can introduce duplicate rows to your results without you realizing!

Conditional SQL JOINS

Conditional joins involve using single or multiple conditions using logical operators like and , or incorporating complex logical expressions such as ranges, pattern matching, or subqueries.

These conditions allow you to filter and retrieve data based on specific criteria during the join operation using the clause to specify conditions.

Example 1: Joining and table with a condition on quantity

Retrieve the list of book titles and their quantities ordered where the quantity ordered is more than 2.


What happens is:

  • This query applies the condition as part of the join condition itself. The join only includes rows where both and are true.
  • Since the filter condition is part of the join, no additional filtering is required after the join is performed.

Example 2: Joining and tables with a condition on

Retrieve order IDs and their corresponding delivery status where the delivery status is either 'Delivered' or 'Shipped'.


Example 3: Joining and with multiple conditions

Retrieve book titles, their authors, and the order dates for books released after 2015 and ordered in quantities greater than 1.


Example 4: Joining all three tables with a condition on and

Retrieve the book titles, their average ratings, order dates, and delivery statuses for books with a rating higher than 4.0 that have been delivered.


These examples illustrate how to use conditional joins to filter and retrieve specific data based on various conditions across multiple tables.

Handling Nulls + JOIN SQL Interview Question

Let's tackle a real Facebook/Meta SQL Interview which requires a JOIN along with some null handling.


SQL JOIN with CASE WHEN

Remember the last lesson on how we used CASE WHEN to segment Marvel Actors based on how active they were on social media:

CASE WHEN MARVEL ACTORS EXAMPLE

Let's combine the statement, along with a to solve this real Meta SQL interview question about segmenting Facebook advertisers based on their payment status:

Facebook SQL Interview Question

What's Next: Handling Dates & Timestamps

Now that we've learned to join tables, there's one last thing we need to cover in order to finish the intermediate SQL tutorial: handling dates & timestamps.


Next Lesson

DATE FUNCTIONS πŸ“…

Β© 2024 DataLemur, Inc

Career Resources

Free 9-Day Data Interview Crash CourseFree SQL Tutorial for Data AnalyticsUltimate SQL Interview GuideAce the Data Job Hunt Video CourseAce the Data Science InterviewBest Books for Data Analysts