logo

SQL JOINS Tutorial With Practice Exercises

So far in the SQL tutorial we've only been analyzing data from one table at a time. However, in the real world, companies have databases containing thousands of tables. To combine multiple tables, and analyze their data at the same time, we can write a JOIN SQL query – the focus of this tutorial.

Spotify SQL JOIN Case Study

For example, let's pretend we worked at Spotify, were we'd have a table for albums, a table for artists, a table to represent users, and the list 🎡 goes on and on and on🎡 (sung to the tune of Don't Stop Believin').

Spotify Database Tables

As a Data Analyst or Data Scientist, you'll almost always be pulling data in from multiple of the these tables, which is why the SQL JOIN is crucial.

For example at Spotify you might analyze Drake 🐐 by writing queries which combine two tables like:

  • the artists and concerts table, to analyze upcoming Drake concerts
  • the user_likes and playlists table, to count the # of likes on the This Is Drake playlist
  • the artists and songs table, to rank Drake's top songs by play count

Spotify Real-World Database JOINS Example

Now that you understand the real-world use case of a , let's write a specific SQL join query using Spotify data.

SQL JOIN Example

Suppose you worked at Spotify, and wanted to join information in the artists table with information in the songs table. Here's the schema for the two tables:

Example Input:

artist_idartist_namelabel_owner
101Ed SheeranWarner Music Group
120DrakeWarner Music Group
125Bad BunnyRimas Entertainment

Example Input:

song_idartist_idname
55511101Perfect
45202101Shape of You
22222120One Dance
19960120Hotline Bling

Notice how the two tables have the column in common:

  • in the table, you have information about each , like their artist name and what music label they work with
  • in the table, you have information about each song, which includes which made that particular song

Now, let's actually build our SQL JOIN query to combine the artists and songs table.

How To Write A SQL JOIN

There are three key parts to get right when writing a in SQL. First, you need to specify what columns to show. For our artists x songs example, we'll just use for that.

Secondly, you need to specify what are the names of the two tables we are joining. Since we are joining the artists with songs table, our query so far is as follows:


The third and final in a SQL JOIN query is the clause, which explains to the RDBMS how the two tables relate to each other.

We need to explicitly write in the SQL query that the column in the artists table matches up against the column in the songs table with this clause:


Putting these 3 parts together, we get the following SQL JOIN query:


The above query combines info from the artists AND songs table into the result below:

artist_idartist_namelabel_ownersong_idartist_idname
101Ed SheeranWarner Music Group55511101Perfect
101Ed SheeranWarner Music Group45202101Shape of You
120DrakeWarner Music Group22222120One Dance
120DrakeWarner Music Group19960120Hotline Bling
125Bad BunnyRimas Entertainment12636125Mia

Now, let's practice writing a on a different dataset.

Easy SQL JOIN Practice Exercise

Suppose you work as a Data Scientist at the stock-trading app Robinhood. Assume you're given access to a table called which contains information about trades placed on the platform, and a table called which has information about a specific user.

Here's what the data looks like in both tables:

Example Input:

order_iduser_idpricequantitystatustimestamp
1001011119.8010Cancelled08/17/2022 12:00:00
10010211110.0010Completed08/17/2022 12:00:00
1002591485.1035Completed08/25/2022 12:00:00
1002641484.8040Completed08/26/2022 12:00:00
10030530010.0015Completed09/05/2022 12:00:00
1004001789.9015Completed09/09/2022 12:00:00
10056526525.605Completed12/19/2022 12:00:00

Example Input:

user_idcityemailsignup_date
111San Franciscorrok10@gmail.com08/03/2021 12:00:00
148Bostonsailor9820@gmail.com08/20/2021 12:00:00
178San Franciscoharrypotterfan182@gmail.com01/05/2022 12:00:00
265Denvershadower_@hotmail.com02/26/2022 12:00:00
300San Franciscohoustoncowboy1122@hotmail.com06/30/2022 12:00:00

Write a SQL query to join the trades and users table.

The output should look something like this:

order_iduser_idquantitystatusdatepriceuser_idcityemailsignup_date
10010211110Completed08/17/2022 12:00:0010.00111San Franciscorrok10@gmail.com08/03/2021 12:00:00
10010111110Cancelled08/17/2022 12:00:009.80111San Franciscorrok10@gmail.com08/03/2021 12:00:00
10090014850Completed07/14/2022 12:00:009.78148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10025914835Completed08/25/2022 12:00:005.10148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10026414840Completed08/26/2022 12:00:004.80148Bostonsailor9820@gmail.com08/20/2021 12:00:00
10077717860Completed07/25/2022 17:47:003.56178San Franciscoharrypotterfan182@gmail.com01/05/2022 12:00:00
..............................

Be sure to complete this SQL JOIN practice exercise, as it's meant to warm you up before you solve a real SQL interview question using the same data in the next section.

Harder SQL Join Interview Question

A Data Scientist interviewing at Robinhood was asked this SQL interview question where you need to write a SQL JOIN query to find the top three cities that have the highest number of completed trade orders.

Your output should look like this:

citytotal_orders
San Francisco3
Boston2
Denver1

Besides needing a , you'll also need and commands to solve this interview question.


4 Types of SQL Joins

So far in this tutorial, we've simply use the keyword, which is short for an . There's actually 4 different types of JOINS in SQL, which we'll now cover in detail.

Type of JoinsDescription
INNER JOINReturns only the rows with matching values from both tables.
LEFT JOINReturns all the rows from the left table and the matching rows from the right table.
RIGHT JOINReturns all the rows from the right table and the matching rows from the left table.
FULL OUTER JOINReturns all rows when there is a match in either the left or the right table. If there is no match, values are returned for columns from the table without a match.

We'll be using the Goodreads Books dataset to demonstrate these different types of JOINs.

INNER JOIN

Picture this: you want to find out how many books priced at $20 and above have been ordered, and who the buyers are.


The returns only the rows where there is a matching book ID in both the and tables, focusing on books with prices $20 and above.

LEFT JOIN

Next, let's explore how the works using our and tables.

Suppose we want to retrieve all the orders along with their corresponding deliveries information. Here's the query:


With , all rows from the left table () are fetched, along with matching rows from the right table (). If there is no matching data in the right table, the result will still include the left table's data, with values in the columns from the right table.

In our example, we're displaying a subset of 10 rows out of 20:

But what if we flip the tables? By swapping the positions of the tables in the , the same result can be achieved:


Notice how order IDs 2005, 2008, and 2010 are missing from the results? Since these order IDs are absent in the table, there are no corresponding matches in the table. This results in the values you see.

RIGHT JOIN

A is the opposite of a . It returns all rows from the right table and pairs them with matching rows from the left table. If there is no match, the columns from the left table have values.

Run the following query:


You can also swap the positions of the tables, and you'll get the exact same results as a .

πŸ’‘A little tip: In the real world, is rarely used. Most people naturally think from left to right, making easier to understand and implement. So, if you find yourself in a situation where you might consider a , simply switch the positions of the tables and use a instead!

FULL OUTER JOIN

Now, let's explore the which allows you to bring unmatched rows into the results.

Ever wondered what happens when both tables have something to offer, but they're not a perfect match?

A returns all rows when there is a match in either the left or the right table. If there is no match, values are returned for columns from the table without a match.


Running this query is like uniting two tables: and . Each row represents an attempt at a match. If there's no match, you'll see values in the columns from the table without a match.

The first 20 rows are expected (we're just showing you the first 5 rows):

Scroll down, and you'll encounter rows with order IDs (marked in the red box). The ensures that even if there isn't an to match these rows, they're still include in the results:

πŸ’‘ A little tip: Be careful with as it can introduce duplicate rows to your results without you realizing!

Handling Nulls + JOIN SQL Interview Question

Let's tackle a real Facebook/Meta SQL Interview which requires a JOIN along with some null handling.


What's Next: Handling Dates & Timestamps

Now that we've learned to join tables, there's one last thing we need to cover in order to finish the intermediate SQL tutorial: handling dates & timestamps.


Next Lesson

DATE FUNCTIONS πŸ“…