Hold up, another join?! 😵💫
Yes, you guessed it right! After learning about the different joins in this tutorial, now it's time to learn self-joins!
Self-joins might sound complex, but they're not. It's just like a table joining hands with itself. Think of it as a table of Amazon transactions joining with another table of Amazon transactions. Don't worry, it's simpler than it seems! 😉
In this tutorial, we'll explore self-joins using the dataset in this Goodreads Books SQL question.
Imagine you're part of the Goodreads Books team and you're designing a book recommendation system that provides users with personalized book suggestions tailored to their preferences. In other words, if someone loves romance, you want to give them more romantic options!
Here's the query that generates meaningful book recommendations:
Here's the breakdown:
current_genre | current_book | suggested_book |
---|---|---|
Non-Fiction | Ace the Data Science Interview | Data Engineering Cookbook |
Non-Fiction | Ace the Data Science Interview | Building Data-Intensive Apps with Flask and SQLAlchemy |
Non-Fiction | Ace the Data Science Interview | Blink: The Power of Thinking Without Thinking |
Non-Fiction | Ace the Data Science Interview | The Power of Habit |
Non-Fiction | Ace the Data Science Interview | Storytelling with Data: A Data Visualization Guide for Business Professionals |
Cool, right? 🤓 You're suggesting awesome books to readers based on their genre preferences!
What if you want to suggest a second book? Well, just like that, we'll add another twist:
‼️ Hold on, though! Be cautious when dealing with large tables. Joining a table to itself can create a lot of matches, which might slow things down. In this case, we're talking about 2,886 rows, so we've got a trick up our sleeve—a clause:
Here's what's happening:
current_genre | current_book | suggested_book_1 | suggested_book_2 |
---|---|---|---|
Non-Fiction | Ace the Data Science Interview | Python for Data Analysis | Designing Data-Intensive Applications |
Non-Fiction | Ace the Data Science Interview | Python for Data Analysis | Data Engineering with Python |
Non-Fiction | Ace the Data Science Interview | Python for Data Analysis | Fundamentals of Data Engineering |
Non-Fiction | Ace the Data Science Interview | Python for Data Analysis | Educated: A Memoir |
Non-Fiction | Ace the Data Science Interview | Python for Data Analysis | Data Science for Business |
And there you have it! Self-joins in action—making book recommendations exciting and personal.
Now you're ready to test your knowledge of Self-Joins with this classic FAANG SQL interview question: Well-Paid Employees.
Suppose you had a table of salary data, where each row has the ID of an employee, their name, salary, department, and their manager's ID. Write a SQL query to identify all employees who earn more than their direct managers.
employee_id | name | salary | department_id | manager_id |
---|---|---|---|---|
1 | Emma Thompson | 3800 | 1 | 6 |
2 | Daniel Rodriguez | 2230 | 1 | 7 |
3 | Olivia Smith | 7000 | 1 | 8 |
4 | Noah Johnson | 6800 | 2 | 9 |
5 | Sophia Martinez | 1750 | 1 | 11 |
6 | Liam Brown | 13000 | 3 | NULL |
7 | Ava Garcia | 12500 | 3 | NULL |
8 | William Davis | 6800 | 2 | NULL |
employee_id | employee_name |
---|---|
3 | Olivia Smith |
The output shows that Olivia Smith earns $7,000, surpassing her manager, William David who earns $6,800. Try solving this problem:
Often joining two tables is synonymous with combining two tables. But did you know there is another way to combine tables WITHOUT USING A JOIN?
In the next tutorial, we'll think about joining tables VERTICALLY using , instead of HORIZONTALLY (which is how the joins we've covered so far work).
Spatially challenged, and confused by what we mean by vertical vs. horizontal? That's a good sign to dive into the next tutorial!
Next Lesson
SQL UNION 🙏