As we start to write longer SQL queries in the advanced SQL tutorial, it's important to cover Common Table Expressions (CTE) and subqueries which help us write more modular, reusable SQL code.
A CTE, short for Common Table Expression, is like a query within a query. With the statement, you can create temporary tables to store results, making complex queries more readable and maintainable. These temporary tables exist only for the duration of the main query, streamlining your analysis process.
Subqueries, also known as inner queries, are powerful tools to embed one query within another. By nesting queries within parentheses, you can generate temporary tables to perform calculations and filter data within the main query. Subqueries enable granular control over your data, enhancing the precision of your analysis.
CTEs and subqueries are both powerful tools in SQL that can be used to achieve similar goals, but they have different use cases and advantages. Here are some examples of when to use each:
Reusing Subquery Results: When you need to use the same subquery result multiple times within a larger query, CTEs can be used to prevent redundant calculations.
Recursive Queries: When you need to perform recursive queries, such as traversing hierarchical data like organizational structures or threaded discussions, CTEs are the ideal choice.
Here's an example of finding all employees under a certain manager:
Subqueries are often used with , , and operators to filter results based on conditions from another query.
Imagine you have two lists of data side by side. With correlated subqueries, you can ask questions about each item in one list and get answers from the other list that are specific to that item. It's like having a conversation between these two lists, where the second list provides unique details for each item in the first list.
Let's use CTE and subquery to find the top-selling artist within our dataset. You can follow through with the question card below.
As the lead data analyst for a prominent music event management company, you have been entrusted with a dataset containing concert revenue and detailed information about various artists.
Your mission is to unlock valuable insights by analyzing the concert revenue data and identifying the top-selling artists within each music genre.
Write a query to rank the artists within each genre based on their revenue per member and extract the top-selling artist from each genre.
Display the output of the artist name, genre, concert revenue, number of members, and revenue per band member, sorted by the highest revenue per member within each genre.
Column Name | Type | Description |
---|---|---|
artist_id | integer | A unique identifier for each artist or band performing in the concert. |
artist_name | varchar(100) | The name of the artist or band performing in the concert. |
genre | varchar(50) | The music genre associated with the concert. |
concert_revenue | integer | The total revenue generated from the concert. |
year_of_formation | integer | The year that the artist or band was formed. |
country | varchar(50) | The country of origin or residence of the artist or band. |
number_of_members | integer | The number of members in the band. |
album_released | integer | The total number of albums released by the artist or band. |
label | varchar(100) | The record label or music company associated with the artist or band. |
artist_id | artist_name | genre | concert_revenue | year_of_formation | country | number_of_members | album_released | label |
---|---|---|---|---|---|---|---|---|
103 | Taylor Swift | Pop | 700000 | 2004 | United States | 1 | 9 | Republic Records |
104 | BTS | K-Pop | 800000 | 2013 | South Korea | 7 | 7 | Big Hit Music |
105 | Adele | Pop | 600000 | 2006 | United Kingdom | 1 | 3 | Columbia Records |
109 | Blackpink | K-Pop | 450000 | 2016 | South Korea | 4 | 5 | YG Entertainment |
110 | Maroon 5 | Pop | 550000 | 1994 | United States | 5 | 7 | Interscope Records |
In both solutions, we calculate the revenue generated per band member for each concert and rank the artists within each genre based on their revenue per member.
Solution #1: Using CTE
Solution #2: Using Subquery
In both solutions, we use either CTE or subquery results to extract the top-selling artists in each music genre.
Solution #1: Using CTE
Solution #2: Using Subquery
The results reveal the top-selling artist from each genre, providing valuable insights into the artists' success in their respective music genres.
Interpreting the results:
artist_name | concert_revenue | genre | number_of_members | revenue_per_member |
---|---|---|---|---|
Beyonce | 750000 | R&B | 1 | 750000 |
Ariana Grande | 720000 | Pop | 1 | 720000 |
Eminem | 620000 | Hip-Hop | 1 | 620000 |
The Weeknd | 620000 | R&B/Pop | 1 | 620000 |
Panic! At The Disco | 530000 | Rock | 1 | 530000 |
SHINee | 470000 | K-Pop | 4 | 117500 |
So, in summary, CTEs and subqueries help you break down complicated SQL queries into smaller parts. This makes the code more readable and understandable, especially when you are dealing with multi-step data transformations or analyses. This modularity of SQL code is helpful with debugging because you can test each part of your query independently, making it easier to identify and correct issues.
Let's put your CTE & Subquery skills to the test a real SQL interview questions which use these concepts.
In this Microsoft SQL Interview question, try to use a CTE along with a LEFT JOIN and the DISTINCT keyword to find Azure customers that purchased at least one product from each product category.
In this Zomato SQL Interview question, try to use a subquery with a CASE statement to manage edge cases where the food itemβs order ID is equal to the total count of the items.
The next advanced SQL topic we'll cover are window functions, which frequently are used with CTEs.
Next Lesson
WINDOW FUNCTIONπͺ