CTE vs. SUBQUERY

As we start to write longer SQL queries in the advanced SQL tutorial, it's important to cover Common Table Expressions (CTEs) and subqueries, which help us write more modular, reusable SQL code.

What is a CTE?

A CTE, short for Common Table Expression, is like a query within a query. With the WITH statement, you define a named temporary result set that the main query can reference like a table, making complex queries more readable and maintainable. This result set exists only for the duration of the main query.
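As a minimal sketch of the WITH syntax, the snippet below runs a CTE through Python's built-in sqlite3 module so that it is self-contained. The `concerts` table name, the `genre_revenue` CTE name, and the 1,500,000 threshold are assumptions made for illustration; the rows are taken from the example input used later in this tutorial.

```python
import sqlite3

# In-memory database with a hypothetical `concerts` table (name assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, concert_revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000),
        ("BTS", "K-Pop", 800000),
        ("Adele", "Pop", 600000),
        ("Blackpink", "K-Pop", 450000),
        ("Maroon 5", "Pop", 550000),
    ],
)

query = """
WITH genre_revenue AS (          -- the CTE: a named, temporary result set
    SELECT genre, SUM(concert_revenue) AS total_revenue
    FROM concerts
    GROUP BY genre
)
SELECT genre, total_revenue      -- the main query reads the CTE like a table
FROM genre_revenue
WHERE total_revenue > 1500000
ORDER BY genre;
"""
result = conn.execute(query).fetchall()
print(result)  # [('Pop', 1850000)]
```

The CTE is declared once at the top and disappears when the statement finishes; nothing is written to the database.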

What is a Subquery?

Subqueries, also known as inner queries, embed one query within another. By nesting a query within parentheses, you can generate a temporary result set to perform calculations and filter data within the main query. This gives you granular control over exactly which rows and values the main query works with.
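A small sketch of a subquery nested in parentheses inside the FROM clause (often called a derived table) might look like this. It is run through Python's built-in sqlite3 module to stay self-contained; the `concerts` table name, the `genre_totals` alias, and the threshold are illustrative assumptions, and the rows come from this tutorial's example input.

```python
import sqlite3

# In-memory database with a hypothetical `concerts` table (name assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, concert_revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000),
        ("BTS", "K-Pop", 800000),
        ("Adele", "Pop", 600000),
        ("Blackpink", "K-Pop", 450000),
        ("Maroon 5", "Pop", 550000),
    ],
)

query = """
SELECT genre, total_revenue
FROM (                            -- the subquery, nested in parentheses
    SELECT genre, SUM(concert_revenue) AS total_revenue
    FROM concerts
    GROUP BY genre
) AS genre_totals                 -- alias for the subquery's result set
WHERE total_revenue > 1500000
ORDER BY genre;
"""
result = conn.execute(query).fetchall()
print(result)  # [('Pop', 1850000)]
```

The inner query runs first conceptually, and the outer query filters its output; the nested result is never stored anywhere.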

When Do You Use a CTE or a Subquery?

CTEs and subqueries are both powerful tools in SQL that can be used to achieve similar goals, but they have different use cases and advantages. Here are some examples of when to use each:

Advantages of Using a CTE

  1. Break Down Complex Queries: By declaring CTEs at the beginning of the query, you enhance code readability, enabling a clearer grasp of your analysis logic. Breaking down the query into smaller, more manageable components fosters effortless code maintenance and enhances comprehension.

  2. Reusing Subquery Results: When you need to use the same subquery result multiple times within a larger query, CTEs can be used to prevent redundant calculations.

  3. Recursive Queries: When you need to perform recursive queries, such as traversing hierarchical data like organizational structures or threaded discussions, CTEs are the ideal choice.

Here's an example of finding all employees under a certain manager:
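The sketch below shows one way such a recursive CTE could look. The `employees` table, its columns, the sample rows, and the manager ID are all hypothetical, and the query is executed through Python's built-in sqlite3 module so that it runs as-is; other databases use very similar `WITH RECURSIVE` syntax, though details can differ.

```python
import sqlite3

# Hypothetical org chart: each row points to its manager via manager_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Alice", None),  # top of the hierarchy
        (2, "Bob", 1),
        (3, "Carol", 1),
        (4, "Dave", 2),
        (5, "Eve", 4),
        (6, "Frank", 3),
    ],
)

query = """
WITH RECURSIVE reports AS (
    -- Anchor member: direct reports of the chosen manager
    SELECT employee_id, name, 1 AS depth
    FROM employees
    WHERE manager_id = ?

    UNION ALL

    -- Recursive member: reports of the people found so far
    SELECT e.employee_id, e.name, r.depth + 1
    FROM employees AS e
    JOIN reports AS r ON e.manager_id = r.employee_id
)
SELECT name, depth
FROM reports
ORDER BY depth, name;
"""
# Everyone under Bob (employee_id = 2), at any level of the hierarchy.
result = conn.execute(query, (2,)).fetchall()
print(result)  # [('Dave', 1), ('Eve', 2)]
```

The recursion stops on its own once a pass finds no new reports, which is why the hierarchy must not contain cycles.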


Advantages of Using a Subquery

  1. Single-Value Comparison in WHERE Clauses: When you need to compare a value to a single result from another query, use a subquery in the WHERE clause to filter data dynamically. The filter condition then adjusts automatically whenever the subquery's result changes.

  2. Column Creation and Aggregation: Use subqueries to create new computed columns and to calculate intermediate values for aggregate functions within larger queries.

Subqueries are often used with the IN, NOT IN, and EXISTS operators to filter results based on conditions from another query.
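To make the filtering uses concrete, here is a hedged sketch with two queries: one compares each row to a single value returned by a subquery, and the other uses IN against a list returned by a subquery. The `concerts` table name and the revenue threshold are assumptions; the rows come from this tutorial's example input, and Python's built-in sqlite3 module is used only to make the example runnable.

```python
import sqlite3

# In-memory database with a hypothetical `concerts` table (name assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, concert_revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000),
        ("BTS", "K-Pop", 800000),
        ("Adele", "Pop", 600000),
        ("Blackpink", "K-Pop", 450000),
        ("Maroon 5", "Pop", 550000),
    ],
)

# Single-value comparison: the subquery returns one number (the average).
above_average = conn.execute("""
    SELECT artist_name
    FROM concerts
    WHERE concert_revenue > (SELECT AVG(concert_revenue) FROM concerts)
    ORDER BY artist_name;
""").fetchall()

# IN: the subquery returns a list of genres to match against.
in_big_genres = conn.execute("""
    SELECT artist_name
    FROM concerts
    WHERE genre IN (
        SELECT genre
        FROM concerts
        GROUP BY genre
        HAVING SUM(concert_revenue) > 1500000
    )
    ORDER BY artist_name;
""").fetchall()

print(above_average)  # [('BTS',), ('Taylor Swift',)]
print(in_big_genres)  # [('Adele',), ('Maroon 5',), ('Taylor Swift',)]
```

Neither subquery refers to the outer query, so each can be run and checked on its own before it is nested.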


  3. Correlated Subqueries: Use a correlated subquery when the inner query needs to reference values from the outer query, so that its result is specific to each row the outer query processes.

Imagine you have two lists of data side by side. With correlated subqueries, you can ask questions about each item in one list and get answers from the other list that are specific to that item. It's like having a conversation between these two lists, where the second list provides unique details for each item in the first list.
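As an illustrative sketch of that "conversation", the query below asks, for each artist, whether their revenue beats the average of their own genre. The inner query refers to `c.genre` from the outer query, which is what makes it correlated. The `concerts` table name and the aliases `c` and `g` are assumptions; the rows come from this tutorial's example input, and Python's built-in sqlite3 module is used only to make it runnable.

```python
import sqlite3

# In-memory database with a hypothetical `concerts` table (name assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, concert_revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000),
        ("BTS", "K-Pop", 800000),
        ("Adele", "Pop", 600000),
        ("Blackpink", "K-Pop", 450000),
        ("Maroon 5", "Pop", 550000),
    ],
)

query = """
SELECT c.artist_name, c.genre
FROM concerts AS c
WHERE c.concert_revenue > (
    -- Correlated: g.genre = c.genre ties the inner query to the outer row
    SELECT AVG(g.concert_revenue)
    FROM concerts AS g
    WHERE g.genre = c.genre
)
ORDER BY c.artist_name;
"""
result = conn.execute(query).fetchall()
print(result)  # [('BTS', 'K-Pop'), ('Taylor Swift', 'Pop')]
```

Unlike a standalone subquery, this inner query cannot be run by itself, because `c.genre` only has a value in the context of an outer row.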


CTE & Subquery Example: Identify the Top-Selling Artist

Let's use a CTE and a subquery to find the top-selling artist in each genre of our dataset.


Problem Statement

As the lead data analyst for a prominent music event management company, you have been entrusted with a dataset containing concert revenue and detailed information about various artists.

Your mission is to unlock valuable insights by analyzing the concert revenue data and identifying the top-selling artists within each music genre.

Write a query to rank the artists within each genre based on their revenue per member and extract the top-selling artist from each genre.

Desired Output

Display the output of the artist name, genre, concert revenue, number of members, and revenue per band member, sorted by the highest revenue per member within each genre.

Schema

| Column Name | Type | Description |
|---|---|---|
| artist_id | integer | A unique identifier for each artist or band performing in the concert. |
| artist_name | varchar(100) | The name of the artist or band performing in the concert. |
| genre | varchar(50) | The music genre associated with the concert. |
| concert_revenue | integer | The total revenue generated from the concert. |
| year_of_formation | integer | The year that the artist or band was formed. |
| country | varchar(50) | The country of origin or residence of the artist or band. |
| number_of_members | integer | The number of members in the band. |
| album_released | integer | The total number of albums released by the artist or band. |
| label | varchar(100) | The record label or music company associated with the artist or band. |

Example Input

| artist_id | artist_name | genre | concert_revenue | year_of_formation | country | number_of_members | album_released | label |
|---|---|---|---|---|---|---|---|---|
| 103 | Taylor Swift | Pop | 700000 | 2004 | United States | 1 | 9 | Republic Records |
| 104 | BTS | K-Pop | 800000 | 2013 | South Korea | 7 | 7 | Big Hit Music |
| 105 | Adele | Pop | 600000 | 2006 | United Kingdom | 1 | 3 | Columbia Records |
| 109 | Blackpink | K-Pop | 450000 | 2016 | South Korea | 4 | 5 | YG Entertainment |
| 110 | Maroon 5 | Pop | 550000 | 1994 | United States | 5 | 7 | Interscope Records |

Step 1: Calculating Revenue per Member and Ranking Concerts within Each Genre

In both solutions, we calculate the revenue generated per band member for each concert and rank the artists within each genre based on their revenue per member.

Solution #1: Using CTE
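One possible shape for the CTE in this step is sketched below. The table name `concerts`, the CTE name `ranked_concerts_cte`, and the column alias `ranked_concerts` are assumptions, since the schema above does not name them. Only the four columns the query needs are loaded, using the example input rows, and the SQL is run through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical `concerts` table holding the example input (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, "
    "concert_revenue INTEGER, number_of_members INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000, 1),
        ("BTS", "K-Pop", 800000, 7),
        ("Adele", "Pop", 600000, 1),
        ("Blackpink", "K-Pop", 450000, 4),
        ("Maroon 5", "Pop", 550000, 5),
    ],
)

query = """
WITH ranked_concerts_cte AS (
    SELECT
        artist_name,
        genre,
        concert_revenue,
        number_of_members,
        -- Integer division: both columns are integers
        concert_revenue / number_of_members AS revenue_per_member,
        RANK() OVER (
            PARTITION BY genre
            ORDER BY concert_revenue / number_of_members DESC
        ) AS ranked_concerts
    FROM concerts
)
-- Preview the CTE to check the ranking before filtering it in Step 2
SELECT artist_name, genre, revenue_per_member, ranked_concerts
FROM ranked_concerts_cte
ORDER BY genre, ranked_concerts;
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Selecting everything from the CTE like this is a convenient way to test the ranking logic independently before building on it.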


Solution #2: Using Subquery
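For the subquery route, this step can be sketched as a plain SELECT that will later be nested inside an outer query. As before, the table name `concerts` and the alias `ranked_concerts` are assumptions, the rows are the example input restricted to the columns needed, and Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions) is used only to make the sketch runnable.

```python
import sqlite3

# Hypothetical `concerts` table holding the example input (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, "
    "concert_revenue INTEGER, number_of_members INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000, 1),
        ("BTS", "K-Pop", 800000, 7),
        ("Adele", "Pop", 600000, 1),
        ("Blackpink", "K-Pop", 450000, 4),
        ("Maroon 5", "Pop", 550000, 5),
    ],
)

# This SELECT is the future inner query; here it runs on its own.
inner_query = """
SELECT
    artist_name,
    genre,
    concert_revenue,
    number_of_members,
    concert_revenue / number_of_members AS revenue_per_member,
    RANK() OVER (
        PARTITION BY genre
        ORDER BY concert_revenue / number_of_members DESC
    ) AS ranked_concerts
FROM concerts
"""
rows = conn.execute(inner_query).fetchall()

# Keep only the rank-1 row of each genre to sanity-check the ranking.
result = sorted((r[0], r[1], r[4]) for r in rows if r[5] == 1)
print(result)  # [('BTS', 'K-Pop', 114285), ('Taylor Swift', 'Pop', 700000)]
```

The rank filter is done in Python here only as a check; a WHERE clause cannot reference the window-function alias in the same SELECT, which is why the query gets wrapped in an outer query in the next step.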


Step 2: Selecting the Top-Selling Artists within Each Genre

In both solutions, we use either CTE or subquery results to extract the top-selling artists in each music genre.

Solution #1: Using CTE
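Putting the pieces together, a complete CTE-based query could look like the sketch below. The names `concerts`, `ranked_concerts_cte`, and `ranked_concerts` are assumptions, and the query runs against the five example input rows only, so its output differs from the full-dataset results discussed below. Python's built-in sqlite3 module (SQLite 3.25 or newer) is used to keep it self-contained.

```python
import sqlite3

# Hypothetical `concerts` table holding the example input (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, "
    "concert_revenue INTEGER, number_of_members INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000, 1),
        ("BTS", "K-Pop", 800000, 7),
        ("Adele", "Pop", 600000, 1),
        ("Blackpink", "K-Pop", 450000, 4),
        ("Maroon 5", "Pop", 550000, 5),
    ],
)

query = """
WITH ranked_concerts_cte AS (
    SELECT
        artist_name,
        genre,
        concert_revenue,
        number_of_members,
        concert_revenue / number_of_members AS revenue_per_member,
        RANK() OVER (
            PARTITION BY genre
            ORDER BY concert_revenue / number_of_members DESC
        ) AS ranked_concerts
    FROM concerts
)
SELECT
    artist_name,
    genre,
    concert_revenue,
    number_of_members,
    revenue_per_member
FROM ranked_concerts_cte
WHERE ranked_concerts = 1          -- keep the top artist of each genre
ORDER BY revenue_per_member DESC;
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('Taylor Swift', 'Pop', 700000, 1, 700000)
# ('BTS', 'K-Pop', 800000, 7, 114285)
```

Because RANK() assigns the same rank to ties, a genre with two artists at the same revenue per member would return both of them.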


Solution #2: Using Subquery
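The subquery version can be sketched by nesting the ranking query in the FROM clause and filtering it from the outside. The names `concerts`, `ranked` (the subquery alias), and `ranked_concerts` are assumptions, and the query runs against the five example input rows only, so its output differs from the full-dataset results discussed below. Python's built-in sqlite3 module (SQLite 3.25 or newer) is used to keep it self-contained.

```python
import sqlite3

# Hypothetical `concerts` table holding the example input (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concerts (artist_name TEXT, genre TEXT, "
    "concert_revenue INTEGER, number_of_members INTEGER)"
)
conn.executemany(
    "INSERT INTO concerts VALUES (?, ?, ?, ?)",
    [
        ("Taylor Swift", "Pop", 700000, 1),
        ("BTS", "K-Pop", 800000, 7),
        ("Adele", "Pop", 600000, 1),
        ("Blackpink", "K-Pop", 450000, 4),
        ("Maroon 5", "Pop", 550000, 5),
    ],
)

query = """
SELECT
    artist_name,
    genre,
    concert_revenue,
    number_of_members,
    revenue_per_member
FROM (
    -- Inner query: revenue per member and rank within each genre
    SELECT
        artist_name,
        genre,
        concert_revenue,
        number_of_members,
        concert_revenue / number_of_members AS revenue_per_member,
        RANK() OVER (
            PARTITION BY genre
            ORDER BY concert_revenue / number_of_members DESC
        ) AS ranked_concerts
    FROM concerts
) AS ranked
WHERE ranked_concerts = 1          -- keep the top artist of each genre
ORDER BY revenue_per_member DESC;
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('Taylor Swift', 'Pop', 700000, 1, 700000)
# ('BTS', 'K-Pop', 800000, 7, 114285)
```

The logic is identical to the CTE form; the difference is purely where the ranking query is written, which matters mostly for readability once queries grow.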


The results reveal the top-selling artist from each genre, providing valuable insights into the artists' success in their respective music genres.

Interpreting the results:

  1. Beyonce earned a staggering $750,000 in revenue with 1 member, resulting in a revenue per member of $750,000. She holds the top spot in the R&B genre.
  2. Ariana Grande secured the top position in the Pop genre, generating $720,000 in concert revenue with 1 member, resulting in a revenue per member of $720,000.
  3. Eminem takes the lead in the Hip-Hop genre, amassing $620,000 in concert revenue with 1 member, leading to a revenue per member of $620,000.

| artist_name | concert_revenue | genre | number_of_members | revenue_per_member |
|---|---|---|---|---|
| Beyonce | 750000 | R&B | 1 | 750000 |
| Ariana Grande | 720000 | Pop | 1 | 720000 |
| Eminem | 620000 | Hip-Hop | 1 | 620000 |
| The Weeknd | 620000 | R&B/Pop | 1 | 620000 |
| Panic! At The Disco | 530000 | Rock | 1 | 530000 |
| SHINee | 470000 | K-Pop | 4 | 117500 |

So, in summary, CTEs and subqueries help you break down complicated SQL queries into smaller parts. This makes the code more readable and understandable, especially when you are dealing with multi-step data transformations or analyses. This modularity of SQL code is helpful with debugging because you can test each part of your query independently, making it easier to identify and correct issues.

Practice CTE and Subquery Interview Questions

Let's put your CTE & Subquery skills to the test with real SQL interview questions that use these concepts.

In this Microsoft SQL Interview question, try to use a CTE along with a LEFT JOIN and the DISTINCT keyword to find Azure customers that purchased at least one product from each product category.


In this Zomato SQL Interview question, try to use a subquery with a CASE statement to manage edge cases where the food item’s order ID is equal to the total count of the items.


What's Next: Window Functions

The next advanced SQL topic we'll cover is window functions, which are frequently used with CTEs.

Next Tutorial: SQL Aggregate Window Functions



© 2024 DataLemur, Inc
