Back to questions

Top 5 Artists Spotify SQL Interview Question

Top 5 Artists

Spotify SQL Interview Question

Assume there are three Spotify tables: , , and , which contain information about the artists, songs, and music charts, respectively.

Write a query to find the top 5 artists whose songs appear most frequently in the Top 10 of the table. Display the top 5 artist names in ascending order, along with their song appearance ranking.

If two or more artists have the same number of song appearances, they should be assigned the same ranking, and the rank numbers should be continuous (i.e. 1, 2, 2, 3, 4, 5). If you've never seen a rank order like this before, do the rank window function tutorial.

Table:

Column Name	Type
artist_id	integer
artist_name	varchar
label_owner	varchar

Example Input:

artist_id	artist_name	label_owner
101	Ed Sheeran	Warner Music Group
120	Drake	Warner Music Group
125	Bad Bunny	Rimas Entertainment

Table:

Column Name	Type
song_id	integer
artist_id	integer
name	varchar

Example Input:

song_id	artist_id	name
55511	101	Perfect
45202	101	Shape of You
22222	120	One Dance
19960	120	Hotline Bling

Table:

Column Name	Type
day	integer (1-52)
song_id	integer
rank	integer (1-1,000,000)

Example Input:

day	song_id	rank
1	45202	5
3	45202	2
1	19960	3
9	19960	15

Example Output:

artist_name	artist_rank
Ed Sheeran	1
Drake	2

Explanation:

Ed Sheeran's song appeared twice in the Top 10 list of global song rank while Drake's song is only listed once. Therefore, Ed is ranked #1 and Drake is ranked #2.

The dataset you are querying against may have different input & output - this is just an example!

Solution

Step 1: Retrieve the song counts for the 10 artists

To start, we combine the table with the and tables using appropriate joins.

Then, we calculate the number of times each artist's song appears in the top 10 of the chart, grouping the results by the artist. This query provides us with a list of artists and their corresponding song counts.

Step 2: Rank the artists based on song appearances in the top 10 chart

Now that we have the artists and their song counts, we need to assign ranks to them based on the number of times their songs appear in the top 10 chart. We can achieve this using the window function.

The resulting query provides a list of artists, their song counts, and their respective ranks in descending order of song appearances.

Do you know why we're using instead of ?

While both functions assign the same rank to duplicates, skips the next number in the ranking, which is not suitable for this question. Hence, we use , which ensures that all ranks are assigned without skipping any numbers.

Let's consider the following example:

artist_name	song_count	dense_rank_num	rank_num
Bad Bunny	6	1	1
Ed Sheeran	5	2	2
Lady Gaga	4	3	3
Adele	4	3	3
Katy Perry	3	4	5

In this example, Lady Gaga and Adele both have 4 songs, so they should have the same rank. With , both artists are correctly ranked as 3rd, and the next artist, Katy Perry is ranked 4th. However, with , the 4th rank is skipped, so Katy Perry is labeled as the 5th rank. This breaks the continuity of the ranking.

For our question, it's important to have a continuous ranking without skipping any numbers. Therefore, we use to ensure all ranks are assigned correctly.

Step 3: Retrieve the top 5 ranked artists along with their ranks

Finally, we need to filter the results to include only the top 5 ranked artists.

Sourced from

Spotify

Difficulty

Medium

Input

(PostgreSQL)

Output