The statement in SQL allows you to shape, transform, and manipulate data based on specified conditions. It's a powerful tool that lets you customize query results, create new categories, and apply conditional logic.
The statement goes through the specified conditions one by one and returns a value when it encounters the first true condition. If no condition is met, the clause is used to provide a default value.
In this tutorial, we'll explore different ways of using the statement with the dataset. This dataset contains information about Marvel Avengers actors and their associated social media presence.
Example:
actor | character | superhero_alias | platform | followers | posts | engagement_rate | avg_likes | avg_comments |
---|---|---|---|---|---|---|---|---|
Robert Downey Jr. | Tony Stark | Iron Man | 500000 | 200 | 8.20 | 12000 | 800 | |
Chris Evans | Steve Rogers | Captain America | 300000 | 150 | 6.50 | 8000 | 500 | |
Scarlett Johansson | Natasha Romanoff | Black Widow | 700000 | 300 | 7.80 | 15000 | 1000 | |
Chris Hemsworth | Thor | Thor | YouTube | 400000 | 100 | 9.10 | 20000 | 1200 |
Mark Ruffalo | Bruce Banner | Hulk | 200000 | 80 | 5.30 | 6000 | 400 |
Here's an example CASE SQL query that determines the popularity category of each Marvel character based on their number of followers ("Popular" is if they have 500,000 or more followers).
Here's the output:
character | superhero_alias | platform | popularity_category |
---|---|---|---|
Tony Stark | Iron Man | Popular | |
Steve Rogers | Captain America | NULL | |
Natasha Romanoff | Black Widow | Popular | |
Thor | Thor | YouTube | NULL |
Bruce Banner | Hulk | NULL |
We can create groups using multiple conditions in the statement. For example, we can categorize the Marvel actors based on their social media engagement rates as follows:
Result
actor | character | engagement_category |
---|---|---|
Robert Downey Jr. | Tony Stark | High Engagement |
Scarlett Johansson | Natasha Romanoff | Moderate Engagement |
Jeremy Renner | Clint Barton | Moderate Engagement |
Tom Hiddleston | Loki | High Engagement |
Zoe Saldana | Gamora | Low Engagement |
When none of the previous conditions match, we can use the clause to provide a default value.
For the Marvel Avengers characters, let's classify their into three categories based on their number of followers:
Result
character | superhero_alias | platform | popularity_category |
---|---|---|---|
Tony Stark | Iron Man | Moderately Popular | |
Steve Rogers | Captain America | Moderately Popular | |
Natasha Romanoff | Black Widow | Highly Popular | |
Thor | Thor | YouTube | Moderately Popular |
Bruce Banner | Hulk | Less Popular |
Now, it's your turn to practice writing a SQL query in an exercise which uses and statements.
Analyze the marvel actors dataset, and categorize them based on the average number of likes:
Let's use the CASE statement, along with , to solve a real SQL Interview Question from NYTimes digital team. In this SQL assesment, your given a table on user viewership categorized by device type where the three types are laptop, tablet, and phone.
Example Input:
user_id | device_type | view_time |
---|---|---|
123 | tablet | 01/02/2022 00:00:00 |
125 | laptop | 01/07/2022 00:00:00 |
128 | laptop | 02/09/2022 00:00:00 |
129 | phone | 02/09/2022 00:00:00 |
145 | tablet | 02/24/2022 00:00:00 |
Write a query that calculates the total viewership for laptops and mobile devices where mobile is defined as the sum of tablet and phone viewership. Output the total viewership for laptops as and the total viewership for mobile devices as .
Example Output:
laptop_views | mobile_views |
---|---|
2 | 3 |
Up until now, we've focused on analyzing a single table of data at a time. Whether it be the , , or datasets, we've only been applying all these fancy arithmetic operators, hunting down nulls, and applying to data that come from a single table.
It's time to change it up, and start analyzing data from TWO tables using the keyword!
Next Lesson
SQL JOINS π€