
SQL NULL Tutorial With Practice Exercises

In SQL, NULL indicates the absence of a value. Unlike an empty string or zero, NULL doesn't represent a specific value but rather a missing or unknown piece of information.

Here's an overview of what you'll learn in this tutorial:

  • IS NULL and IS NOT NULL: Used to identify null and non-null values.
  • COALESCE(): Returns the first non-null value from a list of arguments.
  • IFNULL(): Substitutes a null value with a specified value.

Why Is It Important to Handle NULL Values?

Handling NULL values is a common task in various real-world scenarios where data may be incomplete or missing. In the realm of databases, NULL values serve as placeholders for uncertain or absent data.

Here are a few examples of real-world scenarios:

  1. Employee Records: In a human resources database, some employees might not have entered their emergency contact information yet, resulting in NULL values in that column.
  2. Survey Responses: When collecting survey responses, participants might choose not to answer certain questions, leading to NULL values in those fields.
  3. Student Grades: In an educational database, some students might not have received grades for certain assignments, leading to NULL values in the grade column.
  4. Health Records: In a patient's medical history, some test results might not be available yet, resulting in NULL values in the test result columns.

In essence, NULL values in SQL are like flexible placeholders that cater to the uncertainties and variations in real-world data. They allow databases to accommodate missing or pending information without resorting to assumptions or default values.

Exploring NULL Values in SQL: Analyzing Bestselling Books on Goodreads

To demonstrate the usage of SQL functions for handling NULL values, we'll be working with an illustrative table that contains details about best-selling books across diverse genres.

book_title | author | book_genre | year_released | country | publication | book_rating | number_of_reviews
Ace the Data Science Interview | Kevin Huo, Nick Singh | Non-Fiction | 2021 | United States | Ace the Data Science Interview | 4.5 | 850
Becoming | Michelle Obama | Biography | 2018 | United States | Crown | 4.7 | 11600
Blink: The Power of Thinking Without Thinking | Malcolm Gladwell | Non-Fiction | 2005 | United States | Little Brown | 4.2 | 8400
Crazy Rich Asians | Kevin Kwan | Romance | 2013 | China | Anchor | 3.8 | 8900

Identifying NULL Values with IS NULL and IS NOT NULL

Imagine you're a data analyst at Goodreads, and you want to find records where the book title is NULL to ensure data completeness and integrity. A natural first attempt is to filter with WHERE book_title = NULL.

You might expect an empty result set simply because no record in the table has its book_title column explicitly set to NULL. The query does return nothing, but for a different reason.

While it might seem like the query is looking for records where book_title is blank, SQL doesn't work that way with blank values. The word NULL in SQL doesn't mean empty like it does in everyday language. Instead, it means "unknown" or "missing". So, when SQL compares something to NULL using =, it doesn't work like you might expect. In fact, it doesn't work at all: the result of the comparison is itself unknown rather than true, so no row passes the filter. It's like trying to compare apples and oranges: they're just not the same thing.
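To see this behavior concretely, here is a minimal sketch that runs the comparison through Python's built-in sqlite3 module. The table name books, the trimmed column list, and the row with a NULL title are assumptions made up for illustration; they are not part of the tutorial's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and rows; the NULL title is added only for illustration.
conn.execute("CREATE TABLE books (book_title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Becoming", "Michelle Obama"),
        ("Crazy Rich Asians", "Kevin Kwan"),
        (None, "Unknown Author"),  # Python None is stored as SQL NULL
    ],
)

# Comparing with = NULL never evaluates to true, so no row passes the filter,
# even though one row really does have a NULL title.
equals_null_rows = conn.execute(
    "SELECT book_title, author FROM books WHERE book_title = NULL"
).fetchall()
print(equals_null_rows)  # []
```

Even with a NULL title present in the table, the = NULL filter matches nothing.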

That's why SQL has special operators and functions for handling NULL values, which we'll go through below:

IS NULL

Now, to correctly identify NULL values, we can utilize the IS NULL condition:
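As a minimal sketch, the query can be run with Python's built-in sqlite3 module. The table name books and the row with a missing title are hypothetical, added only so the filter has something to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and rows, for illustration only.
conn.execute("CREATE TABLE books (book_title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Becoming", "Michelle Obama"),
        ("Crazy Rich Asians", "Kevin Kwan"),
        (None, "Unknown Author"),  # stored as SQL NULL
    ],
)

# IS NULL is the correct way to test for a missing value.
null_title_rows = conn.execute(
    "SELECT book_title, author FROM books WHERE book_title IS NULL"
).fetchall()
print(null_title_rows)  # [(None, 'Unknown Author')]
```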


This query retrieves all records where the column is indeed NULL.

IS NOT NULL

And, to retrieve rows with actual values (not NULL) in a column, utilize IS NOT NULL:
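A matching sketch for the opposite filter, again using Python's sqlite3 module with a hypothetical books table and a made-up row whose title is missing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and rows, for illustration only.
conn.execute("CREATE TABLE books (book_title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Becoming", "Michelle Obama"),
        ("Crazy Rich Asians", "Kevin Kwan"),
        (None, "Unknown Author"),  # stored as SQL NULL
    ],
)

# IS NOT NULL keeps only the rows that actually have a title.
titled_rows = conn.execute(
    "SELECT book_title, author FROM books "
    "WHERE book_title IS NOT NULL ORDER BY book_title"
).fetchall()
print(titled_rows)
# [('Becoming', 'Michelle Obama'), ('Crazy Rich Asians', 'Kevin Kwan')]
```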


This query delivers rows where the column isn't NULL.

💡 Fun Fact

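Did you know that in some databases, such as MySQL, SQL Server, and SQLite, NULL is treated as the smallest value when sorting? When we sort a column containing NULL values in ascending order, these rows rise to the top of the result. Other databases, such as PostgreSQL and Oracle, place NULLs last in ascending order by default. This behavior can be both surprising and useful, and it affects how query results are organized.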
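The sketch below shows the sorting behavior in SQLite, one of the engines where NULLs do come first in ascending order; other database engines may order them differently. The table name books and the shortened rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; a trimmed-down version of the books data.
conn.execute("CREATE TABLE books (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Becoming", 4.7),
        ("Blink", None),  # missing rating, stored as SQL NULL
        ("Crazy Rich Asians", 3.8),
    ],
)

# In SQLite, an ascending sort lists NULL ratings before any real rating.
sorted_rows = conn.execute(
    "SELECT book_title, book_rating FROM books ORDER BY book_rating ASC"
).fetchall()
print(sorted_rows)
# [('Blink', None), ('Crazy Rich Asians', 3.8), ('Becoming', 4.7)]
```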

Tesla Null SQL Interview Question

Let's solve a real Data Analyst SQL interview question from Tesla, where you're given a table of car parts and the date each one finished being assembled (NULL if it isn't finished yet).

part | finish_date | assembly_step
battery | 01/22/2022 00:00:00 | 1
battery | 02/22/2022 00:00:00 | 2
battery | 03/22/2022 00:00:00 | 3
bumper | 01/22/2022 00:00:00 | 1
bumper | 02/22/2022 00:00:00 | 2
bumper | NULL | 3
bumper | NULL | 4

You're asked to write a SQL query that checks for NULLs to determine which car parts have begun the assembly process but are not yet finished:
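One possible solution, sketched with Python's sqlite3 module so it can be run directly. The table name parts_assembly is a hypothetical choice, the dates are stored as plain text for simplicity, and this is only one reasonable reading of the question (list each unfinished part and step).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; rows copied from the sample table above.
conn.execute(
    "CREATE TABLE parts_assembly (part TEXT, finish_date TEXT, assembly_step INTEGER)"
)
conn.executemany(
    "INSERT INTO parts_assembly VALUES (?, ?, ?)",
    [
        ("battery", "01/22/2022 00:00:00", 1),
        ("battery", "02/22/2022 00:00:00", 2),
        ("battery", "03/22/2022 00:00:00", 3),
        ("bumper", "01/22/2022 00:00:00", 1),
        ("bumper", "02/22/2022 00:00:00", 2),
        ("bumper", None, 3),
        ("bumper", None, 4),
    ],
)

# A step with no finish_date has not been completed yet.
unfinished_steps = conn.execute(
    "SELECT part, assembly_step FROM parts_assembly "
    "WHERE finish_date IS NULL ORDER BY part, assembly_step"
).fetchall()
print(unfinished_steps)  # [('bumper', 3), ('bumper', 4)]
```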


Refining Data with SQL COALESCE()

The COALESCE() function takes multiple inputs and returns the first non-null value.

Syntax: COALESCE(column, expression)

If the column's value is NULL, it returns the specified expression. Otherwise, it returns the column's value.

Let's consider the scenario with the table:

book_title | author | book_genre | year_released | book_rating
Ace the Data Science Interview | Kevin Huo, Nick Singh | Non-Fiction | 2021 | 4.5
Becoming | Michelle Obama | Biography | 2018 | 4.7
Blink: The Power of Thinking Without Thinking | Malcolm Gladwell | Non-Fiction | 2005 | NULL
Crazy Rich Asians | Kevin Kwan | Romance | 2013 | 3.8
Data Engineering with Python | Paul Crickard and Andreas Müller | Non-Fiction | 2019 | NULL

Now, let's apply the COALESCE() function to the book_rating column:
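A minimal sketch of such a query, run through Python's sqlite3 module. The table name books is hypothetical, only the two relevant columns are created, and the fallback is written as 0.0 so the output matches the decimal ratings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; only the columns needed for this example.
conn.execute("CREATE TABLE books (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# COALESCE returns book_rating when it is present, otherwise the fallback 0.0.
coalesced_rows = conn.execute(
    "SELECT book_title, COALESCE(book_rating, 0.0) AS coalesced_rating "
    "FROM books ORDER BY book_title"
).fetchall()
for title, rating in coalesced_rows:
    print(title, rating)
```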


Here's how the results look:

book_title | coalesced_rating
Ace the Data Science Interview | 4.5
Becoming | 4.7
Blink: The Power of Thinking Without Thinking | 0.0
Crazy Rich Asians | 3.8
Data Engineering with Python | 0.0

In this example, the COALESCE() function checks each book_rating value. If a value isn't NULL, it's returned unchanged. However, if book_rating is NULL, the function replaces it with 0.

Handling Gaps with SQL IFNULL()

Suppose some of the book ratings are NULL. We can use the IFNULL() function to fill in the gaps with default values.

Syntax: IFNULL(column, expression)


Consider the scenario using the table:
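Here is a minimal sketch of the query, run through Python's sqlite3 module. IFNULL() is available in SQLite (used here) and MySQL, but not in every database, so treat this as engine-specific. The table name books is hypothetical, and the fallback is written as 0.0 to match the decimal ratings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; only the columns needed for this example.
conn.execute("CREATE TABLE books (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# IFNULL takes exactly two arguments: the value to check and its replacement.
rated_rows = conn.execute(
    "SELECT book_title, IFNULL(book_rating, 0.0) AS rated_books "
    "FROM books ORDER BY book_title"
).fetchall()
for title, rating in rated_rows:
    print(title, rating)
```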


In this instance, the IFNULL() function evaluates each book_rating value, and if a value is present, it remains unchanged. However, if book_rating is NULL, the function substitutes it with 0.

Here are the expected results:

book_title | rated_books
Ace the Data Science Interview | 4.5
Becoming | 4.7
Blink: The Power of Thinking Without Thinking | 0.0
Crazy Rich Asians | 3.8
Data Engineering with Python | 0.0

Difference between COALESCE() and IFNULL()

While both the COALESCE() and IFNULL() functions serve a similar purpose of handling NULL values, there is a key difference between them.

COALESCE() function: Versatile for multiple arguments, it returns the first non-null value among them.


IFNULL() function: Handles exactly two arguments, returning the second if the first is null; otherwise, it returns the first.


Example

Suppose you have these values:

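  • The first value is NULL
  • The second value is 5
  • The third value is 10

With COALESCE() applied to all three values in that order, the result would be 5 (the first non-null value). However, if the second value were also NULL, the result would be 10 (the next non-null value).

With IFNULL() applied to the first two values, the result would be 5 (since the first argument is NULL, it returns the second argument, which is 5).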
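The comparison can be checked with literal values. This is a small sketch using Python's sqlite3 module; SQLite supports both functions, while other engines may not offer IFNULL().

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# COALESCE scans its arguments left to right and returns the first non-null one.
first_non_null = conn.execute("SELECT COALESCE(NULL, 5, 10)").fetchone()[0]
# With the second argument also NULL, the third one is returned.
skips_two_nulls = conn.execute("SELECT COALESCE(NULL, NULL, 10)").fetchone()[0]
# IFNULL accepts exactly two arguments.
two_argument = conn.execute("SELECT IFNULL(NULL, 5)").fetchone()[0]

print(first_non_null, skips_two_nulls, two_argument)  # 5 10 5
```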

So, while they both can serve similar purposes,

  • COALESCE() offers more flexibility when dealing with multiple values,
  • whereas IFNULL() is a more concise option when you have exactly two values to work with.

What's Next: CASE

What if we wanted to re-code all NULLs to some special value?

For example, at an e-commerce company, if the number of sales is NULL, you might treat it as 0. Or in a user survey where people rate statements from "strongly disagree" to "strongly agree" (a Likert scale), you could treat all NULLs as "Neither Agree Nor Disagree".

In these cases, to treat NULL values conditionally as something else, we can use the CASE statement, covered in the next lesson!
