SQL NULL Tutorial With Practice Exercises

In SQL, NULL indicates the absence of a value. Unlike an empty string or zero, NULL doesn't represent a specific value, but rather a missing or unknown piece of information.

Why Is It Important to Handle NULL Values?

Handling NULL values is a common task in real-world scenarios where data may be incomplete or missing. In databases, NULL values serve as placeholders for uncertain or absent data.

Here are a few examples of real-world scenarios:

  1. Employee Records: In a human resources database, some employees might not have entered their emergency contact information yet, resulting in NULL values in that column.
  2. Survey Responses: When collecting survey responses, participants might choose not to answer certain questions, leading to NULL values in those fields.
  3. Student Grades: In an educational database, some students might not have received grades for certain assignments, leading to NULL values in the grade column.
  4. Health Records: In a patient's medical history, some test results might not be available yet, resulting in NULL values in the test result columns.

In essence, NULL values in SQL are flexible placeholders that cater to the uncertainties and variations in real-world data. They allow databases to accommodate missing or pending information without resorting to assumptions or default values.

Exploring NULL Values in SQL: Analyzing Bestselling Books on Goodreads

To demonstrate the usage of SQL functions for handling NULL values, we'll be working with an illustrative table that contains details about the best-selling books across diverse genres.

| book_title | author | book_genre | year_released | country | publication | book_rating | number_of_reviews |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ace the Data Science Interview | Kevin Huo, Nick Singh | Non-Fiction | 2021 | United States | Ace the Data Science Interview | 4.5 | 850 |
| Becoming | Michelle Obama | Biography | 2018 | United States | Crown | 4.7 | 11600 |
| Blink: The Power of Thinking Without Thinking | Malcolm Gladwell | Non-Fiction | 2005 | United States | Little Brown | 4.2 | 8400 |
| Crazy Rich Asians | Kevin Kwan | Romance | 2013 | China | Anchor | 3.8 | 8900 |

Identifying NULL Values with IS NULL and IS NOT NULL

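Consider a SQL query that filters a column with = NULL, for example WHERE book_rating = NULL.

At first glance, it appears to retrieve rows with NULL values in the column. However, there's a subtle but crucial misunderstanding here.

NULL defies straightforward comparisons. Even a common equality check like = NULL fails because NULL doesn't equate to any value, not even itself. The comparison evaluates to unknown rather than true, so the WHERE clause filters out every row. The same holds for the <, >, <=, >=, and <> comparisons.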
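To see this behavior in action, here is a minimal sketch that runs the comparison through Python's built-in sqlite3 module. The table name goodreads and the trimmed two-column layout are assumptions made for illustration, and other database engines may display the unknown result differently.

```python
import sqlite3

# In-memory database; the table name `goodreads` and the reduced
# column set are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),  # None is stored as NULL
    ],
)

# `= NULL` evaluates to NULL (unknown) for every row, so WHERE keeps nothing.
equals_null_rows = conn.execute(
    "SELECT book_title FROM goodreads WHERE book_rating = NULL"
).fetchall()

# Even NULL = NULL is not true; sqlite3 returns the unknown result as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(equals_null_rows)  # []
print(null_equals_null)  # None
conn.close()
```

The empty result is the giveaway: the row with the missing rating exists, but the equality check never evaluates to true for it.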

IS NULL

To correctly identify NULL values, use the IS NULL condition, for example WHERE book_rating IS NULL.

Such a query retrieves all records where the column is indeed NULL.
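As a hedged illustration of IS NULL, the sketch below loads a few sample rows (taken from the ratings table used later in this tutorial) into an in-memory SQLite database. The table name goodreads is an assumption, and only two columns are kept for brevity.

```python
import sqlite3

# Assumed table name `goodreads`; only two columns are kept for brevity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),  # None -> NULL
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# IS NULL is a dedicated predicate, so it is true exactly for the missing ratings.
unrated = conn.execute(
    "SELECT book_title FROM goodreads "
    "WHERE book_rating IS NULL "
    "ORDER BY book_title"
).fetchall()

print(unrated)
conn.close()
```

Only the two books without a rating come back, which is what the = NULL comparison could not deliver.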

IS NOT NULL

To retrieve rows with actual values (not NULL) in a column, use IS NOT NULL, for example WHERE book_rating IS NOT NULL.

Such a query delivers only the rows where the column isn't NULL.
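The opposite filter can be sketched the same way. As before, the table name goodreads and the two-column layout are assumptions, and Python's sqlite3 module is used only so the query can be run as-is.

```python
import sqlite3

# Assumed table name `goodreads`; sample rows mirror the tutorial's ratings table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),  # None -> NULL
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# IS NOT NULL keeps only the rows that actually carry a rating.
rated = conn.execute(
    "SELECT book_title, book_rating FROM goodreads "
    "WHERE book_rating IS NOT NULL "
    "ORDER BY book_title"
).fetchall()

print(rated)
conn.close()
```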

💡 Fun Fact

Did you know that NULL's place in the sort order depends on the database? MySQL, SQLite, and SQL Server treat NULL as the smallest value, so rows with NULL rise to the top of an ascending sort, while PostgreSQL and Oracle treat it as the largest and place those rows last by default. This behavior can be both surprising and useful, impacting how query results are organized.
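Here is a small sketch of the sorting behavior in SQLite, which is one of the engines that treats NULL as the smallest value. The table name goodreads is an assumption, and the ordering shown applies to SQLite specifically, so results on another engine may differ.

```python
import sqlite3

# Assumed table name `goodreads`; the NULL ordering shown is SQLite's.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# Ascending sort: in SQLite the NULL ratings come first, then 3.8, 4.5, 4.7.
# The title is a tie-breaker so the two NULL rows have a stable order.
sorted_titles = [
    row[0]
    for row in conn.execute(
        "SELECT book_title FROM goodreads "
        "ORDER BY book_rating ASC, book_title ASC"
    )
]

print(sorted_titles)
conn.close()
```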

Tesla Null SQL Interview Question

Let's solve a real Data Analyst SQL Interview Question from Tesla, where you're given a table of car parts and the date each part finished being assembled (if it was finished, else it's NULL).

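| part | finish_date | assembly_step |
| --- | --- | --- |
| battery | 01/22/2022 00:00:00 | 1 |
| battery | 02/22/2022 00:00:00 | 2 |
| battery | 03/22/2022 00:00:00 | 3 |
| bumper | 01/22/2022 00:00:00 | 1 |
| bumper | 02/22/2022 00:00:00 | 2 |
| bumper | NULL | 3 |
| bumper | NULL | 4 |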

You're asked to write a SQL query that checks for NULLs to determine which car parts have begun the assembly process but are not yet finished.
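One possible approach, sketched under assumptions: the table is called parts_assembly here (a hypothetical name, since the original name is not shown), the dates are stored as plain text, and the answer is taken to be every part and assembly step whose finish_date is still NULL. Try the exercise yourself before reading the query.

```python
import sqlite3

# Hypothetical table name `parts_assembly`; rows copied from the sample data above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts_assembly (part TEXT, finish_date TEXT, assembly_step INTEGER)"
)
conn.executemany(
    "INSERT INTO parts_assembly VALUES (?, ?, ?)",
    [
        ("battery", "01/22/2022 00:00:00", 1),
        ("battery", "02/22/2022 00:00:00", 2),
        ("battery", "03/22/2022 00:00:00", 3),
        ("bumper", "01/22/2022 00:00:00", 1),
        ("bumper", "02/22/2022 00:00:00", 2),
        ("bumper", None, 3),  # not finished yet
        ("bumper", None, 4),  # not finished yet
    ],
)

# A step with no finish date has been started but not completed.
unfinished = conn.execute(
    "SELECT part, assembly_step FROM parts_assembly "
    "WHERE finish_date IS NULL "
    "ORDER BY part, assembly_step"
).fetchall()

print(unfinished)  # [('bumper', 3), ('bumper', 4)]
conn.close()
```

If only the part names are wanted, SELECT DISTINCT part with the same WHERE clause would be a reasonable variation.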


Refining Data with COALESCE()

Imagine you're dealing with data where missing values (represented as NULL) can hinder your analysis. This is where the handy COALESCE() function helps: it takes multiple inputs and returns the first non-null value.

Let's consider the scenario with the following table:

| book_title | author | book_genre | year_released | book_rating |
| --- | --- | --- | --- | --- |
| Ace the Data Science Interview | Kevin Huo, Nick Singh | Non-Fiction | 2021 | 4.5 |
| Becoming | Michelle Obama | Biography | 2018 | 4.7 |
| Blink: The Power of Thinking Without Thinking | Malcolm Gladwell | Non-Fiction | 2005 | NULL |
| Crazy Rich Asians | Kevin Kwan | Romance | 2013 | 3.8 |
| Data Engineering with Python | Paul Crickard and Andreas Müller | Non-Fiction | 2019 | NULL |

Now, let's apply the COALESCE() function to the book_rating column, as in COALESCE(book_rating, 0) AS coalesced_rating.

Here's what the results look like:

| book_title | coalesced_rating |
| --- | --- |
| Ace the Data Science Interview | 4.5 |
| Becoming | 4.7 |
| Blink: The Power of Thinking Without Thinking | 0.0 |
| Crazy Rich Asians | 3.8 |
| Data Engineering with Python | 0.0 |

In this example, the COALESCE() function checks each book_rating value. If a value isn't null, it's returned unchanged. However, if book_rating happens to be null, the function swaps it with 0.
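The COALESCE() substitution described above can be reproduced with the following sketch. The table name goodreads is an assumption, only the two relevant columns are loaded, and the fallback is written as 0.0 so the replaced values come back as decimals like the rest of the column.

```python
import sqlite3

# Assumed table name `goodreads`; rows mirror the ratings table above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Ace the Data Science Interview", 4.5),
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),
        ("Crazy Rich Asians", 3.8),
        ("Data Engineering with Python", None),
    ],
)

# COALESCE returns its first non-NULL argument: the rating when present,
# otherwise the 0.0 fallback.
coalesced = conn.execute(
    "SELECT book_title, COALESCE(book_rating, 0.0) AS coalesced_rating "
    "FROM goodreads "
    "ORDER BY book_title"
).fetchall()

for title, rating in coalesced:
    print(title, rating)
conn.close()
```

Keep in mind that replacing a missing rating with 0 changes aggregates such as averages, so the fallback value is a design choice that depends on the analysis.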

Handling Data Gaps with SQL IFNULL()

Imagine you're navigating through data containing gaps due to missing values, represented as NULL. Here's where the IFNULL() function comes into play. It is available in MySQL and SQLite, among others, but it is not part of standard SQL, so check that your database supports it.

Consider the same scenario, this time applying IFNULL(book_rating, 0) AS rated_books to the table.

In this instance, the IFNULL() function inspects each book_rating value, and if a value is present, it remains unchanged. However, when book_rating is NULL, the function substitutes it with 0.

Now, let's see how this translates into actual results:

| book_title | rated_books |
| --- | --- |
| Ace the Data Science Interview | 4.5 |
| Becoming | 4.7 |
| Blink: The Power of Thinking Without Thinking | 0.0 |
| Crazy Rich Asians | 3.8 |
| Data Engineering with Python | 0.0 |

In these results, you can see how the IFNULL() function replaces the NULL values in the "book_rating" column with the value 0, creating a more uniform and coherent data representation.
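For completeness, here is the same substitution sketched with IFNULL(). SQLite supports the function, which is why the example runs through Python's sqlite3 module; the table name goodreads is again an assumption.

```python
import sqlite3

# Assumed table name `goodreads`; a shortened sample is enough to show the effect.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goodreads (book_title TEXT, book_rating REAL)")
conn.executemany(
    "INSERT INTO goodreads VALUES (?, ?)",
    [
        ("Becoming", 4.7),
        ("Blink: The Power of Thinking Without Thinking", None),
        ("Data Engineering with Python", None),
    ],
)

# IFNULL takes exactly two arguments: the value to test and its replacement.
rated_books = conn.execute(
    "SELECT book_title, IFNULL(book_rating, 0.0) AS rated_books "
    "FROM goodreads "
    "ORDER BY book_title"
).fetchall()

print(rated_books)
conn.close()
```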

Difference between COALESCE() and IFNULL()

While both the COALESCE() and IFNULL() functions serve a similar purpose of handling NULL values, there is a key difference between them.

COALESCE() Function: Versatile for multiple arguments, it returns the first non-null value among them.

IFNULL() Function: Handles two arguments, returning the second if the first is null; else, it returns the first.
Example

Suppose you have these values:

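  • the first value is NULL
  • the second value is 5
  • the third value is 10

With COALESCE() applied to all three values in that order, the result would be 5 (the first non-null value). However, if the second value were also NULL, then the result would be 10 (the first non-null value).

With IFNULL() applied to the first and second values, the result would be 5 (since the first value is NULL, it returns the second argument, which is 5).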
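The example above can be checked directly with literal values. This sketch uses SQLite through Python's sqlite3 module, and the variable names are only there for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Evaluate the three cases from the example in one query, using literals.
coalesce_three, coalesce_second_null, ifnull_two = conn.execute(
    "SELECT COALESCE(NULL, 5, 10), "     # first non-NULL of three values
    "       COALESCE(NULL, NULL, 10), "  # second value is NULL as well
    "       IFNULL(NULL, 5)"             # two-argument form
).fetchone()

print(coalesce_three)        # 5
print(coalesce_second_null)  # 10
print(ifnull_two)            # 5
conn.close()
```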

So, while they both can serve similar purposes,

  • COALESCE() offers more flexibility when dealing with multiple values,
  • whereas IFNULL() is a more concise option when you have two values to work with.

What's Next: CASE

What if we wanted to re-code all NULLs to some special value?

For example, for an e-commerce company, if the number of sales was NULL, you could treat it as 0. Or in a user survey, where you ask people to rate statements from "strongly disagree" to "strongly agree" (Likert scale), you could treat all NULLs as "Neither Agree Nor Disagree".

In these cases, to treat NULL values conditionally as something else, we can use the CASE statement, covered in the next lesson!
