String-text functions are incredibly useful for cleaning your data. These functions will help you manipulate and transform text-based data effortlessly. In this SQL lesson, we'll cover how to use:
The function capitalizes every letter in a string, while the function does the opposite, converting them to lowercase.
Here's an example query:
Running this would yield the following result:
superhero_upper | superhero_lower |
---|---|
AVENGER | avenger |
In the real world, we use these functions because text data is incredibly messy. For example, names are often stored with inconsistent capitalization. By applying or , you can standardize all names to uppercase or lowercase, making it easier to sort the text, or search for strings matching a pattern using .
Write a SQL query which uses and to find a customer's details whose name ends with "son".
The function extracts you the initial characters from a string, while the function retrieves the last characters.
Syntax
Example
left_substring | right_substring |
---|---|
Captain | Widow |
Real-world Scenario: We often extract the first few or last few characters of a product ID to determine its category or manufacturer. For example, when dealing with a flight number like "UA93", using helps us extract "UA", which tells us the carrier is United Airlines.
The function simply returns the length of a string, including the empty space in between.
Syntax
Example
name_length1 | name_length2 |
---|---|
4 | 11 |
Real-world Scenario: We use in a product review system to sort customer reviews based on length. Our intuition here would be that longer reviews are likely more helpful and detailed, whereas short reviews with just 5-6 characters indicate a low-quality review.
The function is used to determine the starting position of a substring within a larger string. It returns the position of the first occurrence of a substring within the specified string.
Syntax
Example
In this example, is used to find the starting position of the substring 'man' within the string 'Ironman'. The position of 'man' is at index 5.
position_result |
---|
5 |
The function removes spaces from both ends of a string. Additionally, we have three more functions to handle trimming from specific sides:
Here's an sample of and it's variants:
full_trim | left_trim | right_trim | combination_trim1 | combination_trim2 |
---|---|---|---|---|
Spiderman | Man | Scarlet | Falcon | Iron Man |
Real-world Scenario: Users entering their names or emails might accidentally include extra spaces. Applying these functions ensures that user input is clean and properly formatted.
The function combines two or more strings into one.
Syntax
Example
character | superhero_alias | |
---|---|---|
Peter Quill | peterquill@avengers.com | Star-Lord |
Real-world Scenario: When registering users, you can concatenate their first name and last name to generate their email addresses, like "nick.singh@datalemur.com".
Let's practice with a real SQL Interview question asked to a Data Analyst at CVS Pharmacy. You're asked to write a query to calculate the total sales for each manufacturer. Round the answer to the nearest million and report your results in descending order of total sales.
Your output should look something like this:
manufacturer | sale |
---|---|
Biogen | $4 million |
Eli Lilly | $3 million |
function concatenates multiple strings into a single string with a specified separator between each string. "CONCAT_WS" stands for "Concatenate With Separator".
Syntax
Example
In following example, is used to concatenate the values of the , , and columns with a hyphen and space () as the separator.
Here's the sample data:
actor | character | superhero_alias |
---|---|---|
Robert Downey Jr. | Tony Stark | Iron Man |
Chris Evans | Steve Rogers | Captain America |
Scarlett Johansson | Natasha Romanoff | Black Widow |
Chris Hemsworth | Thor | Thor |
Mark Ruffalo | Bruce Banner | Hulk |
And, the output:
actor_details |
---|
Robert Downey Jr. - Tony Stark - Iron Man |
Chris Evans - Steve Rogers - Captain America |
Scarlett Johansson - Natasha Romanoff - Black Widow |
Chris Hemsworth - Thor - Thor |
Mark Ruffalo - Bruce Banner - Hulk |
π‘ The main difference between and lies in how they handle separators:
joins strings together without any separators, whereas allows you to specify a separator to be inserted between each string as they are concatenated.
The function returns a portion of a string based on the specified starting position and length. With a negative index, counts characters from the end of the string, offering a different perspective compared to counting from the beginning.
Syntax
Example
substring_1 | substring_2 | substring_3 | substring_4 |
---|---|---|---|
Spider | Widow | n | dow |
In , using a negative start position of -1, the function starts counting from the end of the string and returns the last character, which is 'n'.
In , with a negative start position of -2 and a length of 3, the function starts counting from the second-to-last character ('d') and returns the subsequent three characters, resulting in 'dow'.
Using negative indices provides an easy way to count characters from the end of the string. Just remember that -1 represents the last character, -2 represents the second-to-last character, and so on.
Real-world Scenario: Analyzing server log files, you can use to extract timestamps or error codes, making it easier to identify and troubleshoot issues.
The function enables you to extract specific segments of a string based on a delimiter. This function is particularly useful when dealing with strings containing structured data separated by a common pattern.
Syntax
Example
split_part_1 | split_part_2 | split_part_3 | split_part_4 |
---|---|---|---|
Spider | Widow | Man | Widow |
In , by using a negative index , the function starts counting from the end of the string and returns the last part after splitting, which is 'Man'.
In , with a negative index , the function extracts the last part after splitting based on the space delimiter, returning 'Widow'.
Real-world Scenario: When dealing with file paths, you can utilize to isolate directory names or file extensions, aiding in file management and organization.
We've now covered pretty much every major SQL concept needed to Ace the SQL Interview. In the process, you've solved dozens of smaller SQL interview questions.
Now, it's time to everything we've learned to an open-ended Instacart SQL Analytics case study!
Next Lesson
INSTACART SQL CASE π₯