In our earlier tutorial, you learned about window functions that help with ranking values. Now, we'll introduce you to another window function designed for analyzing time-series data.
and are time-series window functions used to access data from rows that come after, or before the current row within a result set based on a specific column order.
Think of as a function that lets you peek into the future ๐ฎ, and as a way to glance into the past โช. They're like time-travel functions for your data! We frequently use this to create a 7-day moving average metric, or a 28-day rolling count metric.
Here's the general syntax:
Let's use the following question as an example to illustrate the application of the and functions in analyzing stock performance data.
To begin, execute this query to retrieve the foundational table that we'll be working with
Here's the data table:
date | ticker | open | high | low | close |
---|---|---|---|---|---|
01/01/2023 00:00:00 | GOOG | 89.83 | 101.58 | 85.57 | 99.87 |
02/01/2023 00:00:00 | GOOG | 99.74 | 108.82 | 88.86 | 90.30 |
03/01/2023 00:00:00 | GOOG | 90.16 | 107.51 | 89.77 | 104.00 |
04/01/2023 00:00:00 | GOOG | 102.67 | 109.63 | 102.38 | 108.22 |
05/01/2023 00:00:00 | GOOG | 107.72 | 127.05 | 104.50 | 123.37 |
Suppose you want to calculate the difference in closing prices between consecutive months of the year 2023 for the stock with ticker 'GOOG'. Here's the query to achieve that:
When we use the function, it's like taking a peek at the closing price of the next month () and putting it alongside the current month's data ():
And, when we use the function, it's like bringing back the closing price from the previous month () and placing it next to the current month's data ():
Now, since we have both values in the same row as the closing price, we can easily figure out the difference between closing prices of consecutive months and their dates:
Mathematically, this can be expressed as: .
This tells us how much the closing price of the next month differs from the current month's closing price. For the last row, where there is no subsequent month, the column contains .
Let's say we want to calculate the difference between the current month's closing price and the closing price from 3 months ago. Using the same question, we can do so using the feature within the function itself:
The value lets you decide how many rows you want to inspect, whether it's in the future or the past. By default, the is set to 1, meaning it considers the very next or previous row.
For example, if you set the to 1, it means you're looking 1 row ahead (future) or 1 row behind (past). If you set it to 2, you're looking 2 rows ahead or behind, and so on.
In the output below, the column represents the closing price from 3 months ago calculated using the function. The column calculates the difference between the current month's closing price () and the closing price from three months ago ().
Now that you know how and functions work, here's some real-world scenarios where these functions might come in handy:
Scenario: Forecast inventory requirements by analyzing upcoming sales trends in order to adjust stock levels accordingly.
Example Query:
Results:
sales_date | product_id | sales_quantity | next_day_sales |
---|---|---|---|
2023-08-01 | A001 | 100 | 75 |
2023-08-02 | A001 | 75 | 50 |
2023-08-03 | A001 | 50 | 60 |
2023-08-04 | A001 | 60 | 80 |
2023-08-05 | A001 | 80 | 70 |
Test your understanding with a question below!
Scenario: Identify changes in user engagement metrics over time by analyzing page views.
Example Query:
Results:
visit_date | page_views | prev_day_views | next_day_views |
---|---|---|---|
2023-08-01 | 1200 | NULL | 1100 |
2023-08-02 | 1100 | 1200 | 1350 |
2023-08-03 | 1350 | 1100 | 980 |
2023-08-04 | 980 | 1350 | 1500 |
2023-08-05 | 1500 | 980 | 1050 |
Scenario: Optimize route planning based on historical travel times, assisting in identifying traffic patterns.
Example Query:
Results:
route_id | departure_time | arrival_time | prev_arrival_time |
---|---|---|---|
R001 | 08:00 AM | 10:30 AM | NULL |
R001 | 10:30 AM | 12:45 PM | 10:30 AM |
R001 | 12:45 PM | 03:00 PM | 12:45 PM |
R001 | 03:00 PM | 05:15 PM | 03:00 PM |
R001 | 05:15 PM | 07:30 PM | 05:15 PM |
Scenario: Monitor student performance and predict future academic outcomes, identifying areas of improvement or decline.
Example Query:
Results:
student_id | semester | final_grade | prev_semester_grade |
---|---|---|---|
S001 | Fall 2022 | B | B |
S001 | Spring 2023 | A | B |
S001 | Fall 2023 | A | A |
S001 | Spring 2024 | B | A |
S001 | Fall 2024 | B | B |
Practice and functions in this SQL interview question asked by furniture e-commerce company Wayfair!
We'll revisit the concept of a SQL JOIN, but this time join a table to itself using a self-join!
Next Lesson
SQL SELF-JOINS ๐ค