Back to questions
Amazon Web Services (AWS) is powered by fleets of servers. Senior management has requested data-driven solutions to optimize server usage.
Write a query that calculates the total time that the fleet of servers was running. The output should be in units of full days.
In the example output, the combined uptime of all the servers (from each start time to stop time) totals 21 full days.
The dataset you are querying against may have different input & output - this is just an example!
This is a tough task. Let's zoom out and work backwards to structure our approach.
The solution below is broken down into smaller steps.
Step 1: Start and Stop times Starting from the end, the first thing we should do is obtain the start and stop times in different columns.
We'll alias as , and then use the window function to find the stop times, which correspond to the timestamps for each successive status after the current one. We'll make sure to partition this for each individual server and order our results by the .
The full query should look like this:
Click here to learn more about the function.
Let's take a step back to make sure we understand the output of this query. The output rows should look like this:
|1||start||08/02/2022 10:00:00||08/04/2022 10:00:00|
|1||start||08/13/2022 10:00:00||08/13/2022 19:00:00|
We have the server IDs along with their stop and start times! That means we're ready to move on.
Step 2: Individual Uptimes Now, let's wrap the above query in a Common Table Expression (CTE) so we can reuse the results later. Click here to learn more about CTEs.
We are close to finishing this – all we have to do is assemble our main statement. First, we need to simply calculate the individual uptimes by subtracting from . Then, we'll the individual server times, and convert the result to days.
Step 3: Converting to full days By summing up the uptimes as an interval type, we could obtain multiple date parts in our results. For example, the output could be: "days":2, "hours":30. In this case, if we extracted just the days, we would obtain 2 as a result (instead of 3).
To combat this issue, we'll use the function to convert any value above 24 hours into units of full days. You can learn more about here.
Now, we can extract the number of full days by using function. We should arrive at the following code:
Finally, we want to make sure that we only look at the uptimes for rows that start server sessions, and that we exclude rows where the stop time is a value.
The query looks like this:
Congratulations, you did it!