How To Combine Window Functions in MySQL

Window Functions operate on groups of rows, returning a value for every row in the group, unlike Aggregate Functions which collapse rows into a single value. With the many different Window Functions available, can you combine them at the same level? Learn more in this article.

The Newsletter for PHP and MySQL Developers

Receive a copy of my ebook, “10 MySQL Tips For Everyone”, absolutely free when you subscribe to the OpenLampTech newsletter.

Image by Clker-Free-Vector-Images from Pixabay 

Backstory and Reasoning

I recently worked on some queries using Oracle Spatial functions to build geospatial line strings. As part of one of the function parameters, I needed to total up a running distance measurement from a starting location (a coordinate) on the line string, at each coordinate interval.

This sounds like a cut-n-dry use case for a rolling sum total using SUM() and OVER(). And, you’re correct, it is.

Almost.

Sample data

For simplicity, I’ll forego using any complicated Northings, Eastings, or Linear References in the examples and instead settle on this simple data set for the blog post:

SELECT * FROM coords_measure;

Since I primarily blog about – and use – MySQL or MariaDB in most of the LAMP stack applications I develop, I’ll use MySQL for these examples as well (although the requirements I needed were originally in Oracle SQL).

You may notice that the last row of the result set has NULL values for the start_segment, end_segment, and segment_length columns.

This is because that location is the last point (coordinate) on the line string and has no real measure length; basically the ‘dead end’.

Rolling SUM() with MySQL Window Functions

Calculating a rolling sum is not that difficult since we have Window Functions as of MySQL v8.

As shown in this query, simply using the SUM() aggregate function in conjunction with the OVER() clause and a specific ordering gives us this information:

SELECT 
id, start_segment, end_segment,
SUM(segment_length) OVER(ORDER BY id ASC) AS rolling_sum
FROM coords_measure;

Although we have a rolling sum calculation, it is incorrect in this context. You can see that the last 2 rows have 195.83 as the rolling sum value.

Yet, what we really need for this particular requirement is for the last row only to have 195.83 for the rolling sum calculation and the first row to have 0.


πŸ“° Get your brand, product, or service the attention it needs with affordable classified ad placement in the OpenLampTech newsletter. Thank you for your support!


Almost as if we need to shift all rolling sum values down one row.

But why?

For this geospatial requirement, at each coordinate point, the distance should build from the start or beginning coordinate (row 1).

Therefore, row 1 should actually have a measure distance of 0 since it is the start of the line string (and has no distance value – it is the beginning) while row 2 would have the value of the distance measured between these 2 coordinates since that is the distance back to the start (coordinate 1).

MySQL LAG() Window Functions

The LAG() Window Function allows you to access data located in previous rows of a result set, from the current operating row. You can access 1 (the default) or more rows depending on the 2nd parameter supplied in the LAG() function call.

I’ll use LAG() in the query and return each previous rows’ segment_length value:

SELECT 
id, start_segment, end_segment,
LAG(segment_length, 1) OVER(ORDER BY id ASC) AS one_lag
FROM coords_measure;

As shown in the query results, in the case there are no previous rows’ values, LAG() returns NULL.


βœ”οΈ In my experience, you need a meaningful metric you can use in the ORDER BY clause of LAG() to make it work for you as you expect if you depend on a specific ordering.


Now we simply perform the rolling sum right?

Can we combine 2 Window Functions at the same level?

No.

Executing this query returns a specific error indicating that LAG() is not allowed here:

SELECT 
id, start_segment, end_segment,
SUM(LAG(segment_length, 1) OVER(ORDER BY id ASC)) OVER(ORDER BY id ASC) AS one_lag
FROM coords_measure;
Error Code: 3593. You cannot use the window function 'lag' in this context.

MySQL Derived Tables

A SELECT query that generates a table in the FROM clause is known as a Derived Table in MySQL.

In order to calculate the rolling sum of the LAG() function output, we should place the generating query in the FROM clause using a Derived Table:

SELECT t.id, t.start_segment, t.end_segment,
SUM(t.one_lag) OVER(ORDER BY t.id ASC) AS rolling_sum
FROM
(SELECT
id, start_segment, end_segment,
LAG(segment_length, 1) OVER(ORDER BY id ASC) AS one_lag
FROM coords_measure) AS t;

πŸ—’οΈ Tip: MySQL Derived Tables should be aliased. However, the AS keyword is optional.


As a good practice, we should account for NULL in the LAG() window function call so that our math in the SUM() OVER() is correct or safer when performing the calculation. One simple way in MySQL is using the IFNULL() function.

SELECT t.id, t.start_segment, t.end_segment,
SUM(t.one_lag) OVER(ORDER BY t.id ASC) AS rolling_sum
FROM
(SELECT
id, start_segment, end_segment,
IFNULL(LAG(segment_length, 1) OVER(ORDER BY id ASC), 0) AS one_lag
FROM coords_measure) AS t;

Now as shown in the query results, we have the correct distance measure as a rolling sum, at each coordinate interval.


Recommended Reading: 5 SQL Books That Will Make You A Pro.


Honestly, there absolutely may be other ways to retrieve these query results but this one worked for me for this specific requirement. I learned a great deal about LAG() and rolling sum calculations with Window Functions as I worked on this particular query and I hope you also have.

Thank you for reading this post. Please share it with someone else who would enjoy it as well.




Josh Otwell has a passion to grow as a PHP Developer, SQL expert, and technical blogger/writer.

Disclaimer: The majority of examples in this post, are performed in a personal development/learning workstation environment and should not be considered production quality or ready. Your particular goals and needs may vary. Like always, just because you can do something doesn’t mean you should. My opinions are my own.

More ways I can help

Disclosure: Some of the services and product links in this post are affiliate links. At no additional cost to you, should you make a purchase by clicking through one of them, I will receive a commission.

The Newsletter for PHP and MySQL Developers

Receive a copy of my ebook, “10 MySQL Tips For Everyone”, absolutely free when you subscribe to the OpenLampTech newsletter.


πŸ“° Get your brand, product, or service the attention it needs with affordable classified ad placement in the OpenLampTech newsletter. Thank you for your support!


Hey thanks for commenting! Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.