RECURSIVE CTE Fun with SUBSTRING() in MySQL

Although I primarily work with MySQL/MariaDB these days on back-end web development projects, I am studying Oracle SQL with the end goal to become certified. I recently watched a great YouTube video and decided to try my hand porting over the Oracle SQL queries to comparable MySQL for a learning exercise. This blog post details the queries I came up with for the end results. Continue reading for some fun recursive CTE’s using the SUBSTRING() character function and interesting output display variations.


Self-Promotion:

If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like!


Credit Resource and Inspiration

In the below video, the author provides several examples of displaying a specific word in different output formats using Oracle SQL’s CONNECT BY clause and ROWNUM pseudo-column. Siva Academy is a fantastic channel about Oracle SQL. The author is truly knowledgeable and always has interesting examples to demonstrate. I highly recommend checking out this channel if you’re interested in Oracle SQL. I was inspired to write this blog post because of this video and want to give accolades to the poster.

Inspirational video by Siva Academy.

Recursive CTE, user-defined session variable, the SUBSTRING() and LENGTH() functions

For easier and more convenient handling, I’ll use a user-defined session variable to represent the word ‘CHANGES’ shown here with the SET command:

SET @changes = 'CHANGES';

At the time of writing, MySQL does not support a CONNECT BY clause or the ROWNUM pseudo-column found in Oracle SQL. Therefore, to replicate the functionality of both constructs, I implemented some workarounds.

Fortunately, MySQL 8 introduced the WITH clause (also referred to as a Common Table Expression). More so, the RECURSIVE WITH clause plays a critical role in mimicking the ROWNUM pseudo-column, in this specific context, due to its recursive nature. Therefore, if you plan on following along with the example queries, you need MySQL 8 or higher.

For the duration of this post and subsequent queries, assume the recursive CTE, rownum_generator, (shown below) is in scope:

WITH RECURSIVE rownum_generator AS (
SELECT 1 AS rownum
UNION ALL
SELECT rownum + 1
FROM rownum_generator
WHERE rownum < LENGTH(@changes)
)

Retrieving all rows from the rownum_generator CTE returns an incrementing set of integer values ending with 7, which is the length of the target string ‘CHANGES’:

Recursive CTE with SUBSTRING
Query results from the rownum_generator CTE.

I’ll make use of 2 more MySQL functions throughout the post: SUBSTRING() and LENGTH(). SUBSTRING() accepts up to 3 parameters; 2 required and 1 optional. The 2 required parameters are target_string and start_position. The lone optional parameter is the length value.

SUBSTRING() returns a portion of a string depending on the supporting parameters (or lack thereof). How SUBSTRING() is manipulated with the recursive CTE dictates the unique output displays.

Related: Read the post, Oracle SQL SUBSTR() Character function – Medium cross-post, for more information on Oracle’s similar character function.

Recursive CTE and SUBSTRING(): 1 character per row for each character

Reviewing this first query, the objective is to return 1 character per row for all characters in the word ‘CHANGES’.

SELECT
SUBSTRING(@changes, rownum, 1) AS print_down
FROM rownum_generator;
Mysql-workbench-recursive-cte
Output display of 1 character per row.

The start_position parameter is dynamic based on the rownum_generator CTE column, rownum. Therefore, each starting position changes. Also, the length parameter specifies only 1 character to be extracted for this display variation. A more visible example can be realized by returning the rownum value at each rows’ interval in the query results:

SELECT
SUBSTRING(@changes, rownum, 1) AS print_down, rownum
FROM rownum_generator;
Mysql-substring-character-function
Dynamic rownum value for each row.

Recursive CTE and SUBSTRING(): 1 character per row for each character starting at the end of the string

I was thrown for a loop with this next example, until I again returned the moving rownum value alongside the SUBSTRING() function call in the query results. Here I query 1 letter from the target string per row, but starting extraction at the end of the ‘CHANGES’ string due to the negative rownum number (shown in the follow-up query):

SELECT
SUBSTRING(@changes, rownum * - 1, 1) AS print_up
FROM rownum_generator;
One character per row, from the end of the target string.
SELECT
SUBSTRING(@changes, rownum * - 1, 1) AS print_up, rownum * - 1
FROM rownum_generator;
Negative start_position values for SUBSTRING()

Recursive CTE and SUBSTRING(): Dynamic number of characters per row, starting at the end of the string

The below output display is very similar to the previous one with the exception that where only 1 character was extracted from the target_string, now I extract a dynamic number of characters based on whatever the value of rownum is for that particular row:

SELECT
SUBSTRING(@changes, rownum * - 1, rownum) AS last_to_entire
FROM rownum_generator;
Dynamic start_position and length parameters for SUBSTRING().

Again, setting the start_position parameter to the rownum * – 1 expression value begins extraction at the end of the target_string because it is negative. But, instead of extracting only 1 character, the number of characters is dictated by the rownum column value. This works out to be the same number needed to extract the remaining portion of the ‘CHANGES’ string.


I write a weekly email about SQL/PHP that I’m studying, learning about, and interested in. If that sounds like something you would like to be a part of, find out more about it here. Thank you!


Recursive CTE and SUBSTRING(): Dynamic number of characters per row, starting with the 1st character

The most straightforward example of all, in my opinion, is shown in the follwoing query results. I simply set the start_position parameter to 1 and the length parameter to the rownum column value.

SELECT
SUBSTRING(@changes, 1, rownum) AS first_to_entire
FROM rownum_generator;
SUBSTRING() results with start_position of 1, and length parameter of the dynamic rownum value.

For example, in reviewing the last row, the entire word ‘CHANGES’ is returned in the query results. Why?

Having a start_position of 1, and a rownum column value of 7 for the length parameter value (at this point in the execution), all 7 characters of the text are returned by SUBSTRING(). Simple enough right?

Recursive CTE and SUBSTRING(): All characters to the last character

In those instances where a length parameter is not provided to SUBSTRING(), the function returns the remaining number of characters found in the target_string from the start_position and that’s exactly how this next example query works using the rownum column value to set where extraction should begin:

SELECT
SUBSTRING(@changes, rownum) AS entire_to_last
FROM rownum_generator;

Recursive CTE and SUBSTRING(): Substitute missing characters with a space

I must be frank here. The final 2 examples gave me the most grief out of all the queries so far. Without the use of a CASE expression, I’m not sure I would’ve been able to solve these last 2 riddles, allthough the YouTube author did not use CASE in the demonstrated examples.

Related: Check out the post, Dynamic RPAD() Function using a CASE Expression – Oracle SQL Example, for another example of a dynamic CASE expression I borrowed heavily from :).

Similar: Unfamiliar with the CASE expression? Not to worry. I wrote 2 posts that have you covered. See, MySQL Simple CASE Expression – with examples, and MySQL Searched CASE Expression – with examples to learn more.

Below are the query results:

SELECT
LPAD(SUBSTRING(@changes, rownum),
CASE WHEN LENGTH(SUBSTRING(@changes, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, rownum)) + LENGTH(SUBSTRING(@changes, rownum)) ELSE LENGTH(@changes) END
, ' ') AS entire_diag_to_last
FROM rownum_generator;
Padding SUBSTRING() return value with a single space character.

With the overall goal to pad the SUBSTRING(@changes, 1, rownum) expression with a number of space characters equal to and not-to-exceed a length of 7, (the length of the ‘CHANGES’ string), I use the dynamic CASE expression for the 2nd LPAD() parameter. Which is the final length of the string to be padded once all padding is applied.

Let’s execute the query and display the CASE expression value alongside the LPAD() function output for better clarity:

SELECT
LPAD(SUBSTRING(@changes, rownum),
CASE WHEN LENGTH(SUBSTRING(@changes, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, rownum)) + LENGTH(SUBSTRING(@changes, rownum)) ELSE LENGTH(@changes) END
, ' ') AS entire_diag_to_last,
CASE WHEN LENGTH(SUBSTRING(@changes, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, rownum)) + LENGTH(SUBSTRING(@changes, rownum)) ELSE LENGTH(@changes) END AS cse_lt
FROM rownum_generator;
Query results for dynamic CASE expressions as 2nd parameter for LPAD().

Notice the cse_lt column value is a consistent 7. This is the desired length of the string after any padding is applied by LPAD(). In this manner, the correct number of space characters are padded to the return value of SUSTRING() dependent upon what portion is extracted based on the rownum column value.

Coming to the final query example, it is more of the same as the previous example with the exception that a 1 is used for the SUBSTRING() start parameter and we extract a number of characters equal to the rownum value length parameter.

SELECT
LPAD(SUBSTRING(@changes, 1, rownum),
CASE WHEN LENGTH(SUBSTRING(@changes, 1, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, 1, rownum)) + LENGTH(SUBSTRING(@changes, 1, rownum)) ELSE LENGTH(@changes) END
, ' ') AS first_to_entire_diag
FROM rownum_generator;

Consider making a donation as I continue to provide valuable content here on my blog. Thank you so much!!! Every bit is much appreciated and helps tremendously!


Recursive CTE and SUBSTRING(): All output displays

I have provided all solutions below in a single query:

SELECT
SUBSTRING(@changes, rownum, 1) AS print_down,
SUBSTRING(@changes, rownum * - 1, 1) AS print_up,
SUBSTRING(@changes, rownum * - 1, rownum) AS last_to_entire,
SUBSTRING(@changes, 1, rownum) AS first_to_entire,
SUBSTRING(@changes, rownum) AS entire_to_last,
LPAD(SUBSTRING(@changes, rownum),
CASE WHEN LENGTH(SUBSTRING(@changes, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, rownum)) + LENGTH(@changes) ELSE LENGTH(@changes) END
, ' ') AS entire_diag_to_last,
LPAD(SUBSTRING(@changes, 1, rownum),
CASE WHEN LENGTH(SUBSTRING(@changes, 1, rownum)) < LENGTH(@changes) THEN LENGTH(@changes) - LENGTH(SUBSTRING(@changes, 1, rownum)) + LENGTH(SUBSTRING(@changes, 1, rownum)) ELSE LENGTH(@changes) END
, ' ') AS first_to_entire_diag
FROM rownum_generator;

I hope you enjoyed this post. I know I had a load of fun learning the different patterns and writing about them. Try some of your own and let me know about them in the comments. Thank you for reading

If you have any questions or see any mistakes in the code, please provide a comment in the comments section. In doing so, I can continue to provide valuable and accurate blog posts.


Like what you have read? See anything incorrect? Please comment below and thank you for reading!!!

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.



To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…)

Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.

Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.

Hey thanks for commenting! Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.