ROW_NUMBER() Window Function – find duplicate values.

Many times, we do not want duplicate rows or values in our SQL tables. On the other hand, in some situations, it does not matter if there are duplicates present. For whatever reason, suppose duplicates have found their way into one of your tables. How can you find them quickly and easily? The ROW_NUMBER() Window function is a fantastic tool to use. Continue reading and see example queries you can apply to your own tables and find those duplicates…

pile of identical plug like circuit chips blue in color
Photo by Joe Green on Unsplash

OS and DB used:

  • Linux Mint 20 “Ulyana”
  • MySQL 8.0.22


Self-Promotion:

If you enjoy the content written here, by all means, share this blog and your favorite post(s) with others who may benefit from or like it as well. Since coffee is my favorite drink, you can even buy me one if you would like!


Visit this fantastic YouTube video where I learned a great deal for this blog post.


For the examples in this post, we have this arbitrary table with some test data:

mysql-sql [learning]> SELECT * FROM some_names;
+----+------------+
| id | first_name |
+----+------------+
|  1 | Max        |
|  2 | Mary       |
|  3 | Charlie    |
|  4 | Humpty     |
|  5 | Roger      |
|  6 | Jim        |
|  7 | Jupyter    |
|  8 | Max        |
|  9 | Mary       |
| 10 | Charlie    |
| 11 | Humpty     |
| 12 | Roger      |
| 13 | Jim        |
| 14 | Jupyter    |
| 15 | Jim        |
| 16 | Jim        |
| 17 | Max        |
| 18 | Max        |
| 19 | Mary       |
+----+------------+
19 rows in set (0.0011 sec)

You can likely pick out the few duplicates that are present since this is a rather small result set. However, with a larger result set, not so easy. Let’s put ROW_NUMBER() to work in finding the duplicates.

But first, let’s visit the online window functions documentation on ROW_NUMBER() and see the syntax and description:

ROW_NUMBER() OVER()

“Returns the number of the current row within its partition. Rows numbers range from 1 to the number of partition rows. ORDER BY affects the order in which rows are numbered. Without ORDER BY, row numbering is nondeterministic.”


ROW_NUMBER() Window Function with empty OVER() clause in MySQL

To start, let’s query the data with ROW_NUMBER() and no arguments in the OVER() clause:

mysql-sql [learning]> SELECT id, first_name, ROW_NUMBER() OVER()
                   -> FROM some_names;
+----+------------+---------------------+
| id | first_name | ROW_NUMBER() OVER() |
+----+------------+---------------------+
|  1 | Max        |                   1 |
|  2 | Mary       |                   2 |
|  3 | Charlie    |                   3 |
|  4 | Humpty     |                   4 |
|  5 | Roger      |                   5 |
|  6 | Jim        |                   6 |
|  7 | Jupyter    |                   7 |
|  8 | Max        |                   8 |
|  9 | Mary       |                   9 |
| 10 | Charlie    |                  10 |
| 11 | Humpty     |                  11 |
| 12 | Roger      |                  12 |
| 13 | Jim        |                  13 |
| 14 | Jupyter    |                  14 |
| 15 | Jim        |                  15 |
| 16 | Jim        |                  16 |
| 17 | Max        |                  17 |
| 18 | Max        |                  18 |
| 19 | Mary       |                  19 |
+----+------------+---------------------+
19 rows in set (0.0013 sec)

Based on the query results above, we can see that ROW_NUMBER() provides an incrementing set of numbers in order, starting at 1 and ending with a number equal to that of the total number of rows.

ROW_NUMBER() Window Function with PARTITION BY clause in MySQL

How can you isolate duplicate rows using ROW_NUMBER()?

The optional PARTITION BY clause of the OVER clause will create sub-groups of rows for its specified value. For the purposes of this blog post, we will assume that we want to target the ‘first_name’ column and find any duplicates for that column.

In a nutshell, PARTITION BY first_name will subdivide the ‘first_name’ column into separate groups of identical values. An example query provides a better, visual understanding:

mysql-sql [learning]> SELECT id, first_name,
                   -> ROW_NUMBER() OVER(PARTITION BY first_name) AS r_num
                   -> FROM some_names;
+----+------------+-------+
| id | first_name | r_num |
+----+------------+-------+
|  3 | Charlie    |     1 |
| 10 | Charlie    |     2 |
|  4 | Humpty     |     1 |
| 11 | Humpty     |     2 |
|  6 | Jim        |     1 |
| 13 | Jim        |     2 |
| 15 | Jim        |     3 |
| 16 | Jim        |     4 |
|  7 | Jupyter    |     1 |
| 14 | Jupyter    |     2 |
|  2 | Mary       |     1 |
|  9 | Mary       |     2 |
| 19 | Mary       |     3 |
|  1 | Max        |     1 |
|  8 | Max        |     2 |
| 17 | Max        |     3 |
| 18 | Max        |     4 |
|  5 | Roger      |     1 |
| 12 | Roger      |     2 |
+----+------------+-------+
19 rows in set (0.0012 sec)

Note that the ‘first_name’ column is now grouped by identical values. Also, each time the ‘first_name’ sub-group changes to a different value, ROW_NUMBER() restarts counting at 1 again and continues in consecutive order until another change occurs within the ‘first_name’ sub-groups.

The OVER() clause also accepts an optional ORDER BY clause that will sort the rows according to the specified value. Here I use the ‘id’ column in ASC order. The ‘id’ column is the table’s PRIMARY KEY so this value is UNIQUE and a solid candidate column to sort by:

mysql-sql [learning]> SELECT id, first_name,
                   -> ROW_NUMBER() OVER(PARTITION BY first_name ORDER BY id ASC) AS r_num
                   -> FROM some_names;
+----+------------+-------+
| id | first_name | r_num |
+----+------------+-------+
|  3 | Charlie    |     1 |
| 10 | Charlie    |     2 |
|  4 | Humpty     |     1 |
| 11 | Humpty     |     2 |
|  6 | Jim        |     1 |
| 13 | Jim        |     2 |
| 15 | Jim        |     3 |
| 16 | Jim        |     4 |
|  7 | Jupyter    |     1 |
| 14 | Jupyter    |     2 |
|  2 | Mary       |     1 |
|  9 | Mary       |     2 |
| 19 | Mary       |     3 |
|  1 | Max        |     1 |
|  8 | Max        |     2 |
| 17 | Max        |     3 |
| 18 | Max        |     4 |
|  5 | Roger      |     1 |
| 12 | Roger      |     2 |
+----+------------+-------+
19 rows in set (0.0177 sec)

ROW_NUMBER() Window function with Derived Table in MySQL

All that remains to do is filter the query results based on the ROW_NUMBER() value, targeting any value greater than 1 for each sub-group. Any records that match that criteria are duplicates.

Simple enough, right? Just use the WHERE clause:

mysql-sql [learning]> SELECT id, first_name
                   -> FROM some_names
                   -> WHERE ROW_NUMBER() OVER(PARTITION BY first_name ORDER BY id ASC) > 1;
ERROR: 3593: You cannot use the window function 'row_number' in this context.

Uh-oh. Turns out, you cannot use a window function like this in the WHERE clause predicate.

Now what?

If you wrap the ROW_NUMBER() query in a derived table, then you can perform the filtering in the WHERE clause:

mysql-sql [learning]> SELECT t.id, t.first_name, t.r_num
                   -> FROM
                   -> (
                   -> SELECT id, first_name, ROW_NUMBER() OVER(PARTITION BY first_name ORDER BY id ASC) AS r_num
                   -> FROM some_names
                   -> ) AS t
                   -> WHERE t.r_num > 1
                   -> ORDER BY t.first_name ASC;
+----+------------+-------+
| id | first_name | r_num |
+----+------------+-------+
| 10 | Charlie    |     2 |
| 11 | Humpty     |     2 |
| 13 | Jim        |     2 |
| 15 | Jim        |     3 |
| 16 | Jim        |     4 |
| 14 | Jupyter    |     2 |
|  9 | Mary       |     2 |
| 19 | Mary       |     3 |
|  8 | Max        |     2 |
| 17 | Max        |     3 |
| 18 | Max        |     4 |
| 12 | Roger      |     2 |
+----+------------+-------+
12 rows in set (0.0011 sec)

And there are the duplicates, easily found using the ROW_NUMBER() Window Function. Try it yourself next time you need to locate any duplicates in your tables.

Like what you have read? See anything incorrect? Please comment below and thank you for reading!!!

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.



To receive email notifications (Never Spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the β€˜Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…)

Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.

Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided, are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.

7 thoughts on “ROW_NUMBER() Window Function – find duplicate values.

  1. Instead of using a subquery, it would probably be more efficient to use a CTE β€œWITH” in order to be able to WHERE by the ROW_NUMBER, no?

    • Hi Jan. Yes a CTE would probably be more efficient I would think also. Thanks for the comment.

Hey thanks for commenting! Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.