UNION and UNION ALL queries in PostgreSQL with examples.

This blog post will visit an interesting PostgreSQL operator, UNION. Many times, I myself forget about UNION. As of now, I don’t necessarily have a constant use case for it. But through curiosity, exploration, study, and practice I want to know more about it. Let’s learn about this operator within PostgreSQL.

stack of lumber boards on top of one another

Photo by Alex Jones on Unsplash


Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes.

OS and Database:

  • Xubuntu Linux 16.04.3 LTS (Xenial Xerus)
  • PostgreSQL 10.4

To start, I’ll create a couple of tables with some data from a mock DVD Rental database found here using CREATE TABLE AS (check out this blog post I wrote on CREATE TABLE AS) syntax.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
BEGIN TRANSACTION;

DROP TABLE IF EXISTS cust_tbl;

DROP TABLE IF EXISTS acto_tbl;

CREATE TABLE cust_tbl(c_id, c_first_name, c_last_name) AS
SELECT customer_id, first_name, last_name
FROM customer
LIMIT 10
WITH DATA;

CREATE TABLE acto_tbl (act_id, act_first_name, act_last_name)AS
SELECT actor_id, first_name, last_name
FROM actor
LIMIT 10
WITH DATA;

COMMIT;

Here’s the cust_tbl table description and data:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
dvdrental=> \d cust_tbl;
                        Table "public.cust_tbl"
    Column    |         Type          | Collation | Nullable | Default
--------------+-----------------------+-----------+----------+---------
 c_id         | integer               |           |          |
 c_first_name | character varying(45) |           |          |
 c_last_name  | character varying(45) |           |          |

dvdrental=> SELECT * FROM cust_tbl;
 c_id | c_first_name | c_last_name
------+--------------+-------------
  524 | Jared        | Ely
    1 | Mary         | Smith
    2 | Patricia     | Johnson
    3 | Linda        | Williams
    4 | Barbara      | Jones
    5 | Elizabeth    | Brown
    6 | Jennifer     | Davis
    7 | Maria        | Miller
    8 | Susan        | Wilson
    9 | Margaret     | Moore
(10 rows)

And the acto_tbl structure and data:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
dvdrental=> \d acto_tbl;
                         Table "public.acto_tbl"
     Column     |         Type          | Collation | Nullable | Default
----------------+-----------------------+-----------+----------+---------
 act_id         | integer               |           |          |
 act_first_name | character varying(45) |           |          |
 act_last_name  | character varying(45) |           |          |

dvdrental=> SELECT * FROM acto_tbl;
 act_id | act_first_name | act_last_name
--------+----------------+---------------
      1 | Penelope       | Guiness
      2 | Nick           | Wahlberg
      3 | Ed             | Chase
      4 | Jennifer       | Davis
      5 | Johnny         | Lollobrigida
      6 | Bette          | Nicholson
      7 | Grace          | Mostel
      8 | Matthew        | Johansson
      9 | Joe            | Swank
     10 | Christian      | Gable
(10 rows)

I think of a UNION as essentially stacking one SELECT query results set or row(s) atop another.
The key difference here, as we will see, are rows.
This differs slightly from JOIN‘s, in which multiple columns are displayed side by side.
Not that a UNION cannot or does not display columns side by side as well.
JOIN‘s can return rows (potentially multiple) right?
To gain understanding, let’s apply UNION with 2 SELECT queries that retrieve the first and last names from both cust_tbl and acto_tbl:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
dvdrental=> SELECT c_first_name, c_last_name
dvdrental-> FROM cust_tbl
dvdrental-> UNION
dvdrental-> SELECT act_first_name, act_last_name
dvdrental-> FROM acto_tbl;
 c_first_name | c_last_name  
--------------+--------------
 Maria        | Miller
 Nick         | Wahlberg
 Margaret     | Moore
 Ed           | Chase
 Jared        | Ely
 Penelope     | Guiness
 Patricia     | Johnson
 Jennifer     | Davis
 Grace        | Mostel
 Christian    | Gable
 Elizabeth    | Brown
 Joe          | Swank
 Barbara      | Jones
 Matthew      | Johansson
 Bette        | Nicholson
 Susan        | Wilson
 Johnny       | Lollobrigida
 Mary         | Smith
 Linda        | Williams
(19 rows)

Since both tables were populated with 10 rows of data, a return count of 19 rows doesn’t add up does it?
Let’s see what UNION ALL with the same SELECT queries returns:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
dvdrental=> SELECT c_first_name, c_last_name
dvdrental-> FROM cust_tbl
dvdrental-> UNION ALL
dvdrental-> SELECT act_first_name, act_last_name
dvdrental-> FROM acto_tbl;
 c_first_name | c_last_name  
--------------+--------------
 Jared        | Ely
 Mary         | Smith
 Patricia     | Johnson
 Linda        | Williams
 Barbara      | Jones
 Elizabeth    | Brown
 Jennifer     | Davis
 Maria        | Miller
 Susan        | Wilson
 Margaret     | Moore
 Penelope     | Guiness
 Nick         | Wahlberg
 Ed           | Chase
 Jennifer     | Davis
 Johnny       | Lollobrigida
 Bette        | Nicholson
 Grace        | Mostel
 Matthew      | Johansson
 Joe          | Swank
 Christian    | Gable
(20 rows)

This is an important difference between UNION and UNION ALL.
When combining SELECT results with UNION, no duplicate rows are returned.
However, UNION ALL does return duplicate values.
If these query results were perhaps a mailing list of some sort, then you definitely don’t want to forget anyone.
Therefore, UNION ALL would be the better choice in that scenario.


Let’s go over noteworthy points concerning UNION.
As previously noted, the overall gist of a UNION, combine the returned rows of the query results from each individual SELECT statement.
Also, all UNION queries must have the same number of SELECT columns named in each query.
See with this example:

1
2
3
4
5
6
7
8
dvdrental=> SELECT c_last_name
dvdrental-> FROM cust_tbl
dvdrental-> UNION    
dvdrental-> SELECT act_first_name, act_last_name
dvdrental-> FROM acto_tbl;
ERROR:  each UNION query must have the same number of columns
LINE 4: SELECT act_first_name, act_last_name
               ^

Well, that’s simple enough to remember.
I’ll just add in another column then:

1
2
3
4
5
6
7
8
dvdrental=> SELECT c_id, c_last_name
dvdrental-> FROM cust_tbl
dvdrental-> UNION ALL
dvdrental-> SELECT act_first_name, act_last_name
dvdrental-> FROM acto_tbl;
ERROR:  UNION types integer and character varying cannot be matched
LINE 4: SELECT act_first_name, act_last_name
               ^

Significant point here.
Notice the c_id column is of type integer where as act_first_name is a varchar (or character varying) data type.
Hence, these two dissimilar values can not be matched or compared to one another.
The short of it is, column types named in the SELECT clauses should be comparable within a WHERE clause (e.g., integer = integer, varchar = varchar, etc…).
In the above query, we were trying to compare an integer with a varchar which caused the error.


How complex can UNION queries be?
Can they only combine simple SELECT statements?
In truth, WHERE, GROUP BY, HAVING, and ORDER BY clauses can all be used in the SELECT queries that make up a UNION.
JOIN‘s are welcome to the party as well.
Let’s visit an example that filters with a WHERE clause and the LIKE operator. I’ll use UNION ALL so no duplicate names are omitted from the results.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%');
 act_first_name
----------------
 Jennifer
 Johnny
 Joe
 Jared
 Jennifer
(5 rows)


* Notice with this query result, the column names are derived from the columns of the first SELECT statement (table acto_tbl). In previous queries, the cust_tbl was in the first SELECT statement, therefore naming the columns returned from that table. Just keep this in mind if you need to be conscious of column naming for files, reporting, sharing, etc.


Suppose we need those results in alphabetical order. No biggie. Let’s use an ORDER BY clause:

1
2
3
4
5
6
7
8
9
10
11
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> ORDER BY act_first_name ASC
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%');
ERROR:  syntax error at or near "UNION"
LINE 5: UNION ALL
        ^

Oops…
Let me add ORDER BY clause in the second query as well.

1
2
3
4
5
6
7
8
9
10
11
12
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> ORDER BY act_first_name ASC
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%')
dvdrental-> ORDER BY c_first_name ASC;
ERROR:  syntax error at or near "UNION"
LINE 5: UNION ALL
        ^

Hmmm… That’s not working either.
To order these results, we need to use ORDER BY in the last SELECT query.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%')
dvdrental-> ORDER BY act_first_name ASC;
 act_first_name
----------------
 Jared
 Jennifer
 Jennifer
 Joe
 Johnny
(5 rows)

That worked, but I am curious.
Can you name the ORDER BY column from the second SELECT query instead of the first?
Let’s see:

1
2
3
4
5
6
7
8
9
10
11
12
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%')
dvdrental-> ORDER BY c_first_name ASC;
ERROR:  column "c_first_name" does not exist
LINE 8: ORDER BY c_first_name ASC;
                 ^
HINT:  There is a column named "c_first_name" in table "*SELECT* 2", but it cannot be referenced from this part of the query.

I’m even more curious now.
What about ORDER BY column position number?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
dvdrental=> SELECT act_first_name
dvdrental-> FROM acto_tbl
dvdrental-> WHERE act_first_name LIKE ('J%')
dvdrental-> UNION ALL
dvdrental-> SELECT c_first_name
dvdrental-> FROM cust_tbl
dvdrental-> WHERE c_first_name LIKE ('J%')
dvdrental-> ORDER BY 1 ASC;
 act_first_name
----------------
 Jared
 Jennifer
 Jennifer
 Joe
 Johnny
(5 rows)

That does work.
Very interesting.
And a lesson learned!!!
Woot Woot!!!


I’ll close out this blog post, with a mention to this informative blog posts I read and enjoyed COMMON MISTAKES: UNION VS. UNION ALL.
Also, visit these sections in the documentation for more information:

My hope is through this blog post, I have provided a great example of practical uses for both UNION and UNION ALL.
What do you find unique about UNION?
What are some of the best use cases you have used it for?
Feel free to leave comments below.
Be sure and visit the stellar on-line PostgreSQL 10 Documentation for any related questions.


A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the
Portfolio-Projects page to see blog post/technical writing I have completed for clients.

Have I mentioned how much I love a cup of coffee?!?!

 

To receive email notifications (Never spam) from this blog (“Digital Owl’s Prose”) for the latest blog posts as they are published, please subscribe (of your own volition) by clicking the ‘Click To Subscribe!’ button in the sidebar on the homepage! (Feel free at any time to review the Digital Owl’s Prose Privacy Policy Page for any questions you may have about: email updates, opt-in, opt-out, contact forms, etc…)
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.


Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.

Advertisements

Hey thanks for commenting! Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.