What’s missing? Find out using OUTER JOIN’s in MySQL with examples.

What’s missing? What matches in one table but not another? MySQL (and SQL in general) have a way to find out. More than likely, when working with a normalized database, you are faced with these types of questions. Querying multiple tables with JOIN‘s on matched values, only goes so far to answering this type of question. We will see how to leverage an OUTER JOIN, for these requests.

Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.

OS and Database:

  • Xubuntu Linux 16.04.3 LTS (Xenial Xerus)
  • MySQL 5.7.22

 

todd-diemer-181900-unsplash.jpg

Photo by Todd Diemer on Unsplash

Here are a couple of questions OUTER JOIN‘s can answer for us:

  • Determine which rows, if any, from one table have no match in another table.
  • Determine missing values with a test for NULL. (Example provided below)

I’ll use two tables that form a mock pipe tracking unit, representing assets with and without degrees that form a ‘bend’.

The structure of the tables is setup based on the has_degree column in table pipe.
The tinyint(1) data type represents MySQL’s version of a boolean value. Anywhere 1 is present this represents ‘truth’, meaning that particular record has a matching row in table degree_value. Hence that row (or pipe asset) has a degree_amount and is a ‘bend’.

Both rows are ‘joined’ and matched on the pipe_id column present in both tables.
Table degree_value provides the number (in degrees) by way of the degree_amount column for that matching row.
I’ll query for data present in both tables, then look at how we can leverage an OUTER JOIN on this data set.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
mysql> SELECT * FROM pipe;
+---------+-------------+-----------+-------+-------------+------------+
| pipe_id | pipe_name   | joint_num | heat  | pipe_length | has_degree |
+---------+-------------+-----------+-------+-------------+------------+
|     181 | Joint-278   | 39393A    | 9111  |       17.40 |          1 |
|     182 | Joint-8819  | 19393Y    | 9011  |       16.60 |          0 |
|     183 | Joint-9844  | 39393V    | 8171  |       10.40 |          0 |
|     184 | Joint-2528  | 34493U    | 9100  |       11.50 |          1 |
|     185 | Joint-889   | 18393z    | 9159  |       13.00 |          0 |
|     186 | Joint-98434 | 19293Q    | 8174  |        9.13 |          0 |
|     187 | Joint-78344 | 17QTT     | 179   |       44.40 |          1 |
|     188 | Joint-171C  | 34493U    | 17122 |        9.45 |          1 |
|     189 | Joint-68444 | 17297Q    | 6114  |       11.34 |          0 |
|     190 | Joint-4841R | 19395Q    | 5144  |       25.55 |          0 |
|     191 | Joint-1224C | 34493U    | 8575B |       15.22 |          1 |
|     192 | Joint-2138  | 34493C    | 91    |       13.55 |          1 |
|     193 | Joint-122B  | 34493U    | 9100B |        7.78 |          1 |
+---------+-------------+-----------+-------+-------------+------------+
13 rows in set (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> SELECT * FROM degree_value;
+---------+---------------+
| pipe_id | degree_amount |
+---------+---------------+
|     181 |         12.50 |
|     184 |         22.50 |
|     187 |         10.00 |
|     188 |          4.00 |
|     191 |         10.25 |
|     192 |          5.75 |
|     193 |         11.75 |
+---------+---------------+
7 rows in set (0.00 sec)

Suppose we are tasked with determining what rows in the pipe table, do not have a matching row in table degree_value?
Well, we could always query with an INNER JOIN (see this blog post for INNER JOIN examples in-depth) and then manually track the pipe_id‘s from table pipe that are not a part of the query results.
Here is that specific INNER JOIN:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> INNER JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     181 | Joint-278   |         12.50 |
|     184 | Joint-2528  |         22.50 |
|     187 | Joint-78344 |         10.00 |
|     188 | Joint-171C  |          4.00 |
|     191 | Joint-1224C |         10.25 |
|     192 | Joint-2138  |          5.75 |
|     193 | Joint-122B  |         11.75 |
+---------+-------------+---------------+
7 rows in set (0.01 sec)

But, there is a better way to do this.
And a prime use case for an OUTER JOIN.

The above INNER JOIN query returns rows that do match.
But we want those rows that do not match.
Those which are not a ‘bend’.

A OUTER JOIN returns rows from table’s involved where there is a match for rows on the specified value, in addition to rows where there is no match.
MySQL provides both a LEFT and RIGHT variant.

In truth, the OUTER keyword is optional as shown with the next 2 queries:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT OUTER JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     191 | Joint-1224C |         10.25 |
|     193 | Joint-122B  |         11.75 |
|     188 | Joint-171C  |          4.00 |
|     192 | Joint-2138  |          5.75 |
|     184 | Joint-2528  |         22.50 |
|     181 | Joint-278   |         12.50 |
|     190 | Joint-4841R |          NULL |
|     189 | Joint-68444 |          NULL |
|     187 | Joint-78344 |         10.00 |
|     182 | Joint-8819  |          NULL |
|     185 | Joint-889   |          NULL |
|     186 | Joint-98434 |          NULL |
|     183 | Joint-9844  |          NULL |
+---------+-------------+---------------+
13 rows in set (0.00 sec)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     191 | Joint-1224C |         10.25 |
|     193 | Joint-122B  |         11.75 |
|     188 | Joint-171C  |          4.00 |
|     192 | Joint-2138  |          5.75 |
|     184 | Joint-2528  |         22.50 |
|     181 | Joint-278   |         12.50 |
|     190 | Joint-4841R |          NULL |
|     189 | Joint-68444 |          NULL |
|     187 | Joint-78344 |         10.00 |
|     182 | Joint-8819  |          NULL |
|     185 | Joint-889   |          NULL |
|     186 | Joint-98434 |          NULL |
|     183 | Joint-9844  |          NULL |
+---------+-------------+---------------+
13 rows in set (0.00 sec)

Identical results with or without the optional OUTER keyword.


* Note: Moving forward, I will omit the OUTER keyword where appropriate.


Replacing ON with a USING clause is also optional syntax:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT JOIN degree_value AS dv
    -> USING(pipe_id);
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     191 | Joint-1224C |         10.25 |
|     193 | Joint-122B  |         11.75 |
|     188 | Joint-171C  |          4.00 |
|     192 | Joint-2138  |          5.75 |
|     184 | Joint-2528  |         22.50 |
|     181 | Joint-278   |         12.50 |
|     190 | Joint-4841R |          NULL |
|     189 | Joint-68444 |          NULL |
|     187 | Joint-78344 |         10.00 |
|     182 | Joint-8819  |          NULL |
|     185 | Joint-889   |          NULL |
|     186 | Joint-98434 |          NULL |
|     183 | Joint-9844  |          NULL |
+---------+-------------+---------------+
13 rows in set (0.00 sec)

The first table named in the LEFT JOIN, pipe, is considered the ‘left’ table and table degree_value is considered the ‘right’.
Through the above queries, wherever the degree_amount column value is NULL, that represents no match on the ‘right’ table (degree_value).

So in this example query, NULL indicates that row (the pipe entity) does not have a degree_amount (no match).

Here is an explanation provided from the official documentation (at the time of this writing) in section 13.2.9.2 Join Syntax.


If there is no matching row for the right table in the ON or USING part in a LEFT JOIN, a row with all
columns set to NULL is used for the right table. You can use this fact to find rows in a table that have no
counterpart in another table


Matter of fact, we can restrict the results set leveraging a WHERE clause, returning only rows with NULL for column dv.degree_amount.

Let’s see:

1
2
3
4
5
6
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id
    -> WHERE dv.degree_amount = NULL;
Empty set (0.00 sec)

Wait a minute.
There are rows that should match that condition.
Ah ha.
Silly me.
I need to wrap NULL in quotes.

1
2
3
4
5
6
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id
    -> WHERE dv.degree_amount = 'NULL';
Empty set, 1 warning (0.00 sec)

No query results.
And a warning.
Let’s look at that too while we are at it.

1
2
3
4
5
6
7
mysql> SHOW warnings;
+---------+------+-------------------------------------------+
| Level   | Code | Message                                   |
+---------+------+-------------------------------------------+
| Warning | 1292 | Truncated incorrect DECIMAL value: 'NULL' |
+---------+------+-------------------------------------------+
1 row in set (0.00 sec)

What’s going on here?


* Handy Tidbit:
I’ll share something with you that plagued me for some time.
NULL is a special value that is not equivalent to anything.
Not 0 (zero).
Not an empty string (e.g., ' ').
Not a space (e.g.,  ).
Not even NULL itself.


How then do you test a column for NULL?
You still test for truth, but with the IS keyword.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> LEFT JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id
    -> WHERE dv.degree_amount IS NULL;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     190 | Joint-4841R |          NULL |
|     189 | Joint-68444 |          NULL |
|     182 | Joint-8819  |          NULL |
|     185 | Joint-889   |          NULL |
|     186 | Joint-98434 |          NULL |
|     183 | Joint-9844  |          NULL |
+---------+-------------+---------------+
6 rows in set (0.03 sec)

Now we are getting somewhere.
These are explicit results that can be further processed, if needed, to answer the question of what pipe_id‘s do not have a degree_amount, hence are not a ‘bend’ entity in this context.

Earlier in the post, I alluded to the differences between the LEFT and RIGHTtables.

So let’s ‘flop’ the original query, naming degree_value as the RIGHT table and see what those query results are:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM pipe AS p
    -> RIGHT JOIN degree_value AS dv
    -> ON p.pipe_id = dv.pipe_id;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     181 | Joint-278   |         12.50 |
|     184 | Joint-2528  |         22.50 |
|     187 | Joint-78344 |         10.00 |
|     188 | Joint-171C  |          4.00 |
|     191 | Joint-1224C |         10.25 |
|     192 | Joint-2138  |          5.75 |
|     193 | Joint-122B  |         11.75 |
+---------+-------------+---------------+
7 rows in set (0.00 sec)

That results set looks familiar, doesn’t it?
Same query results from the above-demonstrated INNER JOIN.
What about naming pipe for the RIGHT JOIN table?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
mysql> SELECT p.pipe_id, p.pipe_name, dv.degree_amount
    -> FROM degree_value AS dv
    -> RIGHT JOIN pipe AS p
    -> ON dv.pipe_id = p.pipe_id;
+---------+-------------+---------------+
| pipe_id | pipe_name   | degree_amount |
+---------+-------------+---------------+
|     191 | Joint-1224C |         10.25 |
|     193 | Joint-122B  |         11.75 |
|     188 | Joint-171C  |          4.00 |
|     192 | Joint-2138  |          5.75 |
|     184 | Joint-2528  |         22.50 |
|     181 | Joint-278   |         12.50 |
|     190 | Joint-4841R |          NULL |
|     189 | Joint-68444 |          NULL |
|     187 | Joint-78344 |         10.00 |
|     182 | Joint-8819  |          NULL |
|     185 | Joint-889   |          NULL |
|     186 | Joint-98434 |          NULL |
|     183 | Joint-9844  |          NULL |
+---------+-------------+---------------+
13 rows in set (0.00 sec)

That yields identically to the original LEFT JOIN query with the same implications for the NULL value in the degree_amount column.

Recommended Reading

There is a wealth of detail and information in the official documentation from the following sections I highly recommend consulting:

I find LEFT JOIN‘s highly effective (and interesting) when exploring unfamiliar data sets.
Plying one to monitor for any NULL‘s on the ‘right’ table provides insight of what is missing or incomplete.
Try them out for yourself to discover how complete or incomplete your data set is.
I would love to know of those use cases where they have enhanced answering your toughest questions, exposed other notable metrics, or surprised you during initial data discovery/exploration phase.
Feel free to leave any comments.
Explore the official MySQL 5.7 Online Manual for more information.

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.

Have I mentioned how much I love a cup of coffee?!?!

To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.


Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.

Advertisements

One thought on “What’s missing? Find out using OUTER JOIN’s in MySQL with examples.

Hey thanks for commenting! Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.