To normalize data, we often store it in several tables within a given database. While querying those tables individually can provide answers to numerous questions around that data, more times than not, it leaves us ‘coming up short’ for the complete answer we need.
That poses the question of, how do you extract all this data from several tables that depend on one another to provide data-driven answers?
One tried and true method is to use a JOIN
operation.
Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.
OS and Database:
- Xubuntu Linux 16.04.3 LTS (Xenial Xerus)
- MySQL 5.7.22
There are multiple types of JOIN
‘s in MySQL (and SQL in general), however the focus for this blog post will be the INNER JOIN
.
This type of JOIN
enables you to link columns from multiple tables on matching values.
I will target these three tables from the mock DVD rental Sakila database.
I need to narrow down a result set to something more manageable in regards to output, brevity, and on-screen display for this blog post.
So to start things off, I’ll jump right into joining on two of those tables.
I’ll center the query around a country
with a city
count of 6.
The below INNER JOIN
query, along with handy use of COUNT()
, GROUP BY
, and HAVING
will give me something to work with.
-> FROM city
-> INNER JOIN country
-> ON city.country_id = country.country_id
-> GROUP BY country
-> HAVING COUNT(city) = 6;
+-------------+----------+
| COUNT(city) | country |
+-------------+----------+
| 6 | Colombia |
| 6 | Egypt |
| 6 | Ukraine |
| 6 | Vietnam |
+-------------+----------+
4 rows in set (0.00 sec)
Let’s look at the INNER JOIN
and what it does.
You will notice that both tables share a sort of ‘common’ column named country_id
.
In table country
, this is the PRIMARY KEY
, and in table city
, it’s a FOREIGN KEY
.
We can use these columns to ‘link’ or JOIN
these tables where their values match.
To do that, we use the ON
clause as part of the JOIN
(See more below).
To note:
- In MySQL the
INNER
keyword is optional for this type ofJOIN
. According to the documentation in section 13.2.9.2 Join Syntax,CROSS JOIN
is interchangeable syntax as well. I have tested (not shown) all three variations:JOIN
,INNER JOIN
, andCROSS JOIN
; All are compatible with the query above. Important: This behavior is applicable to MySQL only and not compliant with standard SQL. - The
ON
clause names aJOIN
condition in how the tables are linked on a matching row. - Without the
HAVING
andGROUP BY
clauses, this query will return all matching rows. To reiterate, I mainly use them here in the query to demonstrate with better screen output for blog posts. Those mentioned clauses are not mandatory or evenJOIN
syntax.
However, it is perfectly fine and often useful to filter with any one or a combination of the available clauses when used in context appropriately.
I’ll also directly provide important information from the official documentation (link in closing section) located in that same 13.2.9.2 JOIN Syntax section as mentioned above.
The conditional_expr used with ON is any conditional expression of the form that can be used in a WHERE clause. Generally, the ON clause serves for conditions that specify how to join tables, and the WHERE clause restricts which rows to include in the result set.
In other words, we use ON
in this query to specify a match condition for rows where the country_id
column value from table city
, is exactly the same as the country_id
column in table country
.
This is a boolean condition that executes in identical fashion as that of a WHERE
clause.
Basically, a test for truth.
If a row meets the specified condition, it is deemed to be true.
I have an idea.
How about we include the country_id
column in the SELECT
list for even more clarity:
-> FROM city
-> INNER JOIN country
-> ON city.country_id = country.country_id
-> GROUP BY country
-> HAVING COUNT(city) = 6;
ERROR 1052 (23000): Column 'country_id' in field list is ambiguous
Well, which country_id
did we mean?
That’s interesting.
MySQL would also like to know.
Keep this in mind when joining and querying from multiple tables that may have columns with identical names.
You have to be specific and provide clarity to the database (MySQL) of which column you are requesting.
You may be thinking column country
didn’t throw this ambiguous error?
That’s correct.
Because that column only exists in the country
table and MySQL knows without a doubt which one we are referencing.
So how do you avoid these ambiguous errors?
You ‘qualify’ the identically named columns.
By prefixing the column names with their respective parent table name, this avoids any confusion and MySQL will then know which column(s) from which table(s) you requested.
I’ll include the country_id
column from the city
table in this query:
-> FROM city
-> INNER JOIN country
-> ON city.country_id = country.country_id
-> GROUP BY country, city.country_id
-> HAVING COUNT(city) = 6;
+-------------+----------+------------+
| COUNT(city) | country | country_id |
+-------------+----------+------------+
| 6 | Colombia | 24 |
| 6 | Egypt | 29 |
| 6 | Ukraine | 100 |
| 6 | Vietnam | 105 |
+-------------+----------+------------+
4 rows in set (0.01 sec)
By specifying the country_id
from table city
, MySQL knows exactly which column from which table we want.
A touch on aliasing…
Aliasing is an alternate syntax for referencing table names.
Instead of using the full table name for the qualifying prefix, you can alias a table name in the FROM
clause:
-> FROM city AS ci
-> INNER JOIN country AS co
-> ON ci.country_id = co.country_id
-> GROUP BY co.country, ci.country_id
-> HAVING COUNT(ci.city) = 6;
+----------------+----------+------------+
| COUNT(ci.city) | country | country_id |
+----------------+----------+------------+
| 6 | Colombia | 24 |
| 6 | Egypt | 29 |
| 6 | Ukraine | 100 |
| 6 | Vietnam | 105 |
+----------------+----------+------------+
4 rows in set (0.00 sec)
We can use table referencing, with one of two syntax structures:
So in this query,
and
are equivalent and perfectly acceptable in this context.
* Note: As mentioned, the AS
keyword is optional in this context. However, including it could enhance readability and clarity. Therefore, I strive to consistently include it in these particular use cases.
Now, anywhere you reference a column from either table city
or country
, use ci
or co
respectively in lieu of the full table name.
There is an alternate clause available for specifying the JOIN
condition.
The USING
clause.
Column(s) named in the USING
clause, must be present in both tables.
Here is an example:
-> FROM city AS ci
-> INNER JOIN country AS co
-> USING(country_id)
-> GROUP BY co.country
-> HAVING COUNT(ci.city) = 6;
+-------------+----------+
| COUNT(city) | country |
+-------------+----------+
| 6 | Colombia |
| 6 | Egypt |
| 6 | Ukraine |
| 6 | Vietnam |
+-------------+----------+
4 rows in set (0.00 sec)
* Note: The parenthesis surrounding country_id
are required in the USING
clause.
While not shown here, USING
can accept multiple comma-separated column names for the JOIN
condition.
What if your results set has to come from more than 2 tables?
Can JOIN
‘s accomplish that?
Sure can.
In the picture provided at the beginning of the post, you see a third address
table.
Let’s add that table into the mix here.
Suppose we need to determine all the city names and addresses available within the database for the country
'Egypt'
.
This can be solved with a multi-table JOIN
.
The below query includes the necessary JOIN
‘s to return that particular result set:
-> FROM city AS ci
-> INNER JOIN address AS a
-> ON ci.city_id = a.city_id
-> INNER JOIN country AS co
-> ON ci.country_id = co.country_id
-> WHERE co.country = 'Egypt';
+------------------+----------------------------------+-------------+---------+
| city | address | postal_code | country |
+------------------+----------------------------------+-------------+---------+
| Bilbays | 1741 Hoshiarpur Boulevard | 22372 | Egypt |
| Idfu | 1736 Cavite Place | 98775 | Egypt |
| Mit Ghamr | 775 ostka Drive | 22358 | Egypt |
| Qalyub | 765 Southampton Drive | 4285 | Egypt |
| Sawhaj | 1727 Matamoros Place | 78813 | Egypt |
| Shubra al-Khayma | 1987 Coacalco de Berriozbal Loop | 96065 | Egypt |
+------------------+----------------------------------+-------------+---------+
6 rows in set (0.01 sec)
So just what is going on with all this joining?
Let’s look at our example query line by line and better understand the operation.
1 2 3 4 5 6 7 | mysql> SELECT ci.city, a.address, a.postal_code, co.country -> FROM city AS ci -> INNER JOIN address AS a -> ON ci.city_id = a.city_id -> INNER JOIN country AS co -> ON ci.country_id = co.country_id -> WHERE co.country = 'Egypt'; |
- Line 1 provides the columns we are requesting via the
SELECT
clause. - Lines 2 and 3: MySQL is combining every row present in the
city
table (named in theFROM
clause) with every row in theaddress
table (named as the table toJOIN
). - Line 4: At this point, the combined results set may not be the desired results you are after. Here is where the
ON
orUSING
clause comes into the picture. By employingON
, this sets up a ‘match’ condition that must be met (true) for each ‘matching/linked’ row for both tables involved in theJOIN
condition. Two mini-pointers are worth a mention here:- This condition should be on ‘like’ data types (e.g., 2 = 2, ‘Programmer’ = ‘Programmer’, ‘2012-10-08’ = ‘2012-10-08’).
- However, they need to also make sense. In other words, just because the strings ‘Egypt’ and ‘Columbia’ are both in fact, character data types; in this context, a match on them does not make sense and likely will not yield correct results. Remember, this is a test for truth. ‘Egypt’ = ‘Egypt’ will return true, while ‘Egypt’ = ‘Columbia’ will not.
- Rinse and repeat as lines 5 and 6 are identical to the operations from lines 2 and 3. All rows from table
city
are joined to all rows in tablecountry
. Then theON
clause specifies that thecountry_id
column value from both tables, must match or be equal. - Line 7: Here, the
WHERE
clause filters rows that must meet specific criteria to make the final result set returned from the database. For this query, thecountry
column from tablecountry
must have the value'Egypt'
on a ‘matched’ row.
Leveraging JOIN
‘s, we can combine linked or related data from multiple tables, returning meaningful metrics and insights.
JOIN
‘s can assist in answering questions like i.e., authors and books they wrote, manufactures and the cars they make, musicians and albums they’ve released, etc…
It should be noted although not covered in this blog post, JOIN
‘s are also applicable to multi-table UPDATE
and DELETE
operations.
Explore the official MySQL 5.7 Online Manual for more information.
A Call To Action!
Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.
Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.
Have I mentioned how much I love a cup of coffee?!?!
To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.
Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.
Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
4 thoughts on “INNER JOIN in MySQL with examples.”