Aggregate Functions Part 1: COUNT() – With examples in MySQL

This Part 1 of a planned 3 blog posts, aims to cover basic usage of common MySQL aggregate functions. COUNT(*) and COUNT(some_expression) are slightly different variations of the same function and we will discover their differences.

Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.

I will be using Xubuntu Linux 16.04.3 LTS (Xenial Xerus) and MySQL 5.7.21 for these exercises.

You counted what?

Below is the description for the mock employees sample table we will be working with:

mysql> desc employees;
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| EMPLOYEE_ID    | decimal(6,0) | NO   | PRI | 0       |       |
| FIRST_NAME     | varchar(20)  | YES  |     | NULL    |       |
| LAST_NAME      | varchar(25)  | NO   | MUL | NULL    |       |
| EMAIL          | varchar(25)  | NO   | UNI | NULL    |       |
| PHONE_NUMBER   | varchar(20)  | YES  |     | NULL    |       |
| HIRE_DATE      | date         | NO   |     | NULL    |       |
| JOB_ID         | varchar(10)  | NO   | MUL | NULL    |       |
| SALARY         | decimal(8,2) | YES  |     | NULL    |       |
| COMMISSION_PCT | decimal(2,2) | YES  |     | NULL    |       |
| MANAGER_ID     | decimal(6,0) | YES  | MUL | NULL    |       |
| DEPARTMENT_ID  | decimal(4,0) | YES  | MUL | NULL    |       |
+----------------+--------------+------+-----+---------+-------+
11 rows in set (0.00 sec)

How many total rows are in this table?
We can easily determine this number, using the COUNT(*) aggregate function:

mysql> SELECT COUNT(*) FROM employees;
+----------+
| COUNT(*) |
+----------+
|      107 |
+----------+
1 row in set (0.00 sec)

*Note: A word on COUNT()
COUNT(*) returns the number of all input rows, including any duplicate and/or NULL values.
COUNT(some_expression) returns the number of rows, not counting NULL values.


Let’s query the table for a total number for the JOB_ID column:

mysql> SELECT COUNT(JOB_ID) FROM employees;
+---------------+
| COUNT(JOB_ID) |
+---------------+
|           107 |
+---------------+
1 row in set (0.00 sec)

107 makes sense for this column. All employees in the table should have a JOB_ID right?
COUNT(some_expression) will return the number of rows for some_expression including duplicates.

How do you use COUNT() to determine how many particular values there are for the JOB_ID column?

This query will do it:

mysql> SELECT COUNT(DISTINCT JOB_ID) FROM employees;
+------------------------+
| COUNT(DISTINCT JOB_ID) |
+------------------------+
|                     19 |
+------------------------+
1 row in set (0.00 sec)

The DISTINCT keyword eliminates duplicates and returns only unique values.
That being said, there are 19 unique JOB_ID‘s in this table and here they are:

mysql> SELECT DISTINCT JOB_ID
    -> FROM employees;
+------------+
| JOB_ID     |
+------------+
| AC_ACCOUNT |
| AC_MGR     |
| AD_ASST    |
| AD_PRES    |
| AD_VP      |
| FI_ACCOUNT |
| FI_MGR     |
| HR_REP     |
| IT_PROG    |
| MK_MAN     |
| MK_REP     |
| PR_REP     |
| PU_CLERK   |
| PU_MAN     |
| SA_MAN     |
| SA_REP     |
| SH_CLERK   |
| ST_CLERK   |
| ST_MAN     |
+------------+
19 rows in set (0.00 sec)

To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Filtering with COUNT()

We can filter the number of rows COUNT() returns, with a WHERE clause (sometimes).

Let’s see how many rows are present for the'AC_ACCOUNT' JOB_ID:

mysql> SELECT COUNT(*)
    -> FROM employees
    -> WHERE JOB_ID = 'AC_ACCOUNT';
+----------+
| COUNT(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

Only one there.
How about the 'IT_PROG' position?

mysql> SELECT COUNT(*)
    -> FROM employees
    -> WHERE JOB_ID = 'IT_PROG';
+----------+
| COUNT(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

With this result, we see IT_PROG has a total of 5 rows in the table.
I have a great idea. Let’s determine how many JOB_ID departments (the column value), have exactly 5 rows.

mysql> SELECT JOB_ID
    -> FROM employees
    -> WHERE COUNT(*) = 5;
ERROR 1111 (HY000): Invalid use of group function

Turns out, you cannot use WHERE when filtering with an aggregate function.
Instead, use the HAVING clause.

mysql> SELECT JOB_ID
    -> FROM employees
    -> HAVING COUNT(*) = 5;
ERROR 1140 (42000): In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'hr.employees.JOB_ID'; this is incompatible with sql_mode=only_full_group_by

Uh-oh…
Another error…
To successfully execute this query, we need to use the GROUP BY clause.

We’ll get this query working, then further discuss both HAVING and GROUP BY.

mysql> SELECT JOB_ID
    -> FROM employees
    -> GROUP BY JOB_ID
    -> HAVING COUNT(*) = 5;
+------------+
| JOB_ID     |
+------------+
| FI_ACCOUNT |
| IT_PROG    |
| PU_CLERK   |
| SA_MAN     |
| ST_MAN     |
+------------+
5 rows in set (0.00 sec)

We are returned the specific JOB_ID‘s that have 5 rows.

HAVING and GROUP BY

To understand why we need a GROUP BY clause, I’ll provide samples taken directly via the official documentation in Section 12.19.3, “MySQL Handling of GROUP BY” (as of the time of this writing).

These are applicable snippets only. I am omitting some provided code examples for brevity and output’s sake. For full understanding, I recommend reading the entire section.



SQL92 and earlier does not permit queries for which the select list, HAVING condition, or ORDER BY list
refer to nonaggregated columns that are not named in the GROUP BY clause.

Some additional info…


SQL99 and later permits such nonaggregates per optional feature T301 if they are functionally dependent
on GROUP BY columns:

And a bit more…


MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL
mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition,
or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor
are functionally dependent on them. (Before 5.7.5, MySQL does not detect functional dependency and
ONLY_FULL_GROUP_BY is not enabled by default.


Let’s look at this passage once more,

MySQL rejects queries for which the select list, HAVING condition,
or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor
are functionally dependent on them.

This all boils down to naming a non-aggregated column (JOB_ID) in the SELECT list and it, the named SELECT list column, is not present in a GROUP BY clause.


*Note: Visit these two links for more great information regarding MySQL and GROUP BY.


We also must use the HAVING clause because we are filtering with an aggregate function, COUNT(). The WHERE clause cannot refer to aggregates, however HAVING can refer to aggregates.

More Aggregating…

To this point, we have used the COUNT() aggregate function, obtaining results for the number of input rows in a table. Leveraging both WHERE and HAVING clauses, allows filtering result sets appropriately with various uses of COUNT().
In the next blog post, I will visit two additional aggregate functions, SUM() and AVG().
GROUP BY, WHERE, and HAVING all will continue along in this series as we progress since they too, are integral components for filtering result sets with aggregate functions.
Hope to see you there!

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.

Have I mentioned how much I love a cup of coffee?!?!

To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.


Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.