Aggregate Functions part 2: SUM() and AVG() in MySQL with examples.

In Aggregate Functions Part 1: COUNT() – With examples in MySQL, we visited examples of the COUNT() aggregate function and explored different uses for its application.
As I continue this series on aggregate functions, this blog post looks at two more: SUM() and AVG(), both of which perform calculations on numeric values.

Note: All data, names, or naming found within the database presented in this post are strictly used for practice, learning, instruction, and testing purposes. They by no means depict actual data belonging to or being used by any party or organization.

OS and DB used:

  • Xubuntu Linux 16.04.3 LTS (Xenial Xerus)
  • MySQL 5.7.21

Photo by Roman Mager on Unsplash

SUM() and AVG()

SUM(some_expression) calculates a total for the expression, while AVG(some_expression) returns the average value.
The DISTINCT keyword is applicable to both functions; with it, only the unique values of the aggregated column or expression are included in the calculation.
If no matching rows are found, both functions return NULL.
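
To make that last point concrete, here is a minimal sketch against the mock employees table used in this series. The JOB_ID value 'NO_SUCH_JOB' is a made-up placeholder that I am assuming matches no rows. COALESCE() is one common way to swap the resulting NULL for a default such as 0.

```sql
-- 'NO_SUCH_JOB' is a hypothetical value assumed to match no rows.
-- With no matching rows, SUM() and AVG() both return NULL.
SELECT SUM(SALARY) AS total_salary,
       AVG(SALARY) AS average_salary
FROM employees
WHERE JOB_ID = 'NO_SUCH_JOB';

-- COALESCE() returns its first non-NULL argument, so the NULL becomes 0.
SELECT COALESCE(SUM(SALARY), 0) AS total_salary
FROM employees
WHERE JOB_ID = 'NO_SUCH_JOB';
```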

Working Examples.

The target table is the same mock HR employees table we used in the previous blog post:

mysql> desc employees;
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| EMPLOYEE_ID    | decimal(6,0) | NO   | PRI | 0       |       |
| FIRST_NAME     | varchar(20)  | YES  |     | NULL    |       |
| LAST_NAME      | varchar(25)  | NO   | MUL | NULL    |       |
| EMAIL          | varchar(25)  | NO   | UNI | NULL    |       |
| PHONE_NUMBER   | varchar(20)  | YES  |     | NULL    |       |
| HIRE_DATE      | date         | NO   |     | NULL    |       |
| JOB_ID         | varchar(10)  | NO   | MUL | NULL    |       |
| SALARY         | decimal(8,2) | YES  |     | NULL    |       |
| COMMISSION_PCT | decimal(2,2) | YES  |     | NULL    |       |
| MANAGER_ID     | decimal(6,0) | YES  | MUL | NULL    |       |
| DEPARTMENT_ID  | decimal(4,0) | YES  | MUL | NULL    |       |
+----------------+--------------+------+-----+---------+-------+
11 rows in set (0.00 sec)

The SALARY column is a great choice to test these functions. Since it is a numeric decimal(8,2) data type, we can use both functions on this column’s values and gain a better understanding of what each accomplishes.

If you want to determine the total sum of all values for that column, which function would you use?
You guessed it, SUM().

mysql> SELECT SUM(SALARY)
    -> FROM employees;
+-------------+
| SUM(SALARY) |
+-------------+
|   691400.00 |
+-------------+
1 row in set (0.00 sec)

What about the average for the SALARY column?
Right again you are (In my Yoda voice). This is where AVG() is useful:

mysql> SELECT AVG(SALARY)
    -> FROM employees;
+-------------+
| AVG(SALARY) |
+-------------+
| 6461.682243 |
+-------------+
1 row in set (0.00 sec)
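
As a quick sanity check, an average is just the total divided by the number of non-NULL values. Going by the results in this post (a SUM() of 691400.00 and the 107 salary values reported by the COUNT(SALARY) query further down), 691400.00 / 107 works out to roughly 6461.68, which agrees with AVG(SALARY) above. A sketch of the same check in SQL, where the column aliases are my own:

```sql
-- manual_avg and builtin_avg are illustrative aliases.
-- COUNT(SALARY) counts only non-NULL salaries, the same values AVG() averages over.
SELECT SUM(SALARY) / COUNT(SALARY) AS manual_avg,
       AVG(SALARY)                 AS builtin_avg
FROM employees;
```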

As you can see from the above examples, use of these functions is relatively straightforward.
You can further restrict the returned result sets by filtering with either the WHERE or HAVING clause as appropriate.
I’ll briefly return to some gotchas covered in the previous blog post concerning filtering with an aggregate function.


* Reminder:

  • Aggregate functions cannot be used in filtering with the WHERE clause.
  • Instead, use HAVING for filtering result sets using aggregate functions.
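
The two clauses can also work together in a single query: WHERE filters individual rows before grouping, and HAVING filters the groups afterward. A minimal sketch, with a hire date cutoff and a salary threshold chosen purely for illustration:

```sql
-- The date and the 30000 threshold are arbitrary illustrative values.
SELECT DEPARTMENT_ID,
       SUM(SALARY) AS total_salary
FROM employees
WHERE HIRE_DATE >= '1995-01-01'      -- row-level filter, applied first
GROUP BY DEPARTMENT_ID
HAVING SUM(SALARY) > 30000;          -- group-level filter, applied after grouping
```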

With that refresher, let’s test these functions with a couple of examples.

To determine the average SALARY for all rows combined for the JOB_ID role of 'IT_PROG':

mysql> SELECT AVG(SALARY)
    -> FROM employees
    -> WHERE JOB_ID = 'IT_PROG';
+-------------+
| AVG(SALARY) |
+-------------+
| 5760.000000 |
+-------------+
1 row in set (0.00 sec)

What is the SALARY amount combined for all employees with the 'IT_PROG' job?
SUM() is a great candidate to help answer this type of question.

mysql> SELECT SUM(SALARY)
    -> FROM employees
    -> WHERE JOB_ID = 'IT_PROG';
+-------------+
| SUM(SALARY) |
+-------------+
|    28800.00 |
+-------------+
1 row in set (0.01 sec)
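
Both results can also come from a single pass over the table. Judging from the two outputs above, 28800.00 / 5760.00 = 5, so there should be five 'IT_PROG' rows with a non-NULL SALARY. Adding COUNT(SALARY) is one way to confirm that. The aliases below are my own:

```sql
-- salary_count, total_salary, and average_salary are illustrative aliases.
SELECT COUNT(SALARY) AS salary_count,
       SUM(SALARY)   AS total_salary,
       AVG(SALARY)   AS average_salary
FROM employees
WHERE JOB_ID = 'IT_PROG';
```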

How about using an aggregate as the filter, versus a JOB_ID?
We can determine which department(s) have a total SALARY greater than 30000 by filtering with a HAVING clause:

mysql> SELECT DEPARTMENT_ID
    -> FROM employees
    -> GROUP BY DEPARTMENT_ID
    -> HAVING SUM(SALARY) > 30000;
+---------------+
| DEPARTMENT_ID |
+---------------+
|            50 |
|            80 |
|            90 |
|           100 |
+---------------+
4 rows in set (0.00 sec)

Remember, we must use GROUP BY since we are naming a non-aggregated column (DEPARTMENT_ID) in the SELECT list. The grouping is also what makes SUM(SALARY) in the HAVING clause a per-department total.
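
If you also want to see the totals driving the filter, the aggregate can go in the SELECT list too. The sketch below relies on MySQL allowing a SELECT list alias (total_salary, a name I picked) in the HAVING and ORDER BY clauses:

```sql
-- total_salary is an illustrative alias; MySQL permits it in HAVING and ORDER BY.
SELECT DEPARTMENT_ID,
       SUM(SALARY) AS total_salary
FROM employees
GROUP BY DEPARTMENT_ID
HAVING total_salary > 30000
ORDER BY total_salary DESC;
```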

Let’s apply DISTINCT to see how it affects SUM() and AVG().
First, using COUNT(), we can ascertain the number of non-NULL SALARY values, along with the number of unique values by leveraging DISTINCT, in the next SELECT queries:

mysql> SELECT COUNT(SALARY)
    -> FROM employees;
+---------------+
| COUNT(SALARY) |
+---------------+
|           107 |
+---------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(DISTINCT SALARY)
    -> FROM employees;
+------------------------+
| COUNT(DISTINCT SALARY) |
+------------------------+
|                     57 |
+------------------------+
1 row in set (0.00 sec)

Finishing up, below is an example of each of the aggregate functions we’ve targeted in this blog post using DISTINCT:

mysql> SELECT SUM(DISTINCT SALARY)
    -> FROM employees;
+----------------------+
| SUM(DISTINCT SALARY) |
+----------------------+
|            397900.00 |
+----------------------+
1 row in set (0.00 sec)

mysql> SELECT AVG(DISTINCT SALARY)
    -> FROM employees;
+----------------------+
| AVG(DISTINCT SALARY) |
+----------------------+
|          6980.701754 |
+----------------------+
1 row in set (0.00 sec)
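
The arithmetic lines up with the earlier COUNT(DISTINCT SALARY) result: 397900.00 spread over 57 unique salaries is 397900.00 / 57, or about 6980.70, which is what AVG(DISTINCT SALARY) reports. A sketch of that check in one query, with aliases of my own choosing:

```sql
-- manual_distinct_avg and builtin_distinct_avg are illustrative aliases.
SELECT SUM(DISTINCT SALARY) / COUNT(DISTINCT SALARY) AS manual_distinct_avg,
       AVG(DISTINCT SALARY)                          AS builtin_distinct_avg
FROM employees;
```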

Before closing this blog post, let’s solidify our understanding of the functionality provided by the HAVING and WHERE clauses. These examples aim to distinguish between the two, clearing up any potential confusion.

The WHERE clause filters individual rows before any aggregation takes place, so it does not perform this type of filtering:

mysql> SELECT EMPLOYEE_ID
    -> FROM employees
    -> WHERE AVG(SALARY) > 25000;
ERROR 1111 (HY000): Invalid use of group function
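
If the goal is to compare each row against an aggregate, one common workaround is to compute the aggregate in a scalar subquery. Note that this sketch answers a slightly different question than the failing query (which employees earn more than the overall average?) and is only one possible approach:

```sql
-- The subquery returns a single value, so WHERE can compare each row against it.
SELECT EMPLOYEE_ID, SALARY
FROM employees
WHERE SALARY > (SELECT AVG(SALARY) FROM employees);
```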

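Likewise, HAVING does not operate in this manner. Here, JOB_ID is neither named in the SELECT list nor a GROUP BY column, so HAVING cannot reference it: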

mysql> SELECT EMPLOYEE_ID
    -> FROM employees
    -> HAVING JOB_ID = 'IT_PROG';
ERROR 1054 (42S22): Unknown column 'JOB_ID' in 'having clause'
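
A row-level condition like this one belongs in the WHERE clause. In the sketch below, the first query is the version I would reach for. The second relies on MySQL’s extension that lets HAVING reference a column named in the SELECT list, and is shown only to illustrate why the query above failed:

```sql
-- Preferred: filter individual rows with WHERE.
SELECT EMPLOYEE_ID
FROM employees
WHERE JOB_ID = 'IT_PROG';

-- Accepted by MySQL because JOB_ID is in the SELECT list, but WHERE is the better fit.
SELECT EMPLOYEE_ID, JOB_ID
FROM employees
HAVING JOB_ID = 'IT_PROG';
```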

Coming Up

So far in this series, I have visited the COUNT(), SUM(), and AVG() aggregate functions. Next up, I will explore two other powerful functions from the aggregate family: MAX() and MIN(). Hope to see you there.

Visit the official MySQL 5.7 On-line Manual for more information.

A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here with someone else you know who would get the same value out of it as well.

Visit the Portfolio-Projects page to see blog post/technical writing I have completed for clients.

Have I mentioned how much I love a cup of coffee?!?!



Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.


Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
