MySQL Triggers for Data Integrity – In lieu of Check Constraints Part 1.

delano-balten-348814

Photo by Delano Balten on Unsplash


Want to automate a process and ensure an operation is carried out for your database tables? Perhaps data integrity checks? Beginning with this blog post, I will present a two-part series leveraging MySQL triggers to ensure data integrity for mock pipeline pipe asset data.

Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.

I will be using Xubuntu Linux 16.04.3 LTS (Xenial Xerus) and MySQL 5.7.20 for these exercises.

Some Background

To start, let’s go over the assumed given for this project. We are aware of what data should be collected and turned in for processing.
Yet, we as humans make mistakes, hence the example TRIGGER will be used as a data integrity CHECK CONSTRAINT of sorts. In truth, all data should be checked and ‘cleared’ prior to storage, however for the purpose of this blog post, I have chosen only 2 values to monitor.
Just know this is not a complete and total solution. Only a general outline.

Goals and Objectives

  1. Introduce the involved tables and data set.
  2. Define and create a TRIGGER to ensure specific data integrity checks.
  3. Import data into the staging table.
  4. Verify the TRIGGER works as expected.
  5. Review any flagged records.

A description for both tables in this example is shown below:

mysql> SHOW CREATE TABLE asset_staging\G
*************************** 1. row ***************************
       Table: asset_staging
Create Table: CREATE TABLE `asset_staging` (
  `pipe_name` varchar(20) DEFAULT NULL,
  `pipe_jt_num` varchar(20) DEFAULT NULL,
  `pipe_heat` varchar(20) DEFAULT NULL,
  `pipe_length` decimal(4,2) DEFAULT NULL,
  `pipe_wall_thickness` decimal(4,3) DEFAULT NULL,
  `degree` decimal(4,2) unsigned DEFAULT NULL,
  `manufacturer` varchar(35) DEFAULT NULL,
  `coating_type` varchar(25) DEFAULT NULL,
  `coating_thickness` varchar(25) DEFAULT NULL,
  `diameter` decimal(3,1) unsigned DEFAULT NULL,
  `pipe_grade` varchar(5) DEFAULT NULL,
  `kind` enum('flange','bend','other','pipe_joint','tee','valve') DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> SHOW CREATE TABLE flagged_asset\G
*************************** 1. row ***************************
       Table: flagged_asset
Create Table: CREATE TABLE `flagged_asset` (
  `f_pipe_name` varchar(20) DEFAULT NULL,
  `f_pipe_jt_num` varchar(20) DEFAULT NULL,
  `f_pipe_heat` varchar(20) DEFAULT NULL,
  `f_pipe_length` decimal(4,2) DEFAULT NULL,
  `f_defect` text,
  `f_message` text,
  `import_date` datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Table asset_staging is an initial stop for collected data as it enters the database. Here the data will be checked with the TRIGGER and cleaned before further processing.
Should any records coming in not meet the requirements, vital information will be stored in the flagged_asset table.

Purpose of the Trigger?

This TRIGGER‘s role is to provide a data integrity check for any newly imported records. Better known as a CHECK CONSTRAINT, which MySQL does not support as of this writing (Boo!!!).

In particular, pipe_length and kind columns are of high interest.

What are we trying to ensure here?

The pipe_length column holds…
Wait for it…
The length of a piece of pipe!!!
Ensuring this value is positive and at least greater than 0 (zero) in length is an excellent integrity check right?

I mean how can a piece of pipe not have a length?
Chances are this won’t be an issue since who in their right mind would collect a negative value for something that obviously has a length?

Make note the kind column is an ENUM data type.

We are expecting collected pipe data will be a type of one of the acceptable values for this column, specified during table creation. Those values are drawn from the below list.
'flange','bend','other','pipe_joint','tee','valve'

But what if it isn’t?
Shouldn’t we know that firsthand before allowing that record into any production table(s)? Correct-a-mundo. That is the second data integrity check.

What will we do with any bad data being imported?

Should either of the two monitored columns have unacceptable data, substantive information for that row will be inserted and stored in the flagged_asset table with a meaningful message.

Times And Events

Triggers act, or fire as it is commonly referred to, at a specific time. Which can be either BEFORE or AFTER an event.

Those events are:

  • INSERT. Applies to LOAD DATA and REPLACE commands as well.
  • UPDATE
  • DELETE. Includes the REPLACE command.

To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Putting Triggers To Work

With the groundwork in place, we have defined the need for this TRIGGER. Lets’ put that into action and create one to accomplish the established requirements.
We can start with this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
DELIMITER $$
CREATE TRIGGER collect_to_stage BEFORE INSERT
ON asset_staging
FOR EACH ROW
BEGIN
  IF NEW.kind = 0 THEN
  SET NEW.kind = 'other';
  INSERT INTO flagged_asset(f_pipe_name, f_pipe_jt_num, f_pipe_heat,
    f_pipe_length, f_defect, f_message, import_date)
  VALUES(NEW.pipe_name, NEW.pipe_jt_num, NEW.pipe_heat, NEW.pipe_length,
    'Unknown Type', CONCAT(NEW.pipe_name, ' was not in the acceptable list.'), NOW());
  END IF;
  IF NEW.pipe_length < = 0 THEN
  INSERT INTO flagged_asset(f_pipe_name, f_pipe_jt_num, f_pipe_heat,
    f_pipe_length, f_defect, f_message, import_date)
  VALUES(NEW.pipe_name, NEW.pipe_jt_num, NEW.pipe_heat, NEW.pipe_length,
    'Questionable Length', CONCAT(NEW.pipe_name, ' needs the length verified for accuracy.'), NOW());
  END IF;
END $$
DELIMITER ;


With the collect_to_stage TRIGGER created, we’ll load collected data into the asset_staging table for testing.

The below images are two columns from the bad_asset_my.csv file containing 13 rows to insert.

showLengths_column
Among the pipe_length column values shown above, are those that should fire off the trigger.
Likewise, unsatisfactory values for the kind column have been captured as seen below.

showKind_column

Bear in mind, a LOAD DATA INFILE statement fires as an INSERT event since it reads rows from a text file into a table.
Let’s proceed, using LOAD DATA INFILE for a bulk upload:

mysql> LOAD DATA LOCAL INFILE "/home/user/user_folder/bad_asset_my.csv" INTO TABLE asset_staging
    -> COLUMNS TERMINATED BY ','
    -> OPTIONALLY ENCLOSED BY '"'
    -> ESCAPED BY '"'
    -> LINES TERMINATED BY '\n'
    -> IGNORE 1 LINES;
Query OK, 13 rows affected, 3 warnings (0.26 sec)
Records: 13  Deleted: 0  Skipped: 0  Warnings: 3

Hmmm…
MySQL reported 3 warnings for that upload. Let’s look at those.

mysql> SHOW warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1265 | Data truncated for column 'kind' at row 2  |
| Warning | 1265 | Data truncated for column 'kind' at row 7  |
| Warning | 1265 | Data truncated for column 'kind' at row 12 |
+---------+------+--------------------------------------------+
3 rows in set (0.00 sec)

Data truncated for column ‘kind’ at…
Isn’t that the rows with the unacceptable kind column values?
MySQL let us know something is not quite right with data imported for the kind column.

Did the trigger work?
We can determine any success, by querying the flagged_asset table. All records not passing the checks are stored there.
For brevity and output’s sake, I will query only 4 columns:

mysql> SELECT f_pipe_name, f_pipe_length, f_defect, f_message
    -> FROM flagged_asset;
+-------------+---------------+---------------------+----------------------------------------------------+
| f_pipe_name | f_pipe_length | f_defect            | f_message                                          |
+-------------+---------------+---------------------+----------------------------------------------------+
| Joint-88    |         16.60 | Unknown Type        | Joint-88 was not in the acceptable list.           |
| Joint-2228  |        -11.50 | Questionable Length | Joint-2228 needs the length verified for accuracy. |
| Joint-7834  |         44.40 | Unknown Type        | Joint-7834 was not in the acceptable list.         |
| Joint-4841  |          0.00 | Questionable Length | Joint-4841 needs the length verified for accuracy. |
| Joint-2128  |         13.55 | Unknown Type        | Joint-2128 was not in the acceptable list.         |
| Joint-222B  |          0.00 | Questionable Length | Joint-222B needs the length verified for accuracy. |
+-------------+---------------+---------------------+----------------------------------------------------+
6 rows in set (0.00 sec)

Sweet. The TRIGGER works as expected, catching the three incorrect pipe_length values and the three kind column values not found in the list.
Both f_defect and f_message columns report what is wrong with each record. A defect of ‘Unknown Type’ is stored for a value not accepted in the kind column, while ‘Questionable Length’ covers those records with a length not meeting the requirement.

Finally, f_message stores a meaningful message concerning the defect, along with the pipe_name for that individual record.

Up Next

This is my initial dive into triggers and their use in this context for data integrity checks. My utmost hope is the content provided here, will assist you in implementing a similar type TRIGGER.

I would love to hear feedback from those of you more experienced with triggers on any tips, tricks, best practices, or gotchas.

It is important to understand, this solution may not be the best possible for your particular needs, only an example of what can be accomplished for data integrity checks with a TRIGGER. I look forward to publishing Part 2, where I delve into the actual trigger and get a gist of the behind-the-scenes workings. Hope to see you there.
Explore the official MySQL 5.7 Online Manual for an in-depth overview of all topics covered in this post.


A Call To Action!

Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well. Have I mentioned how much I love a cup of coffee?!?!

To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.


Josh Otwell

Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.


Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.

2 thoughts on “MySQL Triggers for Data Integrity – In lieu of Check Constraints Part 1.