
Photo by Delano Balten on Unsplash
Want to automate a process and ensure an operation is carried out for your database tables? Perhaps data integrity checks? Beginning with this blog post, I will present a two-part series leveraging MySQL triggers to ensure data integrity for mock pipeline pipe asset data.
Note: All data, names or naming found within the database presented in this post, are strictly used for practice, learning, instruction, and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.
I will be using Xubuntu Linux 16.04.3 LTS (Xenial Xerus) and MySQL 5.7.20 for these exercises.
Some Background
To start, let’s go over the assumed given for this project. We are aware of what data should be collected and turned in for processing.
Yet, we as humans make mistakes, hence the example TRIGGER
will be used as a data integrity CHECK CONSTRAINT
of sorts. In truth, all data should be checked and ‘cleared’ prior to storage, however for the purpose of this blog post, I have chosen only 2 values to monitor.
Just know this is not a complete and total solution. Only a general outline.
Goals and Objectives
- Introduce the involved tables and data set.
- Define and create a
TRIGGER
to ensure specific data integrity checks. - Import data into the staging table.
- Verify the
TRIGGER
works as expected. - Review any flagged records.
A description for both tables in this example is shown below:
*************************** 1. row ***************************
Table: asset_staging
Create Table: CREATE TABLE `asset_staging` (
`pipe_name` varchar(20) DEFAULT NULL,
`pipe_jt_num` varchar(20) DEFAULT NULL,
`pipe_heat` varchar(20) DEFAULT NULL,
`pipe_length` decimal(4,2) DEFAULT NULL,
`pipe_wall_thickness` decimal(4,3) DEFAULT NULL,
`degree` decimal(4,2) unsigned DEFAULT NULL,
`manufacturer` varchar(35) DEFAULT NULL,
`coating_type` varchar(25) DEFAULT NULL,
`coating_thickness` varchar(25) DEFAULT NULL,
`diameter` decimal(3,1) unsigned DEFAULT NULL,
`pipe_grade` varchar(5) DEFAULT NULL,
`kind` enum('flange','bend','other','pipe_joint','tee','valve') DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
*************************** 1. row ***************************
Table: flagged_asset
Create Table: CREATE TABLE `flagged_asset` (
`f_pipe_name` varchar(20) DEFAULT NULL,
`f_pipe_jt_num` varchar(20) DEFAULT NULL,
`f_pipe_heat` varchar(20) DEFAULT NULL,
`f_pipe_length` decimal(4,2) DEFAULT NULL,
`f_defect` text,
`f_message` text,
`import_date` datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Table asset_staging
is an initial stop for collected data as it enters the database. Here the data will be checked with the TRIGGER
and cleaned before further processing.
Should any records coming in not meet the requirements, vital information will be stored in the flagged_asset
table.
Purpose of the Trigger?
This TRIGGER
‘s role is to provide a data integrity check for any newly imported records. Better known as a CHECK CONSTRAINT
, which MySQL does not support as of this writing (Boo!!!).
In particular, pipe_length
and kind
columns are of high interest.
What are we trying to ensure here?
The pipe_length
column holds…
Wait for it…
The length of a piece of pipe!!!
Ensuring this value is positive and at least greater than 0 (zero) in length is an excellent integrity check right?
I mean how can a piece of pipe not have a length?
Chances are this won’t be an issue since who in their right mind would collect a negative value for something that obviously has a length?
Make note the kind
column is an ENUM
data type.
We are expecting collected pipe data will be a type of one of the acceptable values for this column, specified during table creation. Those values are drawn from the below list.
'flange','bend','other','pipe_joint','tee','valve'
But what if it isn’t?
Shouldn’t we know that firsthand before allowing that record into any production table(s)? Correct-a-mundo. That is the second data integrity check.
What will we do with any bad data being imported?
Should either of the two monitored columns have unacceptable data, substantive information for that row will be inserted and stored in the flagged_asset
table with a meaningful message.
Times And Events
Triggers act, or fire as it is commonly referred to, at a specific time. Which can be either BEFORE
or AFTER
an event.
Those events are:
INSERT
. Applies toLOAD DATA
andREPLACE
commands as well.UPDATE
DELETE
. Includes theREPLACE
command.
To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.
Putting Triggers To Work
With the groundwork in place, we have defined the need for this TRIGGER
. Lets’ put that into action and create one to accomplish the established requirements.
We can start with this.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | DELIMITER $$ CREATE TRIGGER collect_to_stage BEFORE INSERT ON asset_staging FOR EACH ROW BEGIN IF NEW.kind = 0 THEN SET NEW.kind = 'other'; INSERT INTO flagged_asset(f_pipe_name, f_pipe_jt_num, f_pipe_heat, f_pipe_length, f_defect, f_message, import_date) VALUES(NEW.pipe_name, NEW.pipe_jt_num, NEW.pipe_heat, NEW.pipe_length, 'Unknown Type', CONCAT(NEW.pipe_name, ' was not in the acceptable list.'), NOW()); END IF; IF NEW.pipe_length < = 0 THEN INSERT INTO flagged_asset(f_pipe_name, f_pipe_jt_num, f_pipe_heat, f_pipe_length, f_defect, f_message, import_date) VALUES(NEW.pipe_name, NEW.pipe_jt_num, NEW.pipe_heat, NEW.pipe_length, 'Questionable Length', CONCAT(NEW.pipe_name, ' needs the length verified for accuracy.'), NOW()); END IF; END $$ DELIMITER ; |
With the collect_to_stage
TRIGGER
created, we’ll load collected data into the asset_staging
table for testing.
The below images are two columns from the bad_asset_my.csv
file containing 13 rows to insert.
Among the pipe_length
column values shown above, are those that should fire off the trigger.
Likewise, unsatisfactory values for the kind
column have been captured as seen below.
Bear in mind, a LOAD DATA INFILE
statement fires as an INSERT
event since it reads rows from a text file into a table.
Let’s proceed, using LOAD DATA INFILE
for a bulk upload:
-> COLUMNS TERMINATED BY ','
-> OPTIONALLY ENCLOSED BY '"'
-> ESCAPED BY '"'
-> LINES TERMINATED BY '\n'
-> IGNORE 1 LINES;
Query OK, 13 rows affected, 3 warnings (0.26 sec)
Records: 13 Deleted: 0 Skipped: 0 Warnings: 3
Hmmm…
MySQL reported 3 warnings for that upload. Let’s look at those.
+---------+------+--------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------+
| Warning | 1265 | Data truncated for column 'kind' at row 2 |
| Warning | 1265 | Data truncated for column 'kind' at row 7 |
| Warning | 1265 | Data truncated for column 'kind' at row 12 |
+---------+------+--------------------------------------------+
3 rows in set (0.00 sec)
Data truncated for column ‘kind’ at…
Isn’t that the rows with the unacceptable kind
column values?
MySQL let us know something is not quite right with data imported for the kind
column.
Did the trigger work?
We can determine any success, by querying the flagged_asset
table. All records not passing the checks are stored there.
For brevity and output’s sake, I will query only 4 columns:
-> FROM flagged_asset;
+-------------+---------------+---------------------+----------------------------------------------------+
| f_pipe_name | f_pipe_length | f_defect | f_message |
+-------------+---------------+---------------------+----------------------------------------------------+
| Joint-88 | 16.60 | Unknown Type | Joint-88 was not in the acceptable list. |
| Joint-2228 | -11.50 | Questionable Length | Joint-2228 needs the length verified for accuracy. |
| Joint-7834 | 44.40 | Unknown Type | Joint-7834 was not in the acceptable list. |
| Joint-4841 | 0.00 | Questionable Length | Joint-4841 needs the length verified for accuracy. |
| Joint-2128 | 13.55 | Unknown Type | Joint-2128 was not in the acceptable list. |
| Joint-222B | 0.00 | Questionable Length | Joint-222B needs the length verified for accuracy. |
+-------------+---------------+---------------------+----------------------------------------------------+
6 rows in set (0.00 sec)
Sweet. The TRIGGER
works as expected, catching the three incorrect pipe_length
values and the three kind
column values not found in the list.
Both f_defect
and f_message
columns report what is wrong with each record. A defect of ‘Unknown Type’ is stored for a value not accepted in the kind
column, while ‘Questionable Length’ covers those records with a length not meeting the requirement.
Finally, f_message
stores a meaningful message concerning the defect, along with the pipe_name
for that individual record.
Up Next
This is my initial dive into triggers and their use in this context for data integrity checks. My utmost hope is the content provided here, will assist you in implementing a similar type TRIGGER
.
I would love to hear feedback from those of you more experienced with triggers on any tips, tricks, best practices, or gotchas.
It is important to understand, this solution may not be the best possible for your particular needs, only an example of what can be accomplished for data integrity checks with a TRIGGER
. I look forward to publishing Part 2, where I delve into the actual trigger and get a gist of the behind-the-scenes workings. Hope to see you there.
Explore the official MySQL 5.7 Online Manual for an in-depth overview of all topics covered in this post.
A Call To Action!
Thank you for taking the time to read this post. I truly hope you discovered something interesting and enlightening. Please share your findings here, with someone else you know who would get the same value out of it as well. Have I mentioned how much I love a cup of coffee?!?!
To receive notifications for the latest post from “Digital Owl’s Prose” via email, please subscribe by clicking the ‘Click To Subscribe!’ button in the sidebar!
Be sure and visit the “Best Of” page for a collection of my best blog posts.
Josh Otwell has a passion to study and grow as a SQL Developer and blogger. Other favorite activities find him with his nose buried in a good book, article, or the Linux command line. Among those, he shares a love of tabletop RPG games, reading fantasy novels, and spending time with his wife and two daughters.
Disclaimer: The examples presented in this post are hypothetical ideas of how to achieve similar types of results. They are not the utmost best solution(s). The majority, if not all, of the examples provided are performed on a personal development/learning workstation-environment and should not be considered production quality or ready. Your particular goals and needs may vary. Use those practices that best benefit your needs and goals. Opinions are my own.
2 thoughts on “MySQL Triggers for Data Integrity – In lieu of Check Constraints Part 1.”