PostgreSQL’s Aggregate functions can be used to solve and answer many common questions and problems, allowing us to present our data in meaningful ways. Today we will look at an example of how to use some of these powerful functions for inventory or asset tracking needs.
Note: All data, names or naming found within the database presented in this post, are strictly used for practice, instruction and testing purposes. It by no means depicts actual data belonging to or being used by any party or organization.
I will be using Xubuntu Linux 16.04.2 LTS (Xenial Xerus) and PostgreSQL 9.6.2 for these exercises.
The example mock data found in our PostgreSQL database will depict typical pipeline pipe tally data.
First, let’s see how many records we have in our pipe_tally2 table with the COUNT aggregate function:
When running this command:
SELECT COUNT(*) FROM pipe_tally2;
on the pipe_tally2 table, we are returned the total number of input rows found in the table. The * means ‘all’ rows here. We can see we have 25 records in this table. (This is very handy when we do not know how much data or how many rows we are working with for an unfamiliar table.)
If we wanted to get an idea of what data is present in our table and the structure of it with the name of the columns, some sample data, and an overall feel, but only returning a portion of the rows, we can use the LIMIT keyword in our query like so:
SELECT * FROM pipe_tally2 LIMIT 5;
By specifying the number 5 after the LIMIT keyword, we are saying we want no more than 5 rows returned (it is possible for fewer rows to be returned if the query itself yields fewer rows).
Note: Quoting the online PostgreSQL documentation: “it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise, you will get an unpredictable subset of the query’s rows.” (However, for the purposes of this example, we are using LIMIT to get an idea of what is contained in our table, and what format or structure is present, not necessarily any certain order.)
Perhaps the wall_thick column is of interest to us. Do all the records in this table have a wall_thick value of 0.444? Let’s see with this query:
SELECT DISTINCT wall_thick AS wall_thickness FROM pipe_tally2;
By using SELECT DISTINCT in our query on the wall_thick column (I’ve given the wall_thick column a temporary name here, wall_thickness, with the AS keyword), we are saying we do not want any duplicate rows returned for the wall_thick column. DISTINCT does just that by returning three distinct values from the wall_thick column: 0.543, 0.444 and 0.283.
I wonder how many records for each wall thickness value, do we have present in this table of 25 records?
Again we can use one of PostgreSQL’s most useful aggregate functions to answer this for us. Using the COUNT function in the following query gives us some interesting results:
SELECT DISTINCT wall_thick AS wall_thickness, COUNT(*) AS amount_on_site FROM pipe_tally2 GROUP BY amount_on_site;
By using the COUNT(*) aggregate function here in our query, we are saying we want all of the records returned for each distinct value in the wall_thickness column. I’ve temporarily named this column amount_on_site (perhaps this tally was derived from a facility or site) and we can see here we have 14 records for wall_thickness 0.543, 6 records for wall_thickness 0.444 and 5 records for wall_thickness 0.283.
We are also introduced to the GROUP BY clause in this query. The GROUP BY clause combines each set of rows having common values into one group row that represents all rows in the group. Basically GROUP BY takes all selected rows that share the same values for the grouped expressions and places them in a single row.
Now that we have extracted some useful data from our table, suppose that the question is posed, “How much footage do we have in 0.444 wall thickness?”
PostgreSQL’s SUM aggregate function will take care of that for us.
SELECT DISTINCT wall_thick AS wall_thickness, COUNT(*) AS amount_on_site, SUM(lt) AS approx_footage FROM pipe_tally2 GROUP BY wall_thick ORDER BY amount_on_site;
The SUM function returns a sum of expression across all input values.
By calling the SUM function on the lt column (which contains lengths for each record of pipe, and is temporarily named here as approx_footage) and combining it with the COUNT function and DISTINCT expression, we are able to obtain the total amount, in footage (if that is indeed our unit of measure) for all records, per wall_thickness value, for each distinct wall_thickness value in our table.
Notice here an ORDER BY clause at the end of our query. The ORDER BY clause causes the result rows to be sorted according to the specified expression(s). Adding the keyword ASC (ascending) or DESC (descending) after the expression will sort the result in that order. By running:
SELECT DISTINCT wall_thick AS wall_thickness, COUNT(*) AS amount_on_site, SUM(lt) AS approx_footage FROM pipe_tally2 GROUP BY wall_thick ORDER BY amount_on_site ASC;
With the explicit ASC keyword, we can see this is the default behavior for ORDER BY. By including the ASC keyword, it returns the same results as our previous query where we did not specify it. Should we desire a descending order, we can obtain the data that way as well with the explicit DESC keyword like this:
SELECT DISTINCT wall_thick AS wall_thickness, COUNT(*) AS amount_on_site, SUM(lt) AS approx_footage FROM pipe_tally2 GROUP BY wall_thick ORDER BY amount_on_site DESC;
The amount_on_site column is now ordered by descending order.
Hopefully, through this example, it is clear how utilizing some of PostgreSQL’s powerful aggregate functions along with some of its other expressions, we can begin to extract some useful and insightful data from our database tables. We have only explored but a small fraction of these available functions. There are surely many that will suit your needs in extracting the data you need from your database tables and enabling you to present or study it in meaningful ways.
My hope is that you would find something interesting and useful from this post. I look forward to and welcome any comments or questions.
Thank you all for reading.
To receive notifications for the latest post from this blog via email, please subscribe by clicking the ‘Follow Digital Owl’s Prose’ button in the sidebar!