5 Data Types5 Data Types
5.3 PostgreSQL Data Types PostgreSQL Data Types
5.3.11 Date/Time Proc Date/Time Processing and Que essing and Querrying ying
5.3.11 Date/Time Proc Date/Time Processing and Que essing and Querrying ying
Once the application's data, or rather the user data is properly stored as timestamp with time zone, PostgreSQL allows implementing all the processing you need to.
As an example data set this time we're playing with git git history. The PostgreSQL and pgloader project history have been loaded into the commitlogcommitlog table thanks to the git git loglog command, with a custom format, and some post-processing — properly splitting up the commit's subjects and escaping its content. Here's for example the most recent commit registered in our localcommitlogcommitlog table:
select
select project, hashhash, author, ats, committer, cts, subject
from
from commitlog
where
where project = 'postgresql'
order
order byby ats descdesc limit limit 1;
The column namesatsats and ctscts respectively stand for author commit timestampauthor commit timestamp and committer commit timestamp
committer commit timestamp, and thesubject subject is the first line of the commit message, as per the git log git log format%s%s.
To get the most recent entry from a table weorder by order by dates indescendingdescending order thenlimilimi the result set to a single entry, and we get a single line of output:
─[ RECORD 1 ]─────────────── ────────────────── ─────────────────── ────────────
project │ postgresql
hash │ b1c2d76a2fcef812 af0be3343082414d40 1909c8 author │ Tom Lane
ats │ 2017-08-19 19:39:37+02 committer │ Tom Lane
cts │ 2017-08-19 19:39:51+02
subject │ Fix possible core dump in parallel restore when using a TOC list.
With timestamps, we can compute time-based reporting, such as how many commits each project received each year in their whole history:
select
select extract(year fromfrom ats) asas year,
count(*) filterfilter(wherewhere project = 'postgresql') asas postgresql, count(*) filterfilter(wherewhere project = 'pgloader') asas pgloader
from
As we have only loaded two projects in ourcommitlogcommitlog table, the output is better with a pivo pivo query. We can see more than 20 years of sustained activity for the PostgreSQL project, and a less active project for pgloader:
year │ postgresql │ pgloader
══════╪════════════╪══════════
We can also build a reporting on the repartition of commits by weekday from the beginning of the project, in order to guess if contributors are working on the project on the job only, or mostly during their free time (weekend).
select
select extract(isodow fromfrom ats) asas dow, to_char(ats, 'Day') asas day, count(*) asas commits,
round(100.0*count(*)/sum(count(*)) overover(), 2) asas pct, repeat('■', (100*count(*)/sum(count(*)) overover())::int) asas hist
from
It seems that our PostgreSQL committers tend to work whenever they feel like it, but less so on the weekend. The project's lucky enough to have a solid team of committers being paid to work on PostgreSQL: commit timestamp. Those are different, but by how much?
with
percentile_cont(array[0.5, 0.9, 0.95, 0.99]) within groupgroup(orderorder byby cts-ats) asas parr
from
from
from perc_arrays;
Here's a detailed output of the time difference statistics, per project:
─[ RECORD 1 ]────────────────────────────── ─────
project │ pgloader
average │ @ 4 days 22 hours 7 mins 41.18 secs median │ @ 5 mins 21.5 secs
%90th │ @ 1 day 20 hours 49 mins 49.2 secs
%95th │ @ 25 days 15 hours 53 mins 48.15 secs
%99th │ @ 169 days 24 hours 33 mins 26.18 secs
═[ RECORD 2 ]══════════════════════════════ ═════
project │ postgres
average │ @ 1 day 10 hours 15 mins 9.706809 secs median │ @ 2 mins 4 secs
%90th │ @ 1 hour 46 mins 13.5 secs
%95th │ @ 1 day 17 hours 58 mins 7.5 secs
%99th │ @ 40 days 20 hours 36 mins 43.1 secs
Reporting is a strong use case for SQL. Application will also send more classic queries. We can show the commits for the PostgreSQL project for the 1st of June 2017:
\setset day '2017-06-01'
select
select ats::time,
substring(hashhash fromfrom 1 forfor 8) asashashhash,
substring(subject fromfrom 1 forfor 40) || '…' asas subject
from
from commitlog
where
where project = 'postgresql'
and
and ats >= date :'day'
and
and ats < date :'day' +interval '1 day'
order order byby ats;
It's tempting to use thebetweenbetween SQL operator, but we would then have to remember that between
between includes both its lower and upper bound and we would then have to compute the upper bound as the very last instant of the day. Using explicit greater greater than or than or equalequal andlessless than
than operators makes it possible to always compute the very first time of the day, which is easier, and well supported by PostgreSQL.
Also, using explicit bound checks allows ys to use a single date literal in the query, so that's a single parameter to send from the application.
ats │ hash │ subject
══════════╪════════ ══╪═══════════════ ══════════════════ ══════════
01:39:27 │ 3d79013b │ Make ALTER SEQUENCE, including RESTART, … 02:03:10 │ 66510455 │ Modify sequence catalog tuple before inv…
04:35:33 │ de492c17 │ doc: Add note that DROP SUBSCRIPTION dro…
19:32:55 │ e9a3c047 │ Always use -fPIC, not -fpic, when buildi…
23:45:53 │ f112f175 │ Fix typo…
(5 rows)
Manydata typ e formatting fun ctions are available in PostgreSQL. In the previous querym although we chose tocast cast our timestamp with time zone entry down to atimetime value, we could have chosen another representation thanks to theto_char to_char function:
set
set lc_time toto 'fr_FR';
select
select to_char(ats, 'TMDay TMDD TMMonth, HHam' ) asas time, substring(hashhash fromfrom 1 forfor 8) asashashhash,
substring(subject fromfrom 1 forfor 40) || '…' asas subject
from
from commitlog
where
where project = 'postgresql'
and
and ats >= date :'day'
and
and ats < date :'day' +interval '1 day'
order order byby ats;
And this time we have a French localized output for the time value:
time │ hash │ subject
═══════════════════ ══╪══════════╪════ ══════════════════ ═══════════════════ ══
Jeudi 01 Juin, 01am │ 3d79013b │ Make ALTER SEQUENCE, including RESTART, … Jeudi 01 Juin, 02am │ 66510455 │ Modify sequence catalog tuple before inv…
Jeudi 01 Juin, 04am │ de492c17 │ doc: Add note that DROP SUBSCRIPTION dro…
Jeudi 01 Juin, 07pm │ e9a3c047 │ Always use -fPIC, not -fpic, when buildi…
Jeudi 01 Juin, 11pm │ f112f175 │ Fix typo…
(5 rows)
Take some time to familiarize yourself with the time and date support that PostgreSQL comes with out of the box. Some very useful functions such asdate_trunc()date_trunc() are not shown here, and you also will find more gems.
While most programming languages nowadays include the same kind of feature set, having this processing feature set right in PostgreSQL makes sense in several use cases:
It makes sense when the SQL logic or filtering you want to implement depends on the result of the processing (e.g. grouping by week).
When you have several applications using the same logic, it's often easier to share a SQL query than to set up a distributed service API offering the same result in XML or JSON (a data format you then have to parse).
When you want to reduce your run-time dependencies, it's a good idea to understand how much each architecture layer is able to support in your implementation.