• No results found

Consistency and Data Type Behavior Consistency and Data Type Behavior

5 Data Types5 Data Types

5.2 Some Relational Theo ational Theorryy

5.2.2 Consistency and Data Type Behavior Consistency and Data Type Behavior

5.2.2 Consistency and Data Type Behavior

A key aspect of PostgreSQL data types lies in their behavior. Comparable to anobject- object-oriented

oriented system, PostgreSQL implements functions and operator polymorphism, allowing for the dispatching of code at run-time depending on the types of arguments.

If we have a closer look at a very simple SQL query, we can see lots happening under the hood:

select

select code fromfrom drivers wherewhere driverid = 1;

In this query, the expressiondriverid = 1driverid = 1 uses the== operator in between a column name and a literal value. PostgreSQL knows from its catalogs that thedriverid driverid column is abigint bigint and parses the literal11 as an integer. We can check that with the following query:

select

select pg_typeof(driverid), pg_typeof( 1) fromfrom drivers limitlimit 1;

pg_typeof │ pg_typeof

═══════════╪═══════════

bigint │ integer (1 row)

Now, how does PostgreSQL implements== in between an 8 bytes integer and a 4 bytes integer? Well it turns out that this decision is dynamic: the operator== dispatches to an established function depending on the types of its left and right operands. We can even have a look at the PostgreSQL catalogs to get a better grasp of this notion:

select

select oprname, oprleft::regtype, oprcode::regproc

from

from pg_operator

where

where oprname = '='

and

and oprleft::regtype::text ~'int|time|text|circle|ip'

order

order byby oprleft;

This gives us a list of the following instances of the= operator:

oprname │ oprleft oprcode

═════════╪═════════ ══════════════════ ══╪═══════════════ ═══════════

= │ bigint │ int84eq

= │ bigint │ int8eq

= │ bigint │ int82eq

= │ smallint │ int28eq

= │ smallint │ int2eq

= │ smallint │ int24eq

= │ int2vector │ int2vectoreq

= │ integer │ int48eq

= │ integer │ int42eq

= │ integer │ int4eq

= │ text │ texteq

= │ abstime │ abstimeeq

= │ reltime │ reltimeeq

= │ tinterval │ tintervaleq

= │ circle │ circle_eq

= │ time without time zone │ time_eq = │ timestamp without time zone │ timestamp_eq = │ timestamp without time zone │ timestamp_eq_date = │ timestamp without time zone │ timestamp_eq_times tamptz

= │ timestamp with time zone │ timestamptz_eq_tim estamp

= │ timestamp with time zone │ timestamptz_eq

= │ timestamp with time zone │ timestamptz_eq_dat e

= │ interval │ interval_eq

= │ time with time zone │ timetz_eq (24 rows)

The previous query limits its output to the datatype expected on theleft left of the operator. Of course, the catalogs also store the datatype expected on theright right of it, and the result type too, which isBooleanBoolean in the case of equality. Theoprcodeoprcode column in the output is the name of the PostgreSQL function that is run when the operator is used.

In our case withdriverid = 1driverid = 1 PostgreSQL is going to use theint84eqint84eq function to implement our query. This is true unless there's an index ondriverid driverid of course, in which case PostgreSQL will walk the index to find matching rows without comparing the literal with the table's content, only with the index content.

When using PostgreSQL, data types provide the following:

Input data representation, expected in input literal values Output data representation

A set of functions working with the data type

Specific implementations of existing functions for the new data type

Operator specific implementations for the data type Indexing support for the data type

The indexing support for PostgreSQL covers several kinds of indexes:B-treeB-tree,GiST GiST ,GINGIN,SP- SP-GiST

GiST ,hashhash andbrinbrin. This book doesn't fo further and cover the details of each of those index types. As an example of data type support for some indexes and the relationship in between a data type, a support function, an operator and an index, we can have a look at theGiST GiST support for theip4r ip4r extension data type:

select

select amopopr::regoperator

from

from pg_opclass c

join

join pg_am am onon am.oid = c.opcmethod

join

join pg_amop amop onon amop.amopfamily = c.opcfamily

where

where opcintype = 'ip4r'::regtype

and

and am.amname = 'gist';

The pg_opclass pg_opclass catalog is a list ofoperator classoperator class, each of them belongs to anoperator family operator family as found in the pg_opfamil pg_opfamily y catalog. Each index type implements anaccess method access method represented in the pg_am pg_am catalog. Finally, each operator that may be used in relation to an index access method is listed in the pg_amop pg_amop catalog.

Knowing that we can access the PostgreSQL catalogs at run-time and discover theip4r ip4r supported operators for aGiST GiST indexed lookup:

amopopr

════════════════

>>=(ip4r,ip4r) <<=(ip4r,ip4r) >>(ip4r,ip4r) <<(ip4r,ip4r) &&(ip4r,ip4r) =(ip4r,ip4r) (6 rows)

Those catalog queries are pretty advanced material that you don't need in your daily life as an application developer. That said, it's good to have some understanding of how things work in PostgreSQL as it allows a smarter usage of the system you are already relying on for your data.

What we've seen here is that PostgreSQL implementation of data types is a completely dynamic system with function and operator dispatch, and PostgreSQL extension authors have APIs they can use to register new indexing support at run time (when you type increatecreate extension

extension).

The goal of understanding that is for you, as an application developer, to understand how much can be done in PostgreSQL thanks to the integral concept of data type.