Pandamator

dsouflis at acm dot org Project managed with Flying Donut

WHY WOULD YOU CARE ABOUT TEMPORAL DATABASES?

«Par ma foi! il y a plus de quarante ans que je dis de la prose sans que j'en susse rien…»
Molière, Le Bourgeois Gentilhomme

“By my faith! For more than forty years I have been speaking prose without knowing anything about it…”
Molière, The Middle Class Gentleman

Because you've probably been using temporal databases all your life. I know I have. Every database where you record a history of past data or the scheduling of future events, is a temporal database. Every table where you add datetime columns that annotate the entire row, as in marking when an entity first appeared or when it was removed, is a temporal table.

WHAT IS SO DIFFICULT ABOUT THEM?

Common RDBMSs do not offer any support for defining and manipulating temporal data. Although temporal data can be defined and manipulated using plain SQL, its amount and complexity is an order of magnitude more than for non-temporal data. What's more, supporting tools like ORM and report generators do not support temporal data either, which means that developers have to do everything on their own.

Temporal databases are mostly absent from a developer's education. And, because adding some datetime columns does not seem so big a deal, it is very easy to move into temporal databases without acknowledging the difficulties and the pitfalls. And get it all wrong. No integrity checking, no guidelines to implement updates correctly, lots of procedural code that manipulates data in memory, and lots of code to overcome inconsistent data that, invariably, appear in the database.

PANDAMATOR

It is far better to do it correctly from the beginning. Pandamator (πανδαμάτωρ, ancient Greek for all-subduing, an adjective commonly attached to time), is a project aiming to enable handling of temporal databases using common RDBMSs lacking temporal support. It consists of:

Predetermined decisions as to what kinds of temporal data can be accommodated and their representation in SQL
Definitions of tables, views, functions and stored procedures that together provide a supporting infrastructure for defining and using temporal tables
An SQL generator that generates triggers for maintaining temporal referential integrity and other procedural code
Run-time support to create UPDATE and DELETE scripts in user programs

Temporal Support in Five Steps

Prepare the database
- Run CREATE VALIDTIME to define the metadata tables, supporting data and code
Create your schema
- As usual, but adding the extra columns needed
Declare temporal annotations on the schema
- For example ALTER TABLE X ADD VALIDTIME(STATE); ALTER TABLE X ADD VALIDTIME CONSTRAINT PK PRIMARY KEY(IDCOL);
Generate SQL for integrity constraints
- Run ALTER VALIDTIME
Create temporal-aware SQL in your code
- For example VALIDTIME SELECT A.X FROM A INNER JOIN B ON A.BID=B.ID

FACTS ABOUT TEMPORAL DATA

No simple SQL integrity constraint suffices to ensure the integrity of a temporal database. Lacking ASSERTIONs, we can only resort to TRIGGERs to define them.

Primary keys and foreign keys need complex, multiline queries to check them.

Updates are hard, because of all the rearranging of existing data that must be done to maintain the time dimension.

Join, Not exist, Outer join, are especially hard, because of the multitude of ways temporal intervals interact.

On the other hand, querying the database for the state of its data at a specific moment in time is relatively easy, which may mislead newcomers as to the difficulty of using temporal data.

Once we add the dimension of time, completely new kinds of queries emerge, which requires a new mindset with respect to working with non-temporal data.