Unix Technical Forum

RFC: Temporal Extensions for PostgreSQL

This is a discussion on RFC: Temporal Extensions for PostgreSQL within the pgsql Hackers forums, part of the PostgreSQL category; --> Temporal Extensions for PostgreSQL by: Warren Turkal I would like to see a comprehensive solution to time varying tables ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 06:59 AM
Warren Turkal
 
Posts: n/a
Default RFC: Temporal Extensions for PostgreSQL

Temporal Extensions for PostgreSQL
by: Warren Turkal

I would like to see a comprehensive solution to time varying tables (or
temporal) in PostgreSQL. I specifically want to see suuport for valid-time and
transacation-time and bitemporal (valid-time and transaction-time) tables. I
will be defering the descriptions of much of the functionality to Dr. Richard
T.
Snodgrass's _Developing Time-Oriented Database Applications in SQL_ at [1].
The
mangled pages 30-31 are at [2].


a) Functionality

Dr. Richard T. Snodgrass has worked on defining semantics of temporal very
completely in several writings. He was also involved in an unsuccessful effort
to standardize temporal extensions to SQL. I believe his book does a good job
in presenting the semantics of temporal databases and describing extensions to
SQL that make the data much more natural with which to work.


b) How current solutions fall flat

Current solutions fall flat due to the extreme complexity of implementing
valid-time and transaction time semantics on tables by adding columns to track
all of the data. Please see chapter 11 of [1] for a more complete description
of
this complexity. Chapter 12 of [1] goes on to lay out new syntax for SQL that
will make dealing with data of this nature much more natural.


c) Examples

--create normal table
CREATE TABLE products
( id SERIAL PRIMARY KEY
, description TEXT
);

-- Add valid-time support to the table with granularity of timestamp.
ALTER TABLE products
ADD VALIDTIME PERIOD(TIMESTAMP WITH TIMEZONE);

-- Insert row valid from 2006-01-01 to just before 2007-01-01
VALIDTIME PERIOD '[2006-01-01 - 2007-01-01)'
INSERT INTO products
( description
)
VALUES
( 'red ball'
);

-- Insert row valid from 2007-01-01 to just before 2008-01-01
-- Should be smart enough to realize the id=777 does not conflict in this time
-- of validity.
VALIDTIME PERIOD '[2007-01-01 - 2008-01-01)'
INSERT INTO products
( id
, description
)
VALUES
( 777
, 'blue ball'
);

-- Select history of products with id=777
VALIDTIME
SELECT *
FROM product
WHERE id=777;

id | description | valid_period
----------------------------------------------
777| red ball | [2006-01-01 - 2007-01-01)
777| blue ball | [2007-01-01 - 2008-01-01)

-- Select current products with id=777
-- The date when query was run was 2007-02-10.
SELECT *
FROM products
WHERE id=777;

id | description
------------------
777| blue ball

There are many more details in chapter 12 of [1].


d) New stuff (dependencies, indices, syntax, libraries)

One of the base level additions is the PERIOD datatype. I think that
implementing temporal support is reliant on developing such a type. The
description of this datatype is laid out in chapter 4 of [1]. The SQL syntax
is
present in chapter 12 of [1]. I see this as the first piece that needs to be
implemented in order to take steps toward a DBMS to supports full temporal
capabilities. I think that PERIOD can largely reuse the datatime functionality
for parsing of literals and for comparisons. The RTREE seems to nicely
incorporate needed indexing of the PERIOD type. The syntax of the parser will
have to be extended to handle the PERIOD literals and constructor. I believe
any
additional libraries will be required.

There are also extensions to the syntax of table creation, table altering,
querying, inserting, and updating on temporal tables. These are all discussed
in
some detail in chapter 12 of [1]. I don't think that any of these changes will
require new libraries.

The semantics of temporal tables and querying them could have a dramatic
affect
on how things like primary keys and unique constraints work. I would like to
get
some comments about this from the community.


e) See Also

Addtional resources can be found at Dr. Richard T. Snodgrass's website at [3],
including SQL valid-time table support spec at [4] and SQL transaction-time
table support spec at [5].

Thoughts? Questions? Comments?

[1]http://www.cs.arizona.edu/~rts/tdbbook.pdf
[2]http://www.cs.arizona.edu/~rts/pp30-31.pdf
[3]http://www.cs.arizone.edu/~rts/
[4]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad146.pdf
[5]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad147.pdf

Thanks,
wt
--
Warren Turkal (w00t)

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 07:07 AM
Jim C. Nasby
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

My suggestion would be to focus on a period data type first and
foremost, as that's something that could be readily used by a lot of
folks. Of particular note, it's difficult to query tables that have
start_time and end_time fields to define a period; it's easy to screw up
the boundary conditions, and it's also hard to make those queries
perform well without going to extra lengths (such as defining a 'bogus'
GiST index on something like box(point(start,start),point(end,end)). And
it's not possible to do that in a way that avoids floating points and
their errors.

On Sat, Feb 10, 2007 at 12:20:28AM -0700, Warren Turkal wrote:
> Temporal Extensions for PostgreSQL
> by: Warren Turkal
>
> I would like to see a comprehensive solution to time varying tables (or
> temporal) in PostgreSQL. I specifically want to see suuport for valid-time and
> transacation-time and bitemporal (valid-time and transaction-time) tables. I
> will be defering the descriptions of much of the functionality to Dr. Richard
> T.
> Snodgrass's _Developing Time-Oriented Database Applications in SQL_ at [1].
> The
> mangled pages 30-31 are at [2].
>
>
> a) Functionality
>
> Dr. Richard T. Snodgrass has worked on defining semantics of temporal very
> completely in several writings. He was also involved in an unsuccessful effort
> to standardize temporal extensions to SQL. I believe his book does a good job
> in presenting the semantics of temporal databases and describing extensions to
> SQL that make the data much more natural with which to work.
>
>
> b) How current solutions fall flat
>
> Current solutions fall flat due to the extreme complexity of implementing
> valid-time and transaction time semantics on tables by adding columns to track
> all of the data. Please see chapter 11 of [1] for a more complete description
> of
> this complexity. Chapter 12 of [1] goes on to lay out new syntax for SQL that
> will make dealing with data of this nature much more natural.
>
>
> c) Examples
>
> --create normal table
> CREATE TABLE products
> ( id SERIAL PRIMARY KEY
> , description TEXT
> );
>
> -- Add valid-time support to the table with granularity of timestamp.
> ALTER TABLE products
> ADD VALIDTIME PERIOD(TIMESTAMP WITH TIMEZONE);
>
> -- Insert row valid from 2006-01-01 to just before 2007-01-01
> VALIDTIME PERIOD '[2006-01-01 - 2007-01-01)'
> INSERT INTO products
> ( description
> )
> VALUES
> ( 'red ball'
> );
>
> -- Insert row valid from 2007-01-01 to just before 2008-01-01
> -- Should be smart enough to realize the id=777 does not conflict in this time
> -- of validity.
> VALIDTIME PERIOD '[2007-01-01 - 2008-01-01)'
> INSERT INTO products
> ( id
> , description
> )
> VALUES
> ( 777
> , 'blue ball'
> );
>
> -- Select history of products with id=777
> VALIDTIME
> SELECT *
> FROM product
> WHERE id=777;
>
> id | description | valid_period
> ----------------------------------------------
> 777| red ball | [2006-01-01 - 2007-01-01)
> 777| blue ball | [2007-01-01 - 2008-01-01)
>
> -- Select current products with id=777
> -- The date when query was run was 2007-02-10.
> SELECT *
> FROM products
> WHERE id=777;
>
> id | description
> ------------------
> 777| blue ball
>
> There are many more details in chapter 12 of [1].
>
>
> d) New stuff (dependencies, indices, syntax, libraries)
>
> One of the base level additions is the PERIOD datatype. I think that
> implementing temporal support is reliant on developing such a type. The
> description of this datatype is laid out in chapter 4 of [1]. The SQL syntax
> is
> present in chapter 12 of [1]. I see this as the first piece that needs to be
> implemented in order to take steps toward a DBMS to supports full temporal
> capabilities. I think that PERIOD can largely reuse the datatime functionality
> for parsing of literals and for comparisons. The RTREE seems to nicely
> incorporate needed indexing of the PERIOD type. The syntax of the parser will
> have to be extended to handle the PERIOD literals and constructor. I believe
> any
> additional libraries will be required.
>
> There are also extensions to the syntax of table creation, table altering,
> querying, inserting, and updating on temporal tables. These are all discussed
> in
> some detail in chapter 12 of [1]. I don't think that any of these changes will
> require new libraries.
>
> The semantics of temporal tables and querying them could have a dramatic
> affect
> on how things like primary keys and unique constraints work. I would like to
> get
> some comments about this from the community.
>
>
> e) See Also
>
> Addtional resources can be found at Dr. Richard T. Snodgrass's website at [3],
> including SQL valid-time table support spec at [4] and SQL transaction-time
> table support spec at [5].
>
> Thoughts? Questions? Comments?
>
> [1]http://www.cs.arizona.edu/~rts/tdbbook.pdf
> [2]http://www.cs.arizona.edu/~rts/pp30-31.pdf
> [3]http://www.cs.arizone.edu/~rts/
> [4]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad146.pdf
> [5]ftp://ftp.cs.arizona.edu/tsql/tsql2/sql3/mad147.pdf
>
> Thanks,
> wt
> --
> Warren Turkal (w00t)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>


--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 07:07 AM
Alvaro Herrera
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

Jim C. Nasby wrote:
> My suggestion would be to focus on a period data type first and
> foremost, as that's something that could be readily used by a lot of
> folks. Of particular note, it's difficult to query tables that have
> start_time and end_time fields to define a period; it's easy to screw up
> the boundary conditions, and it's also hard to make those queries
> perform well without going to extra lengths (such as defining a 'bogus'
> GiST index on something like box(point(start,start),point(end,end)). And
> it's not possible to do that in a way that avoids floating points and
> their errors.


FWIW there's already a type called tinterval that stores (start,end). I
don't think it's very much documented; maybe it can be extended or used
as base for a new, more complete and robust type, indexable in a more
natural way, etc etc.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 07:07 AM
Warren Turkal
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

On Fri, Feb 16, 2007 at 05:39:24PM -0300, Alvaro Herrera wrote:
> FWIW there's already a type called tinterval that stores (start,end). I
> don't think it's very much documented; maybe it can be extended or used
> as base for a new, more complete and robust type, indexable in a more
> natural way, etc etc.


The book I cited has a very complete description of the period data type
including details on what extensions to SQL are needed. I am very
interested in starting a robust implementation of the period datatype.

I think the datetime infrastructure will already do most of the needed
parsing and packing of the hard parts of the period datatype (namely the
date and time formats). I will investigate the tinterval to see if it
meets the needs of the PERIOD datatypes.

I agree with focusing on the PERIOD datatype. I think that is a major
part of the foundation for temporal extensions and would have to be
implemented first. Therefore, I present the following plan for getting
there.

1) Focus first on PERIOD(DATE) to keep things as simple as possible.
2) Implement a first cut on the period datatype that only handles
storing two dates. (Maybe tinterval will get us here for free?)
3) Add information to the datatype for open or closed interval for
beginning and ending sides of the period.

I could probably have this done in time for the freeze with some
mentoring. I could probably even start implementation of some indices
and operator function for the type. This functionality is what I expect
to have a shot of making an appearance in 8.3. It will be minimally
functional at this point.

The next project will be altering the parser to be able to construct and
operate on PERIOD types with the syntax extensions to SQL in Dr.
Snodgrass's book.

Once all of the syntax is implemented for PERIOD(DATE), the next project
will be to extend to support PERIOD(DATETIME WITH TIMEZONE). Again, I
think the datatime infrastructure will be very useful here.

wt

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 07:07 AM
Oleg Bartunov
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

On Fri, 16 Feb 2007, Alvaro Herrera wrote:

> Jim C. Nasby wrote:
>> My suggestion would be to focus on a period data type first and
>> foremost, as that's something that could be readily used by a lot of
>> folks. Of particular note, it's difficult to query tables that have
>> start_time and end_time fields to define a period; it's easy to screw up
>> the boundary conditions, and it's also hard to make those queries
>> perform well without going to extra lengths (such as defining a 'bogus'
>> GiST index on something like box(point(start,start),point(end,end)). And
>> it's not possible to do that in a way that avoids floating points and
>> their errors.

>
> FWIW there's already a type called tinterval that stores (start,end). I
> don't think it's very much documented; maybe it can be extended or used
> as base for a new, more complete and robust type, indexable in a more
> natural way, etc etc.


RI-Tree (Relational intervar tree)
http://www.dbs.informatik.uni-muench...ns/RI-Tree.pdf
looks promising for that purposes.

Regards,
Oleg
__________________________________________________ ___________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 07:07 AM
Hannu Krosing
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

Ühel kenal päeval, R, 2007-02-16 kell 17:39, kirjutas Alvaro Herrera:
> Jim C. Nasby wrote:
> > My suggestion would be to focus on a period data type first and
> > foremost, as that's something that could be readily used by a lot of
> > folks. Of particular note, it's difficult to query tables that have
> > start_time and end_time fields to define a period; it's easy to screw up
> > the boundary conditions, and it's also hard to make those queries
> > perform well without going to extra lengths (such as defining a 'bogus'
> > GiST index on something like box(point(start,start),point(end,end)). And
> > it's not possible to do that in a way that avoids floating points and
> > their errors.

>
> FWIW there's already a type called tinterval that stores (start,end). I
> don't think it's very much documented; maybe it can be extended or used
> as base for a new, more complete and robust type, indexable in a more
> natural way, etc etc.


How easy/hard would it be to create unique indexes on tinterval (unique
here meaning non-overlapping) ?

Is tinterval meant to be open/closed at start and end ?

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 07:08 AM
Tom Lane
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

Hannu Krosing <hannu@skype.net> writes:
> How easy/hard would it be to create unique indexes on tinterval (unique
> here meaning non-overlapping) ?


"Overlapping" is not an equality relation (it fails the transitive law),
so I'm not entirely sure what "unique" means in this context ... but I
can promise you you can't make it work with btree.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 07:08 AM
Warren Turkal
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

On Saturday 17 February 2007 01:50, Hannu Krosing wrote:
> Is tinterval meant to be open/closed at start and end ?


I don't see the tinterval doing anything other than storing two times.

wt
--
Warren Turkal (w00t)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 07:08 AM
Warren Turkal
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

On Saturday 17 February 2007 09:26, Tom Lane wrote:
> "Overlapping" is not an equality relation (it fails the transitive law),
> so I'm not entirely sure what "unique" means in this context ... but I
> can promise you you can't make it work with btree.


There is an equality relation on periods. But it wouldn't really tell you much
useful info, as it's not normally what you're looking for with time.

wt
--
Warren Turkal (w00t)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 07:08 AM
Martijn van Oosterhout
 
Posts: n/a
Default Re: RFC: Temporal Extensions for PostgreSQL

On Sat, Feb 17, 2007 at 11:40:44AM -0700, Warren Turkal wrote:
> On Saturday 17 February 2007 09:26, Tom Lane wrote:
> > "Overlapping" is not an equality relation (it fails the transitive law),
> > so I'm not entirely sure what "unique" means in this context ... but I
> > can promise you you can't make it work with btree.

>
> There is an equality relation on periods. But it wouldn't really tell youmuch
> useful info, as it's not normally what you're looking for with time.


What he's referring to is that "overlaps" is not transitive. i.e. if A
overlaps B and B overlaps C then A doesn't necessarily overlap C.

However, non-overlapping intervals are stricly ordered, so if you
reject overlaps from the index then new intervals can each only be
inserted into one place. However, the locking required is probably
non-trivial.

Get unique indexes for GiST working and you're home...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFF12MiIB7bNG8LQkwRAuiEAJ9+ZdQnqwyCVIyLpsq0ZS 9UCZIaUACfUskp
/xLzgv9ddQ1DFpUoT/QQaFA=
=vtnP
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:39 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com