Unix Technical Forum

Rethinking user-defined-typmod before it's too late

  #1 (permalink)  
Old 04-12-2008, 09:08 AM
Tom Lane
 
Posts: n/a
Default Rethinking user-defined-typmod before it's too late

The current discussion about the tsearch-in-core patch has convinced me
that there are plausible use-cases for typmod values that aren't simple
integers. For instance it could be sane for a type to want a locale or
language selection as a typmod, eg tsvector('ru') or tsvector('sv').
(I'm not saying we are actually going to do that to tsvector, just that
it's now clear to me that there are use-cases for such things.)

Teodor's work a few months ago generalized things enough so that
something like this is within reach. The grammar will actually allow
darn near anything for a typmod, since the grammar production is
expr_list to avoid shift/reduce conflict with the very similar-looking
productions for function calls. The only place where we are
constraining what a typmod can be is that the defined API for
user-written typmodin functions is "integer array".

At the time that patch was being worked on, I think I argued that
integer typmods were enough because you'd have to pack them into such a
small output representation anyway. The hole in that logic is that you
might have a fairly small enumerated set of possibilities, but that
doesn't mean you want to make the user use a numeric code for them.
You could even make the typmod be an integer key for a lookup table,
if the set of possibilities is not hardwired.

Since this code hasn't been released yet, the API isn't set in stone
... but as soon as we ship 8.3, it will be, or at least changing it will
be orders of magnitude more painful than it is today. So, late as this
is in the devel cycle, I think now is the time to reconsider.

I propose changing the typmodin signature to "typmodin(cstring[]) returns
int4", that is, the typmods will be passed as strings not integers. This
will incur a bit of extra conversion overhead for the normal uses where
the typmods are integers, but I think the gain in flexibility is worth
it. I'm inclined to make the code in parse_type.c take either integer
constants, simple string literals, or unqualified names as input ---
so you could write either tsvector('ru') or tsvector(ru) when using a
type that wants a nonintegral typmod.

Note that the typmodout side is already OK since it is defined to return
a string.
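
To make the proposal concrete, here is a minimal Python model of a string-based typmodin/typmodout pair for a type that takes a language as its typmod. This is only a sketch of the idea, not PostgreSQL code: the real functions would be written in C, and the names lang_typmodin, lang_typmodout and the LANG_CODES table are hypothetical.

```python
# Illustrative model only: the names and the code table are assumptions,
# not part of PostgreSQL or of the tsearch patch.
LANG_CODES = {"ru": 1, "sv": 2, "en": 3}
LANG_NAMES = {code: name for name, code in LANG_CODES.items()}


def lang_typmodin(mods):
    """Model of typmodin(cstring[]) returning int4: strings in, one integer out."""
    if len(mods) != 1:
        raise ValueError("expected exactly one type modifier")
    name = mods[0].lower()
    if name not in LANG_CODES:
        raise ValueError(f"unknown language: {mods[0]!r}")
    return LANG_CODES[name]


def lang_typmodout(typmod):
    """Model of typmodout: stored integer in, display string out."""
    return f"({LANG_NAMES[typmod]})"
```

The user only ever writes and sees the language name; the numeric code stays an internal detail of the type.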

Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

  #2 (permalink)  
Old 04-12-2008, 09:08 AM
David Fetter
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

On Fri, Jun 15, 2007 at 12:14:45PM -0400, Tom Lane wrote:

[snip]

> I propose changing the typmodin signature to "typmodin(cstring[])
> returns int4", that is, the typmods will be passed as strings not
> integers. This will incur a bit of extra conversion overhead for
> the normal uses where the typmods are integers, but I think the gain
> in flexibility is worth it. I'm inclined to make the code in
> parse_type.c take either integer constants, simple string literals,
> or unqualified names as input --- so you could write either
> tsvector('ru') or tsvector(ru) when using a type that wants a
> nonintegral typmod.
>
> Note that the typmodout side is already OK since it is defined to
> return a string.
>
> Comments?


+1

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

  #3 (permalink)  
Old 04-12-2008, 09:08 AM
Teodor Sigaev
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

> I propose changing the typmodin signature to "typmodin(cstring[]) returns
> int4", that is, the typmods will be passed as strings not integers. This
> will incur a bit of extra conversion overhead for the normal uses where
> the typmods are integers, but I think the gain in flexibility is worth

agree

> it. I'm inclined to make the code in parse_type.c take either integer


And modify ArrayGetTypmods() to ArrayGetIntegerTypmods()


Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

  #4 (permalink)  
Old 04-12-2008, 09:08 AM
Stephen Frost
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I propose changing the typmodin signature to "typmodin(cstring[]) returns
> int4", that is, the typmods will be passed as strings not integers. This
> will incur a bit of extra conversion overhead for the normal uses where
> the typmods are integers, but I think the gain in flexibility is worth
> it. I'm inclined to make the code in parse_type.c take either integer
> constants, simple string literals, or unqualified names as input ---
> so you could write either tsvector('ru') or tsvector(ru) when using a
> type that wants a nonintegral typmod.
>

Would this allow for 'multi-value' typmods for user-defined types?
That's something that would greatly help and simplify PostGIS. It was
brought up on the PostGIS lists here:
http://postgis.refractions.net/piper...er/013086.html
and on -hackers here:
http://www.mail-archive.com/pgsql-ha.../msg81281.html

The 'geometry' type really needs to have a typmod which has the
dimensions, SRID and type of the geometry. At the moment the PostGIS
folks are using constraints and essentially a side-table to work around
this, which gets really, really ugly. It sounds like this might work
for them, and while it'd incur a bit of overhead to parse the string I'm
pretty sure it'd be worth it.

Thanks,

Stephen

  #5 (permalink)  
Old 04-12-2008, 09:08 AM
Tom Lane
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

Teodor Sigaev <teodor@sigaev.ru> writes:
>> I propose changing the typmodin signature to "typmodin(cstring[]) returns
>> int4", that is, the typmods will be passed as strings not integers.


> And modify ArrayGetTypmods() to ArrayGetIntegerTypmods()


Right --- the decoding work will only have to happen in one place for
our existing uses.
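
A rough Python model of that "decode in one place" idea follows. The helper plays the role that ArrayGetIntegerTypmods() would play in C; the Python names and the varchar-like caller are hypothetical and only illustrate the division of labour.

```python
def get_integer_typmods(mods):
    """Convert string typmods to integers, failing on input int() cannot parse."""
    result = []
    for mod in mods:
        try:
            result.append(int(mod.strip()))
        except ValueError:
            raise ValueError(f"typmod must be an integer: {mod!r}") from None
    return result


def varchar_like_typmodin(mods):
    """Hypothetical integer-only type: one positive length, stored as-is."""
    values = get_integer_typmods(mods)
    if len(values) != 1 or values[0] < 1:
        raise ValueError("expected a single positive length")
    return values[0]
```

Types that want integers call the shared helper first, so the string-to-integer conversion is not repeated in each typmodin function.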

Is it worth providing an ArrayGetStringTypmods in core, when it won't
be used by any existing core datatypes?

regards, tom lane

  #6 (permalink)  
Old 04-12-2008, 09:08 AM
Tom Lane
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

Stephen Frost <sfrost@snowman.net> writes:
> Would this allow for 'multi-value' typmods for user-defined types?


If you can squeeze them into 31 bits of stored typmod, yes. That
may mean that you still need the side table (with stored typmod being a
lookup key for the table). But this gets you out of exposing that
detail to users.
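
The side-table variant can be sketched as follows, under the assumption that each distinct combination of modifiers is assigned a small integer key on first use. The TypmodRegistry class is a hypothetical in-memory stand-in for such a table, written in Python purely for illustration.

```python
class TypmodRegistry:
    """In-memory stand-in for a side table keyed by the stored typmod."""

    MAX_KEY = 2**31 - 1  # largest non-negative value of a signed 32-bit integer

    def __init__(self):
        self._key_by_mods = {}
        self._mods_by_key = {}

    def typmodin(self, mods):
        """Return the lookup key for these modifiers, allocating one if new."""
        mods = tuple(mods)
        key = self._key_by_mods.get(mods)
        if key is None:
            key = len(self._key_by_mods) + 1
            if key > self.MAX_KEY:
                raise OverflowError("no typmod keys left")
            self._key_by_mods[mods] = key
            self._mods_by_key[key] = mods
        return key

    def typmodout(self, typmod):
        """Turn a stored key back into the text the user originally wrote."""
        return "(" + ",".join(self._mods_by_key[typmod]) + ")"
```

The stored typmod is then just the key; users write and read the full modifier list and never see the key itself.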

regards, tom lane

  #7 (permalink)  
Old 04-12-2008, 09:08 AM
Teodor Sigaev
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

> Is it worth providing an ArrayGetStringTypmods in core, when it won't
> be used by any existing core datatypes?

I don't think so: cstring[] is already an array of strings. I don't believe we
could offer anything generally useful without some real-world examples.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

  #8 (permalink)  
Old 04-12-2008, 09:08 AM
Stephen Frost
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Would this allow for 'multi-value' typmods for user-defined types?
>
> If you can squeeze them into 31 bits of stored typmod, yes. That
> may mean that you still need the side table (with stored typmod being a
> lookup key for the table). But this gets you out of exposing that
> detail to users.


I see, the user could put in:
geometry(123456789,MULTIPOLYGON,3);

But we'd only get 31 bits of room to encode that into. I'm not sure if
that's enough. At the moment there are three columns we're talking
about in the side-table:
SRID (integer)
TYPE (varchar(30))
DIMENSIONS (integer)

Now, the type is a small enumerated set, and we can probably limit
dimensions to a few bits (maybe one for 2d/3d, but we might have some
other cases...), and still be following the OGC standard, but I don't
think there are any restrictions on SRID beyond '32 bit integer'. As
such, I'm not sure if we can encode it all directly into 31 bits (which
would obviously be preferred to a side-table with each case we come
across being enumerated in it). Then again, at the *moment*, anyway,
the SRIDs we have only go up to about 32,000, so we could dedicate 16
bits to it and probably be alright.
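
As a rough check of that budget, here is a Python sketch that packs the three values into one integer. The field widths (16 bits for SRID, 8 for type, 4 for dimensions) and the type list with its ordering are assumptions chosen for illustration, not anything PostGIS defines.

```python
# Hypothetical layout: [ SRID: 16 bits | type: 8 bits | dimensions: 4 bits ]
SRID_BITS, TYPE_BITS, DIM_BITS = 16, 8, 4
assert SRID_BITS + TYPE_BITS + DIM_BITS <= 31  # must fit the 31 usable bits

# Assumed enumeration; a type's position in the list is its numeric code.
GEOM_TYPES = [
    "POINT", "LINESTRING", "POLYGON", "MULTIPOINT",
    "MULTILINESTRING", "MULTIPOLYGON", "GEOMETRYCOLLECTION",
]


def pack_geometry_typmod(srid, geom_type, dims):
    """Pack (srid, type, dims) into a single non-negative integer."""
    type_code = GEOM_TYPES.index(geom_type.upper())  # ValueError if unknown
    if not 0 <= srid < (1 << SRID_BITS):
        raise ValueError(f"SRID {srid} does not fit in {SRID_BITS} bits")
    if not 0 <= dims < (1 << DIM_BITS):
        raise ValueError(f"dimensions {dims} do not fit in {DIM_BITS} bits")
    return (srid << (TYPE_BITS + DIM_BITS)) | (type_code << DIM_BITS) | dims


def unpack_geometry_typmod(typmod):
    """Recover (srid, type, dims) from a packed value."""
    dims = typmod & ((1 << DIM_BITS) - 1)
    type_code = (typmod >> DIM_BITS) & ((1 << TYPE_BITS) - 1)
    srid = typmod >> (TYPE_BITS + DIM_BITS)
    return srid, GEOM_TYPES[type_code], dims
```

With this layout an SRID such as 4326 round-trips cleanly, while the 123456789 from the example above is rejected because it needs more than 16 bits, which is exactly the limitation in question.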

Any chance of this being increased? Obviously would like to avoid the
side-table, if possible.

Thanks!

Stephen

  #9 (permalink)  
Old 04-12-2008, 09:08 AM
Peter Eisentraut
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

On Friday, 15 June 2007 18:14, Tom Lane wrote:
> The current discussion about the tsearch-in-core patch has convinced me
> that there are plausible use-cases for typmod values that aren't simple
> integers. For instance it could be sane for a type to want a locale or
> language selection as a typmod, eg tsvector('ru') or tsvector('sv').


That would also be very useful for the XML type with an optional XML schema
modification. I guess in a lot of use cases you would have to store the
mapping in a side table, if the typmod on disk remains an integer.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

  #10 (permalink)  
Old 04-12-2008, 09:08 AM
Tom Lane
 
Posts: n/a
Default Re: Rethinking user-defined-typmod before it's too late

Stephen Frost <sfrost@snowman.net> writes:
> Any chance of this being increased?


No. Changing typmod to something other than int32 would require many
thousands of lines of diffs just in the core distro. I don't even want
to think about how much outside code would break.

regards, tom lane
