Rethinking user-defined-typmod before it's too late (pgsql-hackers)

The current discussion about the tsearch-in-core patch has convinced me that there are plausible use-cases for typmod values that aren't simple integers. For instance it could be sane for a type to want a locale or language selection as a typmod, eg tsvector('ru') or tsvector('sv'). (I'm not saying we are actually going to do that to tsvector, just that it's now clear to me that there are use-cases for such things.)

Teodor's work a few months ago generalized things enough so that something like this is within reach. The grammar will actually allow darn near anything for a typmod, since the grammar production is expr_list to avoid shift/reduce conflict with the very similar-looking productions for function calls. The only place where we are constraining what a typmod can be is that the defined API for user-written typmodin functions is "integer array".

At the time that patch was being worked on, I think I argued that integer typmods were enough because you'd have to pack them into such a small output representation anyway. The hole in that logic is that you might have a fairly small enumerated set of possibilities, but that doesn't mean you want to make the user use a numeric code for them. You could even make the typmod be an integer key for a lookup table, if the set of possibilities is not hardwired.

Since this code hasn't been released yet, the API isn't set in stone .... but as soon as we ship 8.3, it will be, or at least changing it will be orders of magnitude more painful than it is today. So, late as this is in the devel cycle, I think now is the time to reconsider.

I propose changing the typmodin signature to "typmodin(cstring[]) returns int4", that is, the typmods will be passed as strings not integers. This will incur a bit of extra conversion overhead for the normal uses where the typmods are integers, but I think the gain in flexibility is worth it. I'm inclined to make the code in parse_type.c take either integer constants, simple string literals, or unqualified names as input --- so you could write either tsvector('ru') or tsvector(ru) when using a type that wants a nonintegral typmod.

Note that the typmodout side is already OK since it is defined to return a string.

Comments?

regards, tom lane
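As a rough illustration of the proposal above, here is a minimal sketch of what a string-based typmodin/typmodout pair might do for a type that takes a language selection. It is written in Python purely for brevity; real typmodin and typmodout functions are C functions, and the names `LANGUAGES`, `lang_typmodin`, and `lang_typmodout` are hypothetical, not PostgreSQL APIs. The sketch assumes that tsvector('ru') and tsvector(ru) would both reach the function as the string "ru".

```python
from typing import List

# Assumed small, hardwired set of choices. The list position is what
# gets stored as the integer typmod, so entries must never be reordered.
LANGUAGES = ["ru", "sv", "en"]


def lang_typmodin(args: List[str]) -> int:
    """Turn the typmod strings (e.g. ["ru"]) into the stored integer."""
    if len(args) != 1:
        raise ValueError("expected exactly one type modifier")
    try:
        return LANGUAGES.index(args[0])
    except ValueError:
        raise ValueError(f"unrecognized language: {args[0]!r}") from None


def lang_typmodout(typmod: int) -> str:
    """Render the stored integer back into display form."""
    if typmod < 0:
        return ""  # a negative typmod means "no modifier given"
    return f"({LANGUAGES[typmod]})"
```

The user only ever sees the name, while the stored value stays a small integer, which is the flexibility being argued for.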

On Fri, Jun 15, 2007 at 12:14:45PM -0400, Tom Lane wrote:

[snip]

> I propose changing the typmodin signature to "typmodin(cstring[])
> returns int4", that is, the typmods will be passed as strings not
> integers. This will incur a bit of extra conversion overhead for
> the normal uses where the typmods are integers, but I think the gain
> in flexibility is worth it. I'm inclined to make the code in
> parse_type.c take either integer constants, simple string literals,
> or unqualified names as input --- so you could write either
> tsvector('ru') or tsvector(ru) when using a type that wants a
> nonintegral typmod.
>
> Note that the typmodout side is already OK since it is defined to
> return a string.
>
> Comments?

+1

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter
Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

> I propose changing the typmodin signature to "typmodin(cstring[]) returns
> int4", that is, the typmods will be passed as strings not integers. This
> will incur a bit of extra conversion overhead for the normal uses where
> the typmods are integers, but I think the gain in flexibility is worth
> it.

Agree.

> I'm inclined to make the code in parse_type.c take either integer

And modify ArrayGetTypmods() to ArrayGetIntegerTypmods()

Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I propose changing the typmodin signature to "typmodin(cstring[]) returns
> int4", that is, the typmods will be passed as strings not integers. This
> will incur a bit of extra conversion overhead for the normal uses where
> the typmods are integers, but I think the gain in flexibility is worth
> it. I'm inclined to make the code in parse_type.c take either integer
> constants, simple string literals, or unqualified names as input ---
> so you could write either tsvector('ru') or tsvector(ru) when using a
> type that wants a nonintegral typmod.

Would this allow for 'multi-value' typmods for user-defined types? That's something that would greatly help and simplify PostGIS. It was brought up on the PostGIS lists here:

http://postgis.refractions.net/piper...er/013086.html

and on -hackers here:

http://www.mail-archive.com/pgsql-ha.../msg81281.html

The 'geometry' type really needs to have a typmod which has the dimensions, SRID and type of the geometry. At the moment the PostGIS folks are using constraints and essentially a side-table to work around this, which gets really, really ugly. It sounds like this might work for them, and while it'd incur a bit of overhead to parse the string I'm pretty sure it'd be worth it.

Thanks,

Stephen

Teodor Sigaev <teodor@sigaev.ru> writes:
>> I propose changing the typmodin signature to "typmodin(cstring[]) returns
>> int4", that is, the typmods will be passed as strings not integers.

> And modify ArrayGetTypmods() to ArrayGetIntegerTypmods()

Right --- the decoding work will only have to happen in one place for our existing uses. Is it worth providing an ArrayGetStringTypmods in core, when it won't be used by any existing core datatypes?

regards, tom lane
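To make the "decoding in one place" point concrete, here is a hedged Python model of what a helper in the spirit of ArrayGetIntegerTypmods() might do once typmods arrive as strings. The real helper would be C code operating on a cstring array; `array_get_integer_typmods` and `length_typmodin` below are hypothetical stand-ins, and the error messages are invented for illustration.

```python
from typing import List


def array_get_integer_typmods(args: List[str]) -> List[int]:
    """Decode string typmods into integers in one shared place."""
    result = []
    for arg in args:
        try:
            result.append(int(arg, 10))
        except ValueError:
            # Anything int() cannot parse is not a valid integer typmod.
            raise ValueError(f"type modifier must be an integer: {arg!r}") from None
    return result


def length_typmodin(args: List[str]) -> int:
    """Typmodin for a hypothetical length-limited type, e.g. mytype(10)."""
    values = array_get_integer_typmods(args)
    if len(values) != 1:
        raise ValueError("expected exactly one type modifier")
    if values[0] < 1:
        raise ValueError("length must be at least 1")
    return values[0]
```

Types that only ever want integers keep a one-line call to the shared helper, so the string-based signature costs them little beyond the conversion itself.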

Stephen Frost <sfrost@snowman.net> writes:
> Would this allow for 'multi-value' typmods for user-defined types?

If you can squeeze them into 31 bits of stored typmod, yes. That may mean that you still need the side table (with stored typmod being a lookup key for the table). But this gets you out of exposing that detail to users.

regards, tom lane
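A minimal sketch of the side-table idea, assuming the stored typmod is simply a key into a table of modifier tuples. This is an in-memory Python model, not PostgreSQL code: a real implementation would keep the table in the database and would have to deal with concurrency and dump/restore, none of which is modelled here. `TypmodRegistry` and its methods are hypothetical names.

```python
from typing import Dict, List, Tuple


class TypmodRegistry:
    """In-memory stand-in for a side table keyed by the stored typmod."""

    MAX_TYPMOD = 2**31 - 1  # largest non-negative value an int32 can hold

    def __init__(self) -> None:
        self._by_key: List[Tuple[str, ...]] = []
        self._by_value: Dict[Tuple[str, ...], int] = {}

    def typmodin(self, args: List[str]) -> int:
        """Return the key for these modifiers, allocating one if needed."""
        value = tuple(args)
        key = self._by_value.get(value)
        if key is None:
            key = len(self._by_key)
            if key > self.MAX_TYPMOD:
                raise OverflowError("no typmod keys left")
            self._by_key.append(value)
            self._by_value[value] = key
        return key

    def typmodout(self, typmod: int) -> str:
        """Look the key up again and render the original modifiers."""
        return "(" + ",".join(self._by_key[typmod]) + ")"
```

The user writes and reads the full modifier list; only the key is stored, which is the sense in which the side table stops being an exposed detail.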

> Is it worth providing an ArrayGetStringTypmods in core, when it won't
> be used by any existing core datatypes?

I don't think so - cstring[] is a set of strings itself. I don't believe that we could suggest something commonly useful without some real-world examples.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Would this allow for 'multi-value' typmods for user-defined types?
>
> If you can squeeze them into 31 bits of stored typmod, yes. That
> may mean that you still need the side table (with stored typmod being a
> lookup key for the table). But this gets you out of exposing that
> detail to users.

I see, the user could put in:

geometry(123456789,MULTIPOLYGON,3);

But we'd only get 31 bits of room to encode that into. I'm not sure if that's enough. This is what is currently kept in the side-table:

SRID (integer)
TYPE (varchar(30))
DIMENSIONS (integer)

Now, the type is a small enumerated set, and we can probably limit dimensions to a few bits (maybe one for 2d/3d, but we might have some other cases...), and still be following the OGC standard, but I don't think there are any restrictions on SRID beyond '32 bit integer'. As such, I'm not sure if we can encode it all directly into 31 bits (which would obviously be preferred to a side-table with each case we come across being enumerated in it). Then again, at the *moment*, anyway, the SRIDs we have only go up to about 32,000, so we could dedicate 16 bits to it and probably be alright.

Any chance of this being increased? Obviously we would like to avoid the side-table, if possible.

Thanks!

Stephen
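The bit budget discussed above can be sketched as follows. The layout is an assumption made up for illustration (16 bits of SRID as suggested in the message, plus 4 bits of type and 2 bits of dimension, for 22 of the 31 usable bits); it is not PostGIS's actual encoding, and the `GEOMETRY_TYPES` list and function names are hypothetical. Python is used only as a model of the arithmetic.

```python
from typing import Tuple

# Illustrative list of type names; position in the list is the stored code.
GEOMETRY_TYPES = [
    "POINT", "LINESTRING", "POLYGON", "MULTIPOINT",
    "MULTILINESTRING", "MULTIPOLYGON", "GEOMETRYCOLLECTION",
]

SRID_BITS, TYPE_BITS, DIM_BITS = 16, 4, 2  # assumed split, 22 bits total


def pack_geometry_typmod(srid: int, gtype: str, dims: int) -> int:
    """Pack (srid, type, dimensions) into one non-negative integer."""
    if not 0 <= srid < (1 << SRID_BITS):
        raise ValueError(f"SRID {srid} does not fit in {SRID_BITS} bits")
    if gtype not in GEOMETRY_TYPES:
        raise ValueError(f"unknown geometry type: {gtype!r}")
    if not 2 <= dims <= 4:
        raise ValueError("dimensions must be 2, 3 or 4")
    type_code = GEOMETRY_TYPES.index(gtype)
    dim_code = dims - 2  # 2..4 stored as 0..2
    return (srid << (TYPE_BITS + DIM_BITS)) | (type_code << DIM_BITS) | dim_code


def unpack_geometry_typmod(typmod: int) -> Tuple[int, str, int]:
    """Reverse pack_geometry_typmod()."""
    dim_code = typmod & ((1 << DIM_BITS) - 1)
    type_code = (typmod >> DIM_BITS) & ((1 << TYPE_BITS) - 1)
    srid = typmod >> (TYPE_BITS + DIM_BITS)
    return srid, GEOMETRY_TYPES[type_code], dim_code + 2
```

Under this layout an SRID such as 4326 packs without trouble, while the 123456789 from the example above is rejected, which is exactly the limitation being raised: a direct encoding works only as long as SRIDs stay within the bits set aside for them.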

On Friday, 15 June 2007 18:14, Tom Lane wrote:
> The current discussion about the tsearch-in-core patch has convinced me
> that there are plausible use-cases for typmod values that aren't simple
> integers. For instance it could be sane for a type to want a locale or
> language selection as a typmod, eg tsvector('ru') or tsvector('sv').

That would also be very useful for the XML type with an optional XML schema modification. I guess in a lot of use cases you would have to store the mapping in a side table, if the typmod on disk remains an integer.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Stephen Frost <sfrost@snowman.net> writes:
> Any chance of this being increased?

No. Changing typmod to something other than int32 would require many thousands of lines of diffs just in the core distro. I don't even want to think about how much outside code would break.

regards, tom lane