Unix Technical Forum

Re: [COMMITTERS] pgsql: Update: < * Allow adding enumeratedvalues to an existing

This is a discussion on Re: [COMMITTERS] pgsql: Update: < * Allow adding enumeratedvalues to an existing within the pgsql Hackers forums, part of the PostgreSQL category; --> "Tom Dunstan" <pgsql@tomd.cc> writes: > One scenario I'm not happy about is this: the friendly db admin has > ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read

Reply

 

LinkBack Thread Tools Display Modes
  #11 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Lane
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

"Tom Dunstan" <pgsql@tomd.cc> writes:
> One scenario I'm not happy about is this: the friendly db admin has
> happily added an extra value to the end before and the operation has
> been a snap - no rewriting required. But this time either a) oid
> wraparound has occurred, b) she's inserted one or c) she's reordered
> them. Bam - we start rewriting the entire database. That's not the
> kind of surprise I like giving people, and the current situation of
> either don't allow updates at all, or the alternative to surprises of
> always rewrite everything seem pretty deficient. And I don't want to
> only allow updates if they won't cause a rewrite, it's
> nondeterministic.


If we take OIDs out of the picture it wouldn't be nondeterministic.

I think with something like your 16bit/16bit design, and say ten free
codes between each original assignment, it'd be okay to not support the
rewriting stuff at all. The frequency with which people would hit the
restriction would be so low it wouldn't be worth supporting the code for
it, especially since we couldn't do it any more efficiently than a
manual ALTER COLUMN TYPE replacement would.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 04-29-2008, 08:30 PM
Brendan Jurd
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 26, 2008 at 6:33 AM, Tom Dunstan wrote:
> One scenario I'm not happy about is this: the friendly db admin has
> happily added an extra value to the end before and the operation has
> been a snap - no rewriting required. But this time either a) oid
> wraparound has occurred, b) she's inserted one or c) she's reordered
> them. Bam - we start rewriting the entire database.


As long as the documentation is candid about this, I don't think it's
a show-stopper. e.g.:

N.B. Rearranging an ENUM will usually be a simple operation, but
in $CERTAIN_CASES may require a rewrite of tables using the ENUM,
which is time consuming and locks the table against writing ...

You'd probably also want a "NOTICE: Change to ENUM will require
rewriting of tables." to be emitted when this happens.

>
> I've already suggested some alternatives in the reply to Brendan that
> would solve some of this, but I suppose another gross-seeming way to
> stop surprise rewrites would be to never do one unless given a FORCE
> REWRITE clause on the ALTER statement or something like that, and fail
> if a rewrite is required not specified.
>


That would be okay too, but I think I'd prefer proceeding with the
rewrite after emitting a NOTICE. If the db admin decides not to go
ahead, or wait to do it after hours, she can always hit ^C, right?

Cheers,
BJ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFIEkO+5YBsbHkuyV0RAttIAJ9TNhNDN8SAsfyAR5MY9l ppPyeWSQCfYOSs
kG25F0V44QqTZ4HMAWXL5JI=
=tG5q
-----END PGP SIGNATURE-----

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Dunstan
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

On Sat, Apr 26, 2008 at 2:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The very first idea that I had was to have an enum value as
> > the combination of both an enum id and the ordinal value.

>
> I seem to remember that we discussed that and rejected it, but I don't
> remember the reasoning...


I don't think there was any terribly strong objection. IIRC I
originally proposed trying to fit everything into 2 bytes, you
objected to that as "unnecessary bit-shaving" and proposed 8 bytes, I
didn't want to give up pass-by-value, plus my initial pg_enum design
was rather denormalized - the current solution was a compromise that
fixed that and kept everyone happy. But we didn't really consider
updates too carefully. Maybe it was just a bit too cute a solution.

So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":

1 - We space the values out as evenly as we can across the 65000ish
range and allow people to delete, insert and append, but not reorder.
If they do the above gratuitously we might have to do a rewrite, but
they'll have to get fairly busy to do it. Rewrite would be required
for reorderings.

2- We totally give up the idea of storing a value on disk that is
directly comparable (other than equality), and simply number from zero
up, using that number to index into an array (or use as syscache key
or whatever) containing the real ordering information. We can then
reorder or do any other operations to our heart's content.

I'm actually favouring option 2 - I think it can be done in such a way
as to not be much of an overhead compared to the status quo, and you
know that if we don't implement proper reordering now, someone will
ask for it, and we'll be having this discussion at a similar time
after 8.4 goes out.

I'm happy to work on a patch for this if it meets general approval.

Cheers

Tom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Dunstan
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

Oops, sorry for the crossed emails, slight delay in my main being received.

On Sat, Apr 26, 2008 at 2:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think with something like your 16bit/16bit design, and say ten free
> codes between each original assignment, it'd be okay to not support the
> rewriting stuff at all.


Yeah. I'm more concerned about someone wanting reordering, hence the
second option in my other mail.

Cheers

Tom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #15 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Lane
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

"Tom Dunstan" <pgsql@tomd.cc> writes:
> 1 - We space the values out as evenly as we can across the 65000ish
> range and allow people to delete, insert and append, but not reorder.
> If they do the above gratuitously we might have to do a rewrite, but
> they'll have to get fairly busy to do it. Rewrite would be required
> for reorderings.


> 2- We totally give up the idea of storing a value on disk that is
> directly comparable (other than equality), and simply number from zero
> up, using that number to index into an array (or use as syscache key
> or whatever) containing the real ordering information. We can then
> reorder or do any other operations to our heart's content.


> I'm actually favouring option 2 -


I'm not ... it strikes me that it will add implementation complexity and
runtime overhead for a feature that two days ago we didn't think we
needed at all, and IMHO one we still shouldn't be thinking to expend a
lot of work on.

I like #1 with no rewrite support. That strikes me as covering 99%
of the requirements with 10% of the work.

Further, as already noted, if you do have to rewrite then a series of
manual ALTER COLUMN TYPE operations would probably be a *better* answer
than a monolithic implementation, because of the locking problems
involved in doing it in one transaction. (Oh, and don't forget the disk
space problem: double the disk space for every table involved,
simultaneously.)

regards, tom lane

PS: no, I do *not* want to hear any proposals for ALTER TYPE
CONCURRENTLY ;-)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #16 (permalink)  
Old 04-29-2008, 08:30 PM
Andrew Dunstan
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumeratedvalues to an existing



Tom Dunstan wrote:
> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>
> 1 - We space the values out as evenly as we can across the 65000ish
> range and allow people to delete, insert and append, but not reorder.
> If they do the above gratuitously we might have to do a rewrite, but
> they'll have to get fairly busy to do it. Rewrite would be required
> for reorderings.
>


Or else we just error out in such cases. As Tom Lane suggests, rewriting
has some nasty deadlock possibilities.

You always have the option of creating a new enum type and moving each
affected column to that type.

> 2- We totally give up the idea of storing a value on disk that is
> directly comparable (other than equality), and simply number from zero
> up, using that number to index into an array (or use as syscache key
> or whatever) containing the real ordering information. We can then
> reorder or do any other operations to our heart's content.
>
> I'm actually favouring option 2 - I think it can be done in such a way
> as to not be much of an overhead compared to the status quo, and you
> know that if we don't implement proper reordering now, someone will
> ask for it, and we'll be having this discussion at a similar time
> after 8.4 goes out.
>
>
>


Being able simply to order by the oid value is fast. That's one of the
current benefits. So I think we'd need some benchmarking to show that
this wouldn't slow things down.

cheers


andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #17 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Dunstan
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

On Sat, Apr 26, 2008 at 2:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not ... it strikes me that it will add implementation complexity and
> runtime overhead for a feature that two days ago we didn't think we
> needed at all, and IMHO one we still shouldn't be thinking to expend a
> lot of work on.


Well, I *did* think it was necessary, I just hadn't spent the effort
in coming up with a solution. And on the effort side, I'm not going to
be hacking the optimizer any day soon.

> I like #1 with no rewrite support. That strikes me as covering 99%
> of the requirements with 10% of the work.


> Further, as already noted, if you do have to rewrite then a series of
> manual ALTER COLUMN TYPE operations would probably be a *better* answer
> than a monolithic implementation, because of the locking problems
> involved in doing it in one transaction.


I don't understand this if it's calling option 2 the monolithic
implementation. I was intending that the values be permanent tokens if
you like, so that ZERO rewriting would be required for any types of
modification. So I don't see where locking comes in. I don't want
rewriting either.

Hmm, maybe I haven't explained it properly. Here's an example:

CREATE TYPE thing AS ENUM('vegetable', 'animal');

Hypothetical new pg_enum:
enum_id | value | order | label
---------+-------+-------+-----------
0 | 0 | 0 | vegetable
0 | 1 | 1 | animal


ALTER TYPE thing AS ENUM('animal', 'mineral', 'vegetable');

pg_enum:
enum_id | value | order | label
---------+-------+-------+-----------
0 | 0 | 2 | vegetable
0 | 1 | 0 | animal
0 | 2 | 1 | mineral

So we store the 'value' column on disk, and it never changes. The
downside is that we now have to look up the order when we call a
non-equality operator, but I reckon we can pretty efficiently cache
that, so the lookup is just a couple of array index operations. The
upside is that we can reorder, and we'll never run out of values
"in-between" existing ones.

Anyway, sorry if all of the above *was* clear and I just misunderstood
the comment. If there's consensus to go with option 1 I'll pursue that
path. It's much less of a change to go from option 1 to option 2 than
the current code to either of them anyway, so doing some benchmarking
of both options shouldn't be hard if I want to.

Cheers

Tom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #18 (permalink)  
Old 04-29-2008, 08:30 PM
Tom Lane
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumerated values to an existing

"Tom Dunstan" <pgsql@tomd.cc> writes:
> On Sat, Apr 26, 2008 at 2:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Further, as already noted, if you do have to rewrite then a series of
>> manual ALTER COLUMN TYPE operations would probably be a *better* answer
>> than a monolithic implementation, because of the locking problems
>> involved in doing it in one transaction.


> I don't understand this if it's calling option 2 the monolithic
> implementation.


No, I was imagining an option-1 implementation trying to support
rewriting of all affected tables in a single "monolithic" command.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #19 (permalink)  
Old 04-29-2008, 08:30 PM
Alvaro Herrera
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow addingenumerated values to an existing

Andrew Dunstan escribió:

> Tom Dunstan wrote:
>> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>>
>> 1 - We space the values out as evenly as we can across the 65000ish
>> range and allow people to delete, insert and append, but not reorder.
>> If they do the above gratuitously we might have to do a rewrite, but
>> they'll have to get fairly busy to do it. Rewrite would be required
>> for reorderings.

>
> Or else we just error out in such cases. As Tom Lane suggests, rewriting
> has some nasty deadlock possibilities.
>
> You always have the option of creating a new enum type and moving each
> affected column to that type.


Another alternative would be internally creating a different temporary
enum, rewriting the tables one by one each on its own transaction, and
finish by dropping the original enum and renaming the temporary one.
This solves the deadlock problem.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #20 (permalink)  
Old 04-29-2008, 08:30 PM
Andrew Dunstan
 
Posts: n/a
Default Re: Re: [COMMITTERS] pgsql: Update: < * Allow adding enumeratedvalues to an existing



Alvaro Herrera wrote:
> Andrew Dunstan escribió:
>
>
>> Tom Dunstan wrote:
>>
>>> So two alternative proposals, both with a 2 byte "enum id" and a 2 byte "value":
>>>
>>> 1 - We space the values out as evenly as we can across the 65000ish
>>> range and allow people to delete, insert and append, but not reorder.
>>> If they do the above gratuitously we might have to do a rewrite, but
>>> they'll have to get fairly busy to do it. Rewrite would be required
>>> for reorderings.
>>>

>> Or else we just error out in such cases. As Tom Lane suggests, rewriting
>> has some nasty deadlock possibilities.
>>
>> You always have the option of creating a new enum type and moving each
>> affected column to that type.
>>

>
> Another alternative would be internally creating a different temporary
> enum, rewriting the tables one by one each on its own transaction, and
> finish by dropping the original enum and renaming the temporary one.
> This solves the deadlock problem.
>
>


What happens when someone tries to join two of the tables, one that has
been converted and one that hasn't? You might not have deadlock, but you
won't have type integrity either, ISTM.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:33 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com