Unix Technical Forum

Re: [PATCHES] Enums patch v2

This is a discussion on Re: [PATCHES] Enums patch v2 within the pgsql Hackers forums, part of the PostgreSQL category; --> Heikki Linnakangas <heikki@enterprisedb.com> writes: > Ignoring my general dislike of enums, I have a few issues with the patch ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 07:02 AM
Tom Lane
 
Posts: n/a
Default Re: [PATCHES] Enums patch v2

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Ignoring my general dislike of enums, I have a few issues with the patch
> as it is:


> 1. What's the point of having comparison operators for enums? For most
> use cases, there's no natural ordering of enum values.


If you would like to be able to index enum columns, or even GROUP BY one,
you need those; whether the ordering is arbitrary or not is irrelevant.

> 2. The comparison routine compares oids, right? If the oids wrap around
> when the enum values are created, the ordering isn't what the user expects.


This is a fair point --- it'd be better if the ordering were not
dependent on chance OID assignments. Not sure what we are willing
to pay to have that though.

> 3. 4 bytes per value is wasteful if you're storing simple status codes
> etc.


I've forgotten exactly which design Tom is proposing to implement here,
but at least one of the contenders involved storing an OID that would be
unique across all enum types. 1 byte is certainly not enough for that
and even 2 bytes would be pretty marginal. I'm unconvinced by arguments
about 2 bytes being so much better than 4 anyway --- in the majority of
real table layouts, the hoped-for savings would disappear into alignment
padding.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 07:02 AM
Tom Lane
 
Posts: n/a
Default Re: [PATCHES] Enums patch v2

Alvaro Herrera <alvherre@commandprompt.com> writes:
> I don't, because there are always those that are knowledgeable enough to
> know how to reduce space lost to padding. So it would be nice to have
> 2-byte enums on-disk, and resolve them based on the column's typid. But
> then, I'm not familiar with the patch at all so I'm not sure if it's
> possible.


Remember that the value has to be decodable by the output routine.
So the only way we could do that would be by creating a separate output
function for each enum type. (That is, a separate pg_proc entry
.... they could all point at the same C function, which would have to
check which OID it was called as and work backward to determine the enum
type.)

While this is doubtless doable, it's slow, it bloats pg_proc, and
frankly no argument has been offered that's compelling enough to
require it. The alignment issue takes enough air out of the
space-saving argument that it doesn't seem sufficient to me.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 07:03 AM
Andrew Dunstan
 
Posts: n/a
Default column ordering, was Re: [PATCHES] Enums patch v2

Gregory Stark wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>
>
>> I'm not a big fan of ordering columns to optimise record layout, except in the
>> most extreme cases (massive DW type apps). I think visible column order should
>> be logical, not governed by physical considerations.
>>

>
> Well as long as we're talking "should"s the database should take care of this
> for you anyways.
>
>


Sure, but the only sane way I can think of to do that would be have
separate logical and physical orderings, with a map between the two. I
guess we'd need to see what the potential space savings would be and
establish what the processing overhead would be, before considering it.
One side advantage would be that it would allow us to do the often
requested "add column at position x".

cheers

andrew

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 07:03 AM
Martijn van Oosterhout
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

On Tue, Dec 19, 2006 at 10:48:41AM -0500, Andrew Dunstan wrote:
> Sure, but the only sane way I can think of to do that would be have
> separate logical and physical orderings, with a map between the two. I
> guess we'd need to see what the potential space savings would be and
> establish what the processing overhead would be, before considering it.
> One side advantage would be that it would allow us to do the often
> requested "add column at position x".


A patch to allow seperate physical and logical orderings was submitted
and rejected. Unless something has changed on that front, any
discussion in this direction isn't really useful.

Once this is possible it would allow a lot of simple savings. For
example, shifting all fixed width fields to the front means they can
all be accessed without looping through the previous columns, for
example.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFiBJgIB7bNG8LQkwRAj18AJwL6WHWR2f1bTtVpESMv7 xCgeWLvgCfVjfp
aLQIw3JfB5dWcr50vOKpaAk=
=FPWA
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 07:04 AM
Robert Treat
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

On Tuesday 19 December 2006 11:25, Martijn van Oosterhout wrote:
> On Tue, Dec 19, 2006 at 10:48:41AM -0500, Andrew Dunstan wrote:
> > Sure, but the only sane way I can think of to do that would be have
> > separate logical and physical orderings, with a map between the two. I
> > guess we'd need to see what the potential space savings would be and
> > establish what the processing overhead would be, before considering it.
> > One side advantage would be that it would allow us to do the often
> > requested "add column at position x".

>
> A patch to allow seperate physical and logical orderings was submitted
> and rejected. Unless something has changed on that front, any
> discussion in this direction isn't really useful.
>


The patch was rejected on technical means, and the author decided it was too
much work to finish it. If someone wanted to try and complete that work I
don't think anyone would stand against it.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 07:04 AM
Tom Lane
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

Robert Treat <xzilla@users.sourceforge.net> writes:
> On Tuesday 19 December 2006 11:25, Martijn van Oosterhout wrote:
>> A patch to allow seperate physical and logical orderings was submitted
>> and rejected. Unless something has changed on that front, any
>> discussion in this direction isn't really useful.


> The patch was rejected on technical means, and the author decided it was too
> much work to finish it. If someone wanted to try and complete that work I
> don't think anyone would stand against it.


Apparently you don't remember the discussion. The fundamental objection
to it was that it would create a never-ending source of bugs, ie, using
the logical column number where the physical number was required or vice
versa. Even assuming that we could eliminate all such bugs in the code
base at any instant, what would prevent introduction of another such bug
in every patch? Most ordinary test cases would fail to expose the
difference.

If you can show me a reasonably bulletproof or machine-checkable way to
keep the two kinds of column numbers distinct, I'd be all for it. But
without that, the answer will remain no.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 07:04 AM
Stephen Frost
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> If you can show me a reasonably bulletproof or machine-checkable way to
> keep the two kinds of column numbers distinct, I'd be all for it. But
> without that, the answer will remain no.


Force references to go through macros which implement the lookup for the
appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic. I guess my
feeling on how this would be approached would be that there'd simply be
a level where logical columns are used and a seperate level where
physical columns are used. Perhaps the storage layer isn't well enough
abstracted for that though. Another possibility would be to declare
seperate structures for them (or do something else along those lines,
aka, whatever it is the Linux kernel does) and get the compiler to whine
whenever the typing isn't followed correctly.

Just tossing some thoughts out there, I'd *really* like to have
movable-columns and the ability to add columns in where they're most
appropriate instead of off on the end... If we can settle on an
approach to deal with Tom's concern I'd be willing to look at updating
the patch to implement it though it's not really high enough that I can
promise anything.

Thanks,

Stephen

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFiLhCrzgMPqB3kigRAmnDAJ4yaeJBt/e51fiDgwNtCN33jKJAxQCfUjmE
iQfSU7L+5JGca1HA1fhKBOg=
=wsiD
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 07:04 AM
Tom Lane
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

Stephen Frost <sfrost@snowman.net> writes:
> Force references to go through macros which implement the lookup for the
> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.


It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it. (And no, that's not a vote to move to C++ ...)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 07:05 AM
Russell Smith
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>
>> Force references to go through macros which implement the lookup for the
>> appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
>> PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.
>>

>
> It doesn't really address the question of how you know which one to
> use at any particular line of code; or even more to the point, what
> mechanism will warn you if you use the wrong one.
>
> My gut feeling about this is that we could probably enforce such a
> distinction if we were using C++, but while coding in C I have no
> confidence in it. (And no, that's not a vote to move to C++ ...)
>

What about a comprimise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an
existing column will require the entire table to be rewritten. This may
take a significant amount of time for a large table; and it will
temporarily require double the disk space.


Now, we are rewriting the table from scratch anyway, the on disk format
is changing. What is stopping us from switching the column order at the
same time. The only thing I can think is that the catalogs will need
more work to update them. It's a middle sized price to pay for being
able to reorder the columns in the table. One of the problems I have is
wanting to add a column in the middle of the table, but FK constraints
stop me dropping the table to do the reorder. If ALTER TABLE would let
me stick it in the middle and rewrite the table on disk, I wouldn't
care. It's likely that I would be rewriting the table anyway. And by
specifying AT POSITION, or BEFORE/AFTER you know for big tables it's
going to take a while.

Not that I'm able to code this at all, but I'm interested in feedback on
this option.

Regards

Russell Smith
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate
>
>
>



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 07:05 AM
Martijn van Oosterhout
 
Posts: n/a
Default Re: column ordering, was Re: [PATCHES] Enums patch v2

On Tue, Dec 19, 2006 at 11:29:24PM -0500, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Force references to go through macros which implement the lookup for the
> > appropriate type? ie: LOGICAL_COL(table_oid,2) vs.
> > PHYSICAL_COL(table_oid,1) Perhaps that's too simplistic.

>
> It doesn't really address the question of how you know which one to
> use at any particular line of code; or even more to the point, what
> mechanism will warn you if you use the wrong one.


There's one method: Set it up so that when you create a table, it
randomizes the order of the fields on disk. Obviously for production
this isn't smart, but it would test the code a lot. Though in the
regression tests many tables only have one column so they won't be
affected.

If we had unit tests you could create a function called
"heap_mangle_tuple" which simply does physical reordering but logically
does nothing and feed it in at each point to check the code is
invarient.

Another approach is to number logical columns starting at 1000. This
would mean that at a glance you could tell what you're talking about.
And code using the wrong one will do something obviously bad. If
performance is an issue you could only enable the offset for
--enable-assert builds.

Personally I like this approach because it would "encourage" everyone
to use the macro to access the fields, since not doing so will place a
constant in an obvious place. It's also trivial for the system to
check.

Personally I'm unsure of the scope of the problem. AFAICS there's
hardly anywhere that would use physical offsets...

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFiSdeIB7bNG8LQkwRAsxTAJ4xMIyHEKUICRfbmJKbkP cv2TLpdgCbBolh
bsDOIqUnnF7zpvFv2wXFrpk=
=tW6U
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:11 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com