Unix Technical Forum

Re: [PERFORM] Hints proposal

This is a discussion on Re: [PERFORM] Hints proposal within the pgsql Hackers forums, part of the PostgreSQL category; --> [ This is off-topic for -performance, please continue the thread in -hackers ] "Jim C. Nasby" <jim@nasby.net> writes: > ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 06:16 AM
Tom Lane
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

[ This is off-topic for -performance, please continue the thread in
-hackers ]

"Jim C. Nasby" <jim@nasby.net> writes:
> These hints would outright force the planner to do things a certain way.
> ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */


This proposal seems to deliberately ignore every point that has been
made *against* doing things that way. It doesn't separate the hints
from the queries, it doesn't focus on fixing the statistical or cost
misestimates that are at the heart of the issue, and it takes no account
of the problem of hints being obsoleted by system improvements.

> It would also be useful to allow tweaking of planner cost estimates.
> This would take the general form of
> node operator value


This is at least focusing on the right sort of thing, although I still
find it completely misguided to be attaching hints like this to
individual queries.

What I would like to see is information *stored in a system catalog*
that affects the planner's cost estimates. As an example, the DBA might
know that a particular table is touched sufficiently often that it's
likely to remain RAM-resident, in which case reducing the page fetch
cost estimates for just that table would make sense. (BTW, this is
something the planner could in principle know, but we're unlikely to
do it anytime soon, for a number of reasons including a desire for plan
stability.) The other general category of thing I think we need is a
way to override selectivity estimates for particular forms of WHERE
clauses.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 06:16 AM
Merlin Moncure
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

On 10/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ This is off-topic for -performance, please continue the thread in
> -hackers ]


> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way. It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.


what about extending the domain system so that we can put in ranges
that override the statistics or (imo much more importantly) provide
information when the planner would have to restort to a guess. my case
for this is prepared statements with a parameterized limit clause.

prepare foo(l int) as select * from bar limit $1;

maybe:
create domain foo_lmt as int hint 1; -- probably needs to be fleshed out
prepare foo(l foolmt) as select * from bar limit $1;

this says: "if you have to guess me, please use this"

what I like about this over previous attempts to persuade you is the
grammar changes are localized and also imo future proofed. planner can
ignore the hints if they are not appropriate for the oparation.

merlin

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 06:16 AM
Jim C. Nasby
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

On Thu, Oct 12, 2006 at 11:42:32AM -0400, Tom Lane wrote:
> [ This is off-topic for -performance, please continue the thread in
> -hackers ]
>
> "Jim C. Nasby" <jim@nasby.net> writes:
> > These hints would outright force the planner to do things a certain way.
> > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */

>
> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way. It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.


Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See
also my comment below.

> > It would also be useful to allow tweaking of planner cost estimates.
> > This would take the general form of
> > node operator value

>
> This is at least focusing on the right sort of thing, although I still
> find it completely misguided to be attaching hints like this to
> individual queries.


Yes, but as I mentioned the idea here was to come up with something that
is (hopefully) easy to define and implement. In other words, something
that should be doable for 8.3. Because this proposal essentially amounts
to limiting plans the planner will consider and tweaking it's cost
estimates, I'm hoping that it should be (relatively) easy to implement.

> What I would like to see is information *stored in a system catalog*
> that affects the planner's cost estimates. As an example, the DBA might
> know that a particular table is touched sufficiently often that it's
> likely to remain RAM-resident, in which case reducing the page fetch
> cost estimates for just that table would make sense. (BTW, this is
> something the planner could in principle know, but we're unlikely to
> do it anytime soon, for a number of reasons including a desire for plan
> stability.)


All this stuff is great and I would love to see it! But this is all so
abstract that I'm doubtful this could make it into 8.4, let alone 8.3.
Especially if we want a comprehensive system that will handle most/all
cases. I don't know if we even have a list of all the cases we need to
handle.

> The other general category of thing I think we need is a
> way to override selectivity estimates for particular forms of WHERE
> clauses.


I hadn't thought about that for hints, but it would be a good addition.
I think the stats-tweaking model would work, but we'd probably want to
allow "=" as well (which could go into the other stats tweaking hints as
well).

.... WHERE a = b /* SELECTIVITY {+|-|*|/|=} value */
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 06:16 AM
Csaba Nagy
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

OK, I just have to comment...

"Jim C. Nasby" <jim@nasby.net> writes:
> > These hints would outright force the planner to do things a certain way.
> > ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */

>
> This proposal seems to deliberately ignore every point that has been
> made *against* doing things that way. It doesn't separate the hints
> from the queries, it doesn't focus on fixing the statistical or cost
> misestimates that are at the heart of the issue, and it takes no account
> of the problem of hints being obsoleted by system improvements.


But whatever arguments you made about planner improvements and the like,
it will NEVER be possible to correctly estimate in all cases the
statistics for a query, even if you perfectly know WHAT statistics you
need, which is also not the case all the time.

Tom, you're the one who knows best how the planner works... can you bet
anything you care about on the fact that one day the planner will never
ever generate a catastrophic plan without DBA tweaking ? And how far in
time we'll get to that point ?

Until that point is achieved, the above proposal is one of the simplest
to understand for the tweaking DBA, and the fastest to deploy when faced
with catastrophic plans. And I would guess it is one of the simplest to
be implemented and probably not very high maintenance either, although
this is just a guess.

If I could hint some of my queries, I would enable anonymous prepared
statements to take into account the parameter values, but I can't
because that results in runaway queries every now and then, so I had to
force postgres generate generic queries without knowing anything about
parameter values... so the effect for me is an overall slower postgres
system because I couldn't fix the particular problems I had and had to
tweak general settings. And when I have a problem I can't wait until the
planner is fixed, I have to solve it immediately... the current means to
do that are suboptimal.

The argument that planner hints would hide problems from being solved is
a fallacy. To put a hint in place almost the same amount of analysis is
needed from the DBA as solving the problem now, so users who ask now for
help will further do it even in the presence of hints. The ones who
wouldn't are not coming for help now either, they know their way out of
the problems... and the ones who still report a shortcoming of the
planner will do it with hints too.

I would even say it would be an added benefit, cause then you could
really see how well a specific plan will do without having the planner
capable to generate alone that plan... so knowledgeable users could come
to you further down the road when they know where the planner is wrong,
saving you time.

I must say it again, this kind of query-level hinting would be the
easiest to understand for the developers... there are many
trial-end-error type of programmers out there, if you got a hint wrong,
you fix it and move on, doesn't need to be perfect, it just have to be
good enough. I heavily doubt that postgres will get bad publicity
because user Joe sot himself in the foot by using bad hints... the
probability for that is low, you must actively put those hints there,
and if you take the time to do that then you're not the average Joe, and
probably not so lazy either, and if you're putting random hints, then
you would probably mess it up some other way anyway.

And the thing about missing new features is also not very founded. If I
would want to exclude a full table scan on a specific table for a
specific query, than that's about for sure that I want to do that
regardless what new features postgres will offer in the future. Picking
one specific access method is more prone to missing new access methods,
but even then, when I upgrade the DB server to a new version, I usually
have enough other compatibility problems (till now I always had some on
every upgrade I had) that making a round of upgrading hints is not an
outstanding problem. And if the application works good enough with
suboptimal plans, why would I even take that extra effort ?

I guess the angle is: I, as a practicing DBA would like to be able to
experiment and get most out of the imperfect tool I have, and you, the
developers, want to make the tool perfect... I don't care about perfect
tools, it just have to do the job... hints or anything else, if I can
make it work GOOD ENOUGH, it's all fine. And hints is something I would
understand and be able to use.

Thanks for your patience if you're still reading this...

Cheers,
Csaba.



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 06:16 AM
Tom Lane
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

"Jim C. Nasby" <jim@nasby.net> writes:
> Yes, but as I mentioned the idea here was to come up with something that
> is (hopefully) easy to define and implement. In other words, something
> that should be doable for 8.3.


Sorry, but that is not anywhere on my list of criteria for an important
feature. Having to live with a quick-and-dirty design for the
foreseeable future is an ugly prospect --- and anything that puts hints
into application code is going to lock us down to supporting it forever.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 06:16 AM
Josh Berkus
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

Jim,

>>> These hints would outright force the planner to do things a certain way.
>>> ... FROM table /* ACCESS {SEQSCAN | [[NO] BITMAP] INDEX index_name} */

>> This proposal seems to deliberately ignore every point that has been
>> made *against* doing things that way. It doesn't separate the hints
>> from the queries, it doesn't focus on fixing the statistical or cost
>> misestimates that are at the heart of the issue, and it takes no account
>> of the problem of hints being obsoleted by system improvements.

>
> Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See
> also my comment below.


I don't see how adding extra tags to queries is easier to implement than
an ability to modify the system catalogs. Quite the opposite, really.

And, as I said, if you're going to push for a feature that will be
obsolesced in one version, then you're going to have a really rocky row
to hoe.

> Yes, but as I mentioned the idea here was to come up with something that
> is (hopefully) easy to define and implement. In other words, something
> that should be doable for 8.3. Because this proposal essentially amounts
> to limiting plans the planner will consider and tweaking it's cost
> estimates, I'm hoping that it should be (relatively) easy to implement.


Even I, the chief marketing geek, am more concerned with getting a
feature that we will still be proud of in 5 years than getting one in
the next nine months. Keep your pants on!

I actually think the way to attack this issue is to discuss the kinds of
errors the planner makes, and what tweaks we could do to correct them.
Here's the ones I'm aware of:

-- Incorrect selectivity of WHERE clause
-- Incorrect selectivity of JOIN
-- Wrong estimate of rows returned from SRF
-- Incorrect cost estimate for index use

Can you think of any others?

I also feel that a tenet of the design of the "planner tweaks" system
ought to be that the tweaks are collectible and analyzable in some form.
This would allow DBAs to mail in their tweaks to -performance or
-hackers, and then allow us to continue improving the planner.

--Josh Berkus





---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 06:16 AM
Josh Berkus
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

Csaba,

> I guess the angle is: I, as a practicing DBA would like to be able to
> experiment and get most out of the imperfect tool I have, and you, the
> developers, want to make the tool perfect... I don't care about perfect
> tools, it just have to do the job... hints or anything else, if I can
> make it work GOOD ENOUGH, it's all fine. And hints is something I would
> understand and be able to use.


Hmmm, if you already understand Visual Basic syntax, should we support
that too? Or maybe we should support MySQL's use of '0000-00-00' as the
"zero" date because people "understand" that?

We're just not going to adopt a bad design because Oracle DBAs are used
to it. If we wanted to do that, we could shut down the project and
join a proprietary DB staff.

The current discussion is:

a) Planner tweaking is sometimes necessary;
b) Oracle HINTS are a bad design for planner tweaking;
c) Can we come up with a good design for planner tweaking?

So, how about suggestions for a good design?

--Josh Berkus


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 06:16 AM
Csaba Nagy
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

> Hmmm, if you already understand Visual Basic syntax, should we support
> that too? Or maybe we should support MySQL's use of '0000-00-00' as the
> "zero" date because people "understand" that?


You completely misunderstood me... I have no idea about oracle hints,
never used Oracle in fact. My company uses oracle, but I have only very
very limited contact with oracle issues, and never touched a hint.

I'm only talking about ease of use, learning curves, and complexity in
general. While I do like the idea of an all automatic system optimizer
which takes your query portofolio and analyzes the data based on those
queries and creates you all the indexes you need and all that, that's
not gonna happen soon, because it's a very complex thing to implement.

The alternative is that you take your query portofolio, analyze it
yourself, figure out what statistics you need, create indexes, tweak
queries, hint the planner for correlations and stuff... which is a
complex task, and if you have to tell the server about some correlations
with the phase of the moon, you're screwed cause there will never be any
DB engine which will understand that.

But you always can put the corresponding hint in the query when you know
the correlation is there...

The problem is that the application sometimes really knows better than
the server, when the correlations are not standard.

> We're just not going to adopt a bad design because Oracle DBAs are used
> to it. If we wanted to do that, we could shut down the project and
> join a proprietary DB staff.


I have really nothing to do with Oracle. I think you guys are simply too
blinded by Oracle hate... I don't care about Oracle.

> The current discussion is:
>
> a) Planner tweaking is sometimes necessary;
> b) Oracle HINTS are a bad design for planner tweaking;


While there are plenty of arguments you made against query level hints
(can we not call them Oracle-hints ?), there are plenty of users of
postgres who expressed they would like them. I guess they were tweaking
postgres installations when they needed it, and not Oracle
installations. I expressed it clearly that for me query level hinting
would give more control and better understanding of what I have to do
for the desired result. Perfect planning -> forget it, I only care about
good enough with reasonable tuning effort. If I have to tweak statistics
I will NEVER be sure postgres will not backfire on me again. On the
other hand if I say never do a seq scan on this table for this query, I
could be sure it won't...

> c) Can we come up with a good design for planner tweaking?


Angles again: good enough now is better for end users, but programmers
always go for perfect tomorrow... pity.

Cheers,
Csaba.



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 06:16 AM
Tom Lane
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

Csaba Nagy <nagy@ecircle-ag.com> writes:
> Until that point is achieved, the above proposal is one of the simplest
> to understand for the tweaking DBA, and the fastest to deploy when faced
> with catastrophic plans. And I would guess it is one of the simplest to
> be implemented and probably not very high maintenance either, although
> this is just a guess.


That guess is wrong ... but more to the point, if you think that "simple
and easy to implement" should be the overriding concern for designing a
new feature, see mysql. They've used that design approach for years and
look what a mess they've got. This project has traditionally done
things differently and I feel no need to change that mindset now.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 06:16 AM
Andrew Sullivan
 
Posts: n/a
Default Re: [PERFORM] Hints proposal

On Thu, Oct 12, 2006 at 11:25:25AM -0500, Jim C. Nasby wrote:
> Yes, but it does one key thing: allows DBAs to fix problems *NOW*. See
> also my comment below.


If I may argue in the other direction, speaking as one whose career
(if we may be generous enough to call it that) has been pretty much
exclusively on the operations end of things, I think that's an awful
idea.

There are two ways that quick-fix solve-the-problem-now hints are
going to be used. One is in the sort of one-off query that a DBA has
to run from time to time, that takes a long time, but that isn't
really a part of regular application load. The thing is, if you
already know your data well enough to provide a useful hint, you also
know your data well enough to work around the problem in the short
run (with some temp table tricks and the like).

The _other_ way it's going to be used is as a stealthy alteration to
regular behaviour, to solve a particular nasty performance problem
that happens to result on a given day. And every single time I've
seen anything like that done, the long term effect is always
monstrous. Two releases later, all your testing and careful
inspection and planning goes to naught one Saturday night at 3 am
(because we all know computers know what time it is _where you are_)
when the one-off trick that you pulled last quarter to solve the
manager's promise (which was made while out golfing, so nobody wrote
anything down) turns out to have a nasty effect now that the data
distribution is different. Or you think so. But now you're not
sure, because the code was tweaked a little to take some advantage of
something you now have because of the query plans that you ended up
getting because of the hint that was there because of the golf game,
so now if you start fiddling with the hints, maybe you break
something else. And you're tired, but the client is on the phone
from Hong King _right now_.

The second case is, from my experience, exactly the sort of thing you
want really a lot when the golf game is just over, and the sort of
thing you end up kicking yourself for in run-on sentences in the
middle of the night six months after the golf game is long since
forgotten.

The idea for knobs on the planner that allows the DBA to give
directed feedback, from which new planner enhancements can also come,
seems to me a really good idea. But any sort of quick and dirty hint
for right now gives me the willies.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 12:03 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com