vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, there's a topic that comes up from time to time on the lists, the problem that pgsql functions get planned only once and thereafter the same query plan is used until server shutdown or explicit recreation of the function. I'd like to implement a way of automatic function replanning. I can think of two possible approaches. 1. in a more general way: extend pg_proc by an interval column "ttl" or "replanAfter" and add a function declaration attribute to the parser "... STRICT STABLE REPLAN AFTER '3 days'::interval" + general approach, every language that can pre-compute plans can use this feature, the check can be done in one place for all languages - in fact only plpsql can do that at the moment (right?) and there is no other candidate for something similar at the moment - catalog change that also requires interval to be specially treated while bootstrapping - catalog would grow, every function would have the attribute though it is only applicable for a very low number of functions, let alone the number of functions that would actually use it in a typical installation 2. use the #option feature of plpgsql. Add the possibility to specify #option ttl '3 days' or #option replan-after '1 day 2 hours' + Minor changes, changes only local to plpgsql - plpgsql specific solution - is #option documented at all? Should it stay "unofficial"? If so, why? 3. (not automatic) add a statement that makes pgsql forget the plan and compute a new one when the function gets called again. "The user should rather use cron for doing maintenance tasks." What do you think? Any other ideas? Joachim ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Tue, 2005-12-13 at 22:32 +0100, Joachim Wieland wrote: > there's a topic that comes up from time to time on the lists, the problem > that pgsql functions get planned only once and thereafter the same query > plan is used until server shutdown or explicit recreation of the function. The problem really has nothing to do with functions, per se: whenever a plan is created and then stored for future use, the assumptions made by that plan may be invalidated by the time the plan is executed. This applies to PREPARE, pl/pgsql functions, perhaps the plan caching done by the RI triggers, and so forth. I also think that invalidating cached plans on a periodic basis is the wrong approach -- we can use sinval to invalidate plans as soon as a dependent database object changes and not before. This thread contains some ideas on how to do this: http://archives.postgresql.org/pgsql...3/msg00426.php I got somewhat sidetracked by the complexities of the "central plan caching module" that Tom would like to see, but I'm still hoping to take a look at this for 8.2. -Neil ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| On Tue, Dec 13, 2005 at 04:49:10PM -0500, Neil Conway wrote: > On Tue, 2005-12-13 at 22:32 +0100, Joachim Wieland wrote: > > there's a topic that comes up from time to time on the lists, the problem > > that pgsql functions get planned only once and thereafter the same query > > plan is used until server shutdown or explicit recreation of the function. > > The problem really has nothing to do with functions, per se: whenever a > plan is created and then stored for future use, the assumptions made by > that plan may be invalidated by the time the plan is executed. This > applies to PREPARE, pl/pgsql functions, perhaps the plan caching done by > the RI triggers, and so forth. > > I also think that invalidating cached plans on a periodic basis is the > wrong approach -- we can use sinval to invalidate plans as soon as a > dependent database object changes and not before. This thread contains > some ideas on how to do this: > > http://archives.postgresql.org/pgsql...3/msg00426.php > > I got somewhat sidetracked by the complexities of the "central plan > caching module" that Tom would like to see, but I'm still hoping to take > a look at this for 8.2. As for predicate-driven plan changes (ie: query is planned the first time with a predicate that has high cardinality, but there are also low cardinality values that will be queried on), it would make more sense to track the amount of work (probably tuples fetched) normally required to execute a prepared statement. Any time that prepared statement is executed with a set of predicates that substantially changes the amount of work required it should be remembered and considered for re-planning the next time the query is executed with those predicates. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Good idea, TODO updated: * Flush cached query plans when the dependent objects change or when the cardinality of parameters changes dramatically --------------------------------------------------------------------------- Jim C. Nasby wrote: > On Tue, Dec 13, 2005 at 04:49:10PM -0500, Neil Conway wrote: > > On Tue, 2005-12-13 at 22:32 +0100, Joachim Wieland wrote: > > > there's a topic that comes up from time to time on the lists, the problem > > > that pgsql functions get planned only once and thereafter the same query > > > plan is used until server shutdown or explicit recreation of the function. > > > > The problem really has nothing to do with functions, per se: whenever a > > plan is created and then stored for future use, the assumptions made by > > that plan may be invalidated by the time the plan is executed. This > > applies to PREPARE, pl/pgsql functions, perhaps the plan caching done by > > the RI triggers, and so forth. > > > > I also think that invalidating cached plans on a periodic basis is the > > wrong approach -- we can use sinval to invalidate plans as soon as a > > dependent database object changes and not before. This thread contains > > some ideas on how to do this: > > > > http://archives.postgresql.org/pgsql...3/msg00426.php > > > > I got somewhat sidetracked by the complexities of the "central plan > > caching module" that Tom would like to see, but I'm still hoping to take > > a look at this for 8.2. > > As for predicate-driven plan changes (ie: query is planned the first > time with a predicate that has high cardinality, but there are also low > cardinality values that will be queried on), it would make more sense to > track the amount of work (probably tuples fetched) normally required to > execute a prepared statement. Any time that prepared statement is > executed with a set of predicates that substantially changes the amount > of work required it should be remembered and considered for re-planning > the next time the query is executed with those predicates. > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Is cardinality the only thing we'd need to worry about? My idea was actually to track the amount of work normally required by a stored query plan, and if a query uses that plan but requires a very different amount of work it's a good indication that we either need to replan or store multiple plans for that query. Though if we're certain that cardinality is the only thing that could make a cached plan go bad it would certainly simplify things greatly. On Fri, Dec 16, 2005 at 11:10:43PM -0500, Bruce Momjian wrote: > > Good idea, TODO updated: > > * Flush cached query plans when the dependent objects change or > when the cardinality of parameters changes dramatically > > > --------------------------------------------------------------------------- > > Jim C. Nasby wrote: > > On Tue, Dec 13, 2005 at 04:49:10PM -0500, Neil Conway wrote: > > > On Tue, 2005-12-13 at 22:32 +0100, Joachim Wieland wrote: > > > > there's a topic that comes up from time to time on the lists, the problem > > > > that pgsql functions get planned only once and thereafter the same query > > > > plan is used until server shutdown or explicit recreation of the function. > > > > > > The problem really has nothing to do with functions, per se: whenever a > > > plan is created and then stored for future use, the assumptions made by > > > that plan may be invalidated by the time the plan is executed. This > > > applies to PREPARE, pl/pgsql functions, perhaps the plan caching done by > > > the RI triggers, and so forth. > > > > > > I also think that invalidating cached plans on a periodic basis is the > > > wrong approach -- we can use sinval to invalidate plans as soon as a > > > dependent database object changes and not before. This thread contains > > > some ideas on how to do this: > > > > > > http://archives.postgresql.org/pgsql...3/msg00426.php > > > > > > I got somewhat sidetracked by the complexities of the "central plan > > > caching module" that Tom would like to see, but I'm still hoping to take > > > a look at this for 8.2. > > > > As for predicate-driven plan changes (ie: query is planned the first > > time with a predicate that has high cardinality, but there are also low > > cardinality values that will be queried on), it would make more sense to > > track the amount of work (probably tuples fetched) normally required to > > execute a prepared statement. Any time that prepared statement is > > executed with a set of predicates that substantially changes the amount > > of work required it should be remembered and considered for re-planning > > the next time the query is executed with those predicates. > > -- > > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > > Pervasive Software http://pervasive.com work: 512-231-6117 > > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Have you searched our list archives? > > > > http://archives.postgresql.org > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Jim C. Nasby wrote: > Is cardinality the only thing we'd need to worry about? My idea was > actually to track the amount of work normally required by a stored query > plan, and if a query uses that plan but requires a very different amount > of work it's a good indication that we either need to replan or store > multiple plans for that query. Though if we're certain that cardinality > is the only thing that could make a cached plan go bad it would > certainly simplify things greatly. This gets into another area of re-optimizing when the executor finds that the actual tables don't match the optimizer estimates. I think we decided that was too hard/risky, but invalidating the plan might help, though I am thinking re-planning might just generate the same plan as before. I think something would need to have happened since the last planning, like ANALYZE or something. Updated TODO: * Flush cached query plans when the dependent objects change, when the cardinality of parameters changes dramatically, or when new ANALYZE statistics are available --------------------------------------------------------------------------- > > On Fri, Dec 16, 2005 at 11:10:43PM -0500, Bruce Momjian wrote: > > > > Good idea, TODO updated: > > > > * Flush cached query plans when the dependent objects change or > > when the cardinality of parameters changes dramatically > > > > > > --------------------------------------------------------------------------- > > > > Jim C. Nasby wrote: > > > On Tue, Dec 13, 2005 at 04:49:10PM -0500, Neil Conway wrote: > > > > On Tue, 2005-12-13 at 22:32 +0100, Joachim Wieland wrote: > > > > > there's a topic that comes up from time to time on the lists, the problem > > > > > that pgsql functions get planned only once and thereafter the same query > > > > > plan is used until server shutdown or explicit recreation of the function. > > > > > > > > The problem really has nothing to do with functions, per se: whenever a > > > > plan is created and then stored for future use, the assumptions made by > > > > that plan may be invalidated by the time the plan is executed. This > > > > applies to PREPARE, pl/pgsql functions, perhaps the plan caching done by > > > > the RI triggers, and so forth. > > > > > > > > I also think that invalidating cached plans on a periodic basis is the > > > > wrong approach -- we can use sinval to invalidate plans as soon as a > > > > dependent database object changes and not before. This thread contains > > > > some ideas on how to do this: > > > > > > > > http://archives.postgresql.org/pgsql...3/msg00426.php > > > > > > > > I got somewhat sidetracked by the complexities of the "central plan > > > > caching module" that Tom would like to see, but I'm still hoping to take > > > > a look at this for 8.2. > > > > > > As for predicate-driven plan changes (ie: query is planned the first > > > time with a predicate that has high cardinality, but there are also low > > > cardinality values that will be queried on), it would make more sense to > > > track the amount of work (probably tuples fetched) normally required to > > > execute a prepared statement. Any time that prepared statement is > > > executed with a set of predicates that substantially changes the amount > > > of work required it should be remembered and considered for re-planning > > > the next time the query is executed with those predicates. > > > -- > > > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > > > Pervasive Software http://pervasive.com work: 512-231-6117 > > > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 4: Have you searched our list archives? > > > > > > http://archives.postgresql.org > > > > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 359-1001 > > + If your life is a hard drive, | 13 Roberts Road > > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: don't forget to increase your free space map settings > > > > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Bruce Momjian wrote: > * Flush cached query plans when the dependent objects change, > when the cardinality of parameters changes dramatically, or > when new ANALYZE statistics are available Wouldn't it also make sense to flush a cached query plan when after execution it is determined that one or more assumptions that the cached query plan was based on was found to be off? Like the query plan was based on the assumption that a particular table would only return a hand full of rows, but in reality it returned a few thousand. regards, Lukas |
| |||
| Lukas Smith <mls@pooteeweet.org> writes: > Bruce Momjian wrote: > >> * Flush cached query plans when the dependent objects change, >> when the cardinality of parameters changes dramatically, or >> when new ANALYZE statistics are available > > Wouldn't it also make sense to flush a cached query plan when after > execution it is determined that one or more assumptions that the > cached query plan was based on was found to be off? Like the query > plan was based on the assumption that a particular table would only > return a hand full of rows, but in reality it returned a few > thousand. There is some merit to that. I could also see it being sensible to flush a cached plan any time the query took more than some [arbitrary/GUC-chosen] interval. Supposing it took 20s to execute the query, it would surely seem surprising for re-evaluating the plan to be expected to make up a material proportion of the cost of the *next* invocation. If we flush every plan that took >10s to evaluate, that offers the possibility for it to be done better next time... -- output = reverse("gro.mca" "@" "enworbbc") http://www.ntlug.org/~cbbrowne/internet.html Points are awarded for getting the last word in. Drawing the conversation out so long that the original message disappears due to being indented off the right hand edge of the screen is one way to do this. Another is to imply that anyone replying further is a hopeless cretin and is wasting everyone's valuable time. -- from the Symbolics Guidelines for Sending Mail |
| |||
| Chris Browne wrote: > Lukas Smith <mls@pooteeweet.org> writes: > > Bruce Momjian wrote: > > > >> * Flush cached query plans when the dependent objects change, > >> when the cardinality of parameters changes dramatically, or > >> when new ANALYZE statistics are available > > > > Wouldn't it also make sense to flush a cached query plan when after > > execution it is determined that one or more assumptions that the > > cached query plan was based on was found to be off? Like the query > > plan was based on the assumption that a particular table would only > > return a hand full of rows, but in reality it returned a few > > thousand. > > There is some merit to that. > > I could also see it being sensible to flush a cached plan any time the > query took more than some [arbitrary/GUC-chosen] interval. > > Supposing it took 20s to execute the query, it would surely seem > surprising for re-evaluating the plan to be expected to make up a > material proportion of the cost of the *next* invocation. > > If we flush every plan that took >10s to evaluate, that offers the > possibility for it to be done better next time... Ah, but how do you pass that information back to the optimizer so you don't end up getting the same plan again? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| Lukas Smith <mls@pooteeweet.org> writes: > Bruce Momjian wrote: >> * Flush cached query plans when the dependent objects change, >> when the cardinality of parameters changes dramatically, or >> when new ANALYZE statistics are available > Wouldn't it also make sense to flush a cached query plan when after > execution it is determined that one or more assumptions that the cached > query plan was based on was found to be off? Not unless you do something that would cause the planner to make different choices next time. (Such as changing the ANALYZE statistics, perhaps.) The TODO item is OK as stated, it's just talking about mechanism and not the things that might trigger the mechanism. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |