This is a discussion on 8.3devel slower than 8.2 under read-only load within the pgsql Hackers forums, part of the PostgreSQL category; --> On Nov 26, 2007 6:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In the test case Guillame provided, every single ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On Nov 26, 2007 6:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In the test case Guillame provided, every single WHERE clause happens > to be of the form > varchar_column = 'unknown-type literal' > and there are no other operators used in the SELECT lists; but I can > hardly believe that this is representative of any significant number > of real-world applications. Even pgbench uses more than one operator. Sure. The application uses a lot of other operators (timestamp comparison, cube operators, LIKE, boolean comparison...). But as the primary keys are all varchar in this application, the operator varchar = unknown is by far the most used. If we can't find a solution for upstream 8.3, I'll be happy to apply a local patch for this customer (if I'm sure it doesn't do any harm in their case). Usually, people use integer flavoured primary keys so they probably won't be hit by this problem as strong as we are. -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Are we likely to see the 3% or the 7% performance degradation with > version 8.3? Probably not, since it sounds like your queries are typically not as trivial as the ones in Guillame's test case. IOW there will be some slowdown but it's likely to be in the noise for you, and also very likely made up by improvements elsewhere. The test case is in the unfortunate position of not being helped materially by *any* of the work we've done for 8.3. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Mon, 2007-11-26 at 09:55 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > But I think there must be an action that we can take for 8.3 and that > > much runtime should not be given away easily. ISTM that we can win back > > the losses Guillaume has identified, plus gain a little more even. > > Perhaps some sanity could be restored to this discussion by pointing out > that the 2007-01-01 code *also* clocks in at 37% spent in > oper_select_candidate. IOW it's been like this for a very long time. [I'm replying to the wrong message, I know.] Here's where I am: Basic test was to replace call to oper_select_candidate with a single item that was fed by a hardcoded value for varchar equality operator. This is the oper_cache.v1.patch enclosed; the 5 line patch. Test results were - w/o patch ~10,500 tps with pgbench_varchar.sql - with patch ~15,500 tps with pgbench_varchar.sql (**big gain**) - w/o patch ~16,250 tps with pgbench_integer.sql - with patch ~16,250 tps with pgbench_integer.sql Tables are standard pgbench, varchar test table created using: create table av as select aid::varchar, bid, abalance, filler from accounts; create unique index av_pkey on av (aid); The impact of calling oper_select_candidate() is big enough that it will affect any query that is read only and has 1 or 2 predicates when at least one of them is a VARCHAR_col = const query. What I'm actually proposing is a patch implementing a oper_select_hook function pointer, which allows the user to do anything they want. I'm just re-writing that as a plugin now, but the backend patch is included here for discussion. oper_select_hook.v1.patch -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Simon Riggs <simon@2ndquadrant.com> writes: > Here's where I am: > Basic test was to replace call to oper_select_candidate with a single > item that was fed by a hardcoded value for varchar equality operator. Well, that confirms what we knew from gprof, but surely you aren't proposing that as a usable patch. > What I'm actually proposing is a patch implementing a oper_select_hook > function pointer, which allows the user to do anything they want. Why in the world would that be a good idea? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Mon, 2007-11-26 at 18:18 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Here's where I am: > > > Basic test was to replace call to oper_select_candidate with a single > > item that was fed by a hardcoded value for varchar equality operator. > > Well, that confirms what we knew from gprof, but surely you aren't > proposing that as a usable patch. gprof might not have translated into a usable gain, but clearly it can. That's not a proposed patch, just showing my results. > > What I'm actually proposing is a patch implementing a oper_select_hook > > function pointer, which allows the user to do anything they want. > > Why in the world would that be a good idea? Short answer: it makes it go faster? You asked. ;-) Long answer: We all agree the operator cache is the best answer, yet don't wish to delay the project or make it less robust. The best answer is a plugin approach that lets users take the risk and make the gain. We can't hardcode it for everybody because that runs completely against the grain of Postgres. Including this as a plugin allows people to make their own decisions about cacheing/hardcoding. If you are the unlucky owner of a database with a heavy read workload and lots of VARCHAR keys then you're going to want this. The plugin allows writing a one-slot cache that is never flushed. If you choose to override the operators then you'd need to reconnect. It also allows some performance tuning in other cases too, so having it as a general case makes sense. The overhead of implementing it this way is very close to zero and the code path doesn't even get called in the integers-as-keys cases. I don't really like all of this, but that much gain is too much for me to ignore. Better ideas eagerly accepted, and encouraged. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > But I think there must be an action that we can take for 8.3 and that > > much runtime should not be given away easily. ISTM that we can win back > > the losses Guillaume has identified, plus gain a little more even. > > Perhaps some sanity could be restored to this discussion by pointing out > that the 2007-01-01 code *also* clocks in at 37% spent in > oper_select_candidate. IOW it's been like this for a very long time. > I'm not interested in destabilizing 8.3 with panicky last-minute patches. > > > So how about we have a cache-of-one: > > Cache-of-one has exactly the same difficulty as cache-of-many, other > than the table lookup itself, which is a solved problem (hashtable). > You still have to determine how you identify the cached value and what > events require a cache flush. Nor do I see any particular reason to > assume that a cache of only one operator would be of any use for > real-world apps, as opposed to toy examples. Seems like anytime a function like that takes 37%, there is something wrong. Are we sure there isn't a bug in there somewhere? As far as a cache, could we create a simple cache that remembered the last X lookups and cleared the cache anytime a cache invalidation message came in? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Tom, On Nov 27, 2007 3:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Please try this patch on your real app (not the dumbed-down test case) > and see what it does for you ... If I disable the cache in the application, the most visited page generates 175 SQL queries, mix of simple and more complicated queries so it's quite "real life" (it's a page which aggregates quite a lot of data - available in production here: http://www.fra.cityvox.fr/guide_lyon/AccueilVille ). The average of 5 runs for this page gives: 1.108 second for 8.2, 1.054 second for 8.3devel with your cache lookup patch. Which gives us 8.3devel+cache patch a bit faster (~5%) than 8.2 on a quite representative set of queries, even with no concurrency at all. That's very good news. Without your last patch but with the set of patches you commited these last days (basically latest CVS tip), 8.3devel is nearly as fast as 8.2 to generate this page but consistently a bit slower. For historical purposes, here are the results of my simple and dumb test case (pgbench -c 10 -n 10000 with the custom queries I mentioned earlier): 8.2: 1480 tps 2007-11-01: 1200 tps 2007-11-27: 1420 tps (after your set of patches to simplify the planner for simple queries) 2007-11-27 + cache lookup patch: 2260 tps That said, I checked nearly each of the 175 queries and I still find a few queries a bit slower. I don't know if it's worth digging but as you already found a couple of problems, I prefer mentioning it. For example, after simplification of a query with a few joins which is slower in 8.3devel+patch, I have the following results (consistent over several runs of the query): ** 8.2 ** => SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; <snip 4 rows> Time: 5.659 ms => EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Index Scan using idx_newsplanning_numnews_codelang_dfin_ddeb_codeth _niveau on newsplanning np (cost=0.00..752.95 rows=50 width=4) (actual time=4.478..5.029 rows=4 loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text = 'THE'::text) AND (niveau = 1)) Filter: ((date_trunc('day'::text, dfin) >= date_trunc('day'::text, now())) AND (date_trunc('day'::text, ddeb) <= date_trunc('day'::text, now()))) Total runtime: 5.064 ms ** 8.3devel with cache lookup ** =# SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; <snip 4 rows> Time: 5.932 ms EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)>= date_trunc('day',now()) AND date_trunc('day',NP.ddeb) <= date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Index Scan using idx_newsplanning_numnews_codelang_dfin_ddeb_codeth _niveau on newsplanning np (cost=0.00..679.35 rows=46 width=4) (actual time=4.884..5.467 rows=4 loops=1) Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text = 'THE'::text) AND (niveau = 1)) Filter: ((date_trunc('day'::text, dfin) >= date_trunc('day'::text, now())) AND (date_trunc('day'::text, ddeb) <= date_trunc('day'::text, now()))) Total runtime: 5.501 ms (4 rows) The schema of the table is the following: => \d newsplanning Table "public.newsplanning" Column | Type | Modifiers ---------------+-----------------------------+-------------------- idplanning | integer | not null numnews | integer | not null ddeb | timestamp without time zone | not null dfin | timestamp without time zone | not null codeth | character varying(3) | not null niveau | integer | not null ordre | integer | codelang | character varying(3) | not null ddermodif | timestamp without time zone | logindermodif | character varying(20) | codejour | integer | not null default 1 Indexes: "pk_newsplanning" PRIMARY KEY, btree (idplanning) "idx_newsplanning_ddeb_dfin_numnews_niveau_codelan g" btree (ddeb, dfin, numnews, niveau, codelang) "idx_newsplanning_numnews_codelang_dfin_ddeb_codet h_niveau" btree (numnews, codelang, dfin, ddeb, codeth, niveau) Foreign-key constraints: "fk_newsplanning_codejour" FOREIGN KEY (codejour) REFERENCES jours(codejour) ON DELETE CASCADE "fk_newsplanning_ref_codeth" FOREIGN KEY (codeth) REFERENCES theme(codeth) ON DELETE CASCADE "fk_newsplanning_ref_numnews" FOREIGN KEY (numnews) REFERENCES news(numnews) ON DELETE CASCADE "fk_newsplanning_ref_ord" FOREIGN KEY (niveau) REFERENCES ordnews(ordnews) ON UPDATE RESTRICT ON DELETE RESTRICT I can provide test data in private if needed. -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| ||||
| For archives, Tom commited the patch yesterday: http://archives.postgresql.org/pgsql...1/msg00552.php -- Guillaume ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| Thread Tools | |
| Display Modes | |
|
|