This is a discussion on Re: why is bitmap index chosen for this query? within the Pgsql Performance forums, part of the PostgreSQL category; --> "Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote: It sounds like PostgreSQL badly overestimates the cost of the index scan. Does the ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| "Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote: It sounds like PostgreSQL badly overestimates the cost of the index scan. Does the table perchance fit completely into memory, without effective_cache_size indicating that? Don't know the exact way to answer your question, but my initial instinct is "no way." select pg_relation_size('packets'); pg_relation_size ------------------ 19440115712 19GB. So it's a big table. The query I submitted earlier returns about 13M rows and the table currently has about 38M rows. --------------------------------- Love cheap thrills? Enjoy PC-to-Phone calls to 30+ countries for just 2¢/min with Yahoo! Messenger with Voice. |
| |||
| On Thu, May 18, 2006 at 09:41:23AM -0700, Stephen Byers wrote: > Does the table perchance fit completely into memory, without > effective_cache_size indicating that? > > Don't know the exact way to answer your question, but my initial instinct is "no way." What about the working set? Have you tried running the queries multiple times in a row to see if the results change? It might be that your initial bitmap scan puts all the relevant bits into cache, which will skew the results. /* Steinar */ -- Homepage: http://www.sesse.net/ ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| "Steinar H. Gunderson" <sgunderson@bigfoot.com> writes: > What about the working set? Have you tried running the queries multiple times > in a row to see if the results change? It might be that your initial bitmap > scan puts all the relevant bits into cache, which will skew the results. If the examples were done in the order shown, the seqscan ought to have pretty well blown out the cache ... but I concur that it'd be interesting to check whether repeated executions of the same plan show markedly different times. Also, is the index order closely correlated to the actual physical table order? What is work_mem set to, and does increasing it substantially make the bitmap scan work any better? Considering that the query is fetching about half of the table, I'd have thought that the planner was correct to estimate that bitmap or seqscan ought to win. For the plain indexscan to win, the order correlation must be quite strong, and I'm also guessing that the bitmap scan must have run into some substantial trouble (like discarding a lot of info because of lack of work_mem). IIRC, the planner doesn't currently try to model the effects of a bitmap scan going into lossy mode, which is something it probably should try to account for. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| I repeated explain analyze on the query 5 times and it came up with the same plan. You asked about index order and physical table order. In general the index order is indeed close to the same order as the physical table order. However, this query is likely an exception. The data is actually from a backup server that has filled a hole for some of the time range that I'm specifying in my query. Work_mem was set to 10240. After your suggestion, I bumped it to 102400 and it looks like it did significantly impact performance. adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets adbs_db-# where environment_name='PASITCTX01' adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on packets (cost=247205.64..2838556.55 rows=12473252 width=47) (actual time=32118.943..187075.742 rows=13365371 loops=1) Recheck Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800)) -> Bitmap Index Scan on packets_i4 (cost=0.00..247205.64 rows=12473252 width=0) (actual time=30370.789..30370.789 rows=13365371 loops=1) Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800)) Total runtime: 191995.431 ms (5 rows) adbs_db=# adbs_db=# adbs_db=# adbs_db=# set enable_bitmapscan to off; SET adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets adbs_db-# where environment_name='PASITCTX01' adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on packets (cost=0.00..3046021.47 rows=12473252 width=47) (actual time=56616.457..475839.789 rows=13365371 loops=1) Filter: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800)) Total runtime: 481228.409 ms (3 rows) adbs_db=# adbs_db=# adbs_db=# adbs_db=# set enable_seqscan to off; SET adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets adbs_db-# where environment_name='PASITCTX01' adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using packets_i4 on packets (cost=0.00..19909080.77 rows=12473273 width=47) (actual time=3.511..188273.177 rows=13365371 loops=1) Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800)) Total runtime: 194061.497 ms (3 rows) Wow -- so what does that mean? Do I need to leave my work_mem at 100MB?? I mentioned that my application actually uses a cursor to walk through this data. Even though the bitmap scan technically had the fastest time with explain analyze, it takes a long while (20 seconds) before the results start to come back through the cursor. Conversely, with the index scan, results immediately come back through the cursor method (which is more desirable). Thoughts? Example: begin; declare myCursor cursor for select spacecraft_time_secs,mnemonic,volume_label from packets where environment_name='PASITCTX01' and system_time_secs>=1132272000 and system_time_secs<=1143244800; fetch 10 from myCursor; end; PS, this is on a Sun Fire V240 with 4GB RAM, Solaris 8. Thanks, Steve Tom Lane <tgl@sss.pgh.pa.us> wrote: "Steinar H. Gunderson" writes: > What about the working set? Have you tried running the queries multiple times > in a row to see if the results change? It might be that your initial bitmap > scan puts all the relevant bits into cache, which will skew the results. If the examples were done in the order shown, the seqscan ought to have pretty well blown out the cache ... but I concur that it'd be interesting to check whether repeated executions of the same plan show markedly different times. Also, is the index order closely correlated to the actual physical table order? What is work_mem set to, and does increasing it substantially make the bitmap scan work any better? Considering that the query is fetching about half of the table, I'd have thought that the planner was correct to estimate that bitmap or seqscan ought to win. For the plain indexscan to win, the order correlation must be quite strong, and I'm also guessing that the bitmap scan must have run into some substantial trouble (like discarding a lot of info because of lack of work_mem). IIRC, the planner doesn't currently try to model the effects of a bitmap scan going into lossy mode, which is something it probably should try to account for. regards, tom lane --------------------------------- Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better. |
| |||
| On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote: > I repeated explain analyze on the query 5 times and it came up with the same plan. Yes, but did it end up with the same runtime? That's the interesting part -- the plan will almost always be identical between explain analyze runs given that you haven't done anything in between them. > You asked about index order and physical table order. In general the index > order is indeed close to the same order as the physical table order. > However, this query is likely an exception. The data is actually from a > backup server that has filled a hole for some of the time range that I'm > specifying in my query. Well, it still isn't all that far-fetched to believe that the data has lots of correlation (which helps the index scan quite a lot) that the planner isn't able to pick up. I don't know the details here, so I can't tell you how the correlation for such a query (WHERE a=foo and b between bar and baz) is estimated. Something tells me someone else might, though. :-) /* Steinar */ -- Homepage: http://www.sesse.net/ ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Yes, here are the runtimes for the repeated query. Total runtime: 748716.750 ms Total runtime: 749170.934 ms Total runtime: 744113.594 ms Total runtime: 746314.251 ms Total runtime: 742083.732 ms Thanks, Steve "Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote: On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote: > I repeated explain analyze on the query 5 times and it came up with the same plan. Yes, but did it end up with the same runtime? That's the interesting part -- the plan will almost always be identical between explain analyze runs given that you haven't done anything in between them. > You asked about index order and physical table order. In general the index > order is indeed close to the same order as the physical table order. > However, this query is likely an exception. The data is actually from a > backup server that has filled a hole for some of the time range that I'm > specifying in my query. Well, it still isn't all that far-fetched to believe that the data has lots of correlation (which helps the index scan quite a lot) that the planner isn't able to pick up. I don't know the details here, so I can't tell you how the correlation for such a query (WHERE a=foo and b between bar and baz) is estimated. Something tells me someone else might, though. :-) /* Steinar */ -- Homepage: http://www.sesse.net/ ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org --------------------------------- Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better. |
| |||
| On Thu, May 18, 2006 at 12:53:16PM -0700, Stephen Byers wrote: > Yes, here are the runtimes for the repeated query. > Total runtime: 748716.750 ms > Total runtime: 749170.934 ms > Total runtime: 744113.594 ms > Total runtime: 746314.251 ms > Total runtime: 742083.732 ms With which options enabled? This isn't even close to any of the three times you already posted. /* Steinar */ -- Homepage: http://www.sesse.net/ ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| You may be comparing the values to Tom's suggestion to bump up work_mem. Take a look at the original posting (Total runtime: 777208.041 ms for the bitmap scan) -Steve "Steinar H. Gunderson" <sgunderson@bigfoot.com> wrote: On Thu, May 18, 2006 at 12:53:16PM -0700, Stephen Byers wrote: > Yes, here are the runtimes for the repeated query. > Total runtime: 748716.750 ms > Total runtime: 749170.934 ms > Total runtime: 744113.594 ms > Total runtime: 746314.251 ms > Total runtime: 742083.732 ms With which options enabled? This isn't even close to any of the three times you already posted. /* Steinar */ -- Homepage: http://www.sesse.net/ ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly --------------------------------- Yahoo! Messenger with Voice. Make PC-to-Phone Calls to the US (and 30+ countries) for 2¢/min or less. |
| ||||
| On Thu, May 18, 2006 at 12:38:18PM -0700, Stephen Byers wrote: > I repeated explain analyze on the query 5 times and it came up with the same plan. > > You asked about index order and physical table order. In general the index order is indeed close to the same order as the physical table order. However, this query is likely an exception. The data is actually from a backup server that has filled a hole for some of the time range that I'm specifying in my query. What's SELECT correlation FROM pg_stats WHERE tablename='packets' AND attname='environment_name' show? What's effective_cache_size and random_page_cost set to? Also, out of curiosity, why not just use a timestamp instead of two int's for storing time? > Wow -- so what does that mean? Do I need to leave my work_mem at 100MB?? I mentioned that my application actually uses a cursor to walk through this data. Even though the bitmap scan technically had the fastest time with explain analyze, it takes a long while (20 seconds) before the results start to come back through the cursor. Conversely, with the index scan, results immediately come back through the cursor method (which is more desirable). Thoughts? Do you really need to use a cursor? It's generally less efficient than doing things with a single SQL statement, depending on what exactly you're doing. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |