This is a discussion on Avoiding a seq scan on a table. within the pgsql Novice forums, part of the PostgreSQL category; --> I am trying to create and index that will prevent a seq scan table. The querey that is causing ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I am trying to create and index that will prevent a seq scan table. The querey that is causing the seq scan is this SELECT COUNT(*) FROM issuetracking where (issue_target='david' OR manager='david') AND date_closed=null; Any ideas on what Any suggestions on what index I can add that will make this not a seq scan? ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Thanks would you suggest a btree or a hash? My guess would a hash since it uses an =. On Jan 14, 2008 11:41 AM, Brian Hurt <bhurt@janestcapital.com> wrote: > > LWATCDR wrote: > > >I am trying to create and index that will prevent a seq scan table. > >The querey that is causing the seq scan is this SELECT COUNT(*) FROM > >issuetracking where (issue_target='david' OR manager='david') AND > >date_closed=null; > >Any ideas on what > >Any suggestions on what index I can add that will make this not a seq scan? > > > >---------------------------(end of broadcast)--------------------------- > >TIP 5: don't forget to increase your free space map settings > > > > > > > > > I would recommend making three indexes- one on issue_target, one on > manager, and one on date_closed. Postgres can then do a trick where it > turns the indexes into bitscan indexes (with one "bit" per page as to > wether that page might have a row of interest or not), which it can then > bitwise and and or combine together. > > Don't forget to analyze the table after making the indexes. > > Brian > > ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Jan 14, 2008 11:45 AM, LWATCDR <lwatcdr@gmail.com> wrote: > Thanks would you suggest a btree or a hash? My guess would a hash > since it uses an =. > You can pretty much ignore hash indexes in Postgres. They are, in nearly every case (every case that I know of), slower than btree. Just make the indexes using the default indexing scheme. Again, do not forget to analyze the table after creating the indexes. Sean |
| |||
| LWATCDR wrote: >Thanks would you suggest a btree or a hash? My guess would a hash >since it uses an =. > > I only use btrees, so I can't speak to that. Brian ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Really? From what I have done in writing my own code I have found hashing to be faster than a btree but then when I wrote my own hashing it was a specific type of key. Anyway I put in the tree indexes and I am still getting a seq scan. Aggregate (cost=12.12..12.13 rows=1 width=0) -> Result (cost=0.00..12.12 rows=1 width=0) One-Time Filter: NULL::boolean -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) Filter: (((issue_target)::text = 'david'::text) OR ((manager)::text = 'david'::text)) On Jan 14, 2008 11:54 AM, Sean Davis <sdavis2@mail.nih.gov> wrote: > > > > On Jan 14, 2008 11:45 AM, LWATCDR <lwatcdr@gmail.com> wrote: > > Thanks would you suggest a btree or a hash? My guess would a hash > > since it uses an =. > > > > You can pretty much ignore hash indexes in Postgres. They are, in nearly > every case (every case that I know of), slower than btree. Just make the > indexes using the default indexing scheme. Again, do not forget to analyze > the table after creating the indexes. > > Sean > > > ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On Mon, January 14, 2008 12:14 pm, LWATCDR wrote: > Really? From what I have done in writing my own code I have found > hashing to be faster than a btree but then when I wrote my own hashing > it was a specific type of key. > Anyway I put in the tree indexes and I am still getting a seq scan. > > Aggregate (cost=12.12..12.13 rows=1 width=0) > -> Result (cost=0.00..12.12 rows=1 width=0) > One-Time Filter: NULL::boolean > -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) > Filter: (((issue_target)::text = 'david'::text) OR > ((manager)::text = 'david'::text)) Based on that cost, a sequence scan is probably the fastest yet: It's such a small dataset that fetching the index and working with it before going back and fetching the data is just overkill. When you add a few dozen more rows or so, it'll switch to using the index. Daniel T. Staal --------------------------------------------------------------- This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law. --------------------------------------------------------------- ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On Jan 14, 2008 12:14 PM, LWATCDR <lwatcdr@gmail.com> wrote: > Really? From what I have done in writing my own code I have found > hashing to be faster than a btree but then when I wrote my own hashing > it was a specific type of key. > Anyway I put in the tree indexes and I am still getting a seq scan. > > Aggregate (cost=12.12..12.13 rows=1 width=0) > -> Result (cost=0.00..12.12 rows=1 width=0) > One-Time Filter: NULL::boolean > -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) > Filter: (((issue_target)::text = 'david'::text) OR > ((manager)::text = 'david'::text)) The Postgres planner will choose what it thinks is the fastest plan. In this case, your table has only 1 row (?), so sequential scan will be fastest. You will want to load data into your table before doing benchmarking. Sean |
| |||
| On Monday 14 January 2008, LWATCDR <lwatcdr@gmail.com> wrote: > Really? From what I have done in writing my own code I have found > hashing to be faster than a btree but then when I wrote my own hashing > it was a specific type of key. > Anyway I put in the tree indexes and I am still getting a seq scan. > > Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) The planner will always choose a seq scan when your table size is very small. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| LWATCDR wrote: >Really? From what I have done in writing my own code I have found >hashing to be faster than a btree but then when I wrote my own hashing >it was a specific type of key. >Anyway I put in the tree indexes and I am still getting a seq scan. > >Aggregate (cost=12.12..12.13 rows=1 width=0) > -> Result (cost=0.00..12.12 rows=1 width=0) > One-Time Filter: NULL::boolean > -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) > Filter: (((issue_target)::text = 'david'::text) OR >((manager)::text = 'david'::text)) > > > > For very small tables, Postgres will skip reading the indexes, because it's not worth it. Postgres thinks it's only going to have to read 12 pages or so. At which point it'll likely have to read all the pages anyways, so why also read the index? Brian ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| ||||
| that is very odd since that table has 141 records in it. here is a different query that I ran. SELECT COUNT(*) FROM rma where terminatedate is NULL; This returns a value of 254 for the count but this is what I get from explain. Aggregate (cost=219.77..219.78 rows=1 width=0) -> Seq Scan on rma (cost=0.00..219.11 rows=264 width=0) Filter: (terminatedate IS NULL) This says that rows =1 but returns 254 rows of data? The table contains over 7000 rows. On Jan 14, 2008 12:22 PM, Daniel T. Staal <DStaal@usa.net> wrote: > > On Mon, January 14, 2008 12:14 pm, LWATCDR wrote: > > Really? From what I have done in writing my own code I have found > > hashing to be faster than a btree but then when I wrote my own hashing > > it was a specific type of key. > > Anyway I put in the tree indexes and I am still getting a seq scan. > > > > Aggregate (cost=12.12..12.13 rows=1 width=0) > > -> Result (cost=0.00..12.12 rows=1 width=0) > > One-Time Filter: NULL::boolean > > -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0) > > Filter: (((issue_target)::text = 'david'::text) OR > > ((manager)::text = 'david'::text)) > > Based on that cost, a sequence scan is probably the fastest yet: It's such > a small dataset that fetching the index and working with it before going > back and fetching the data is just overkill. > > When you add a few dozen more rows or so, it'll switch to using the index. > > Daniel T. Staal > > --------------------------------------------------------------- > This email copyright the author. Unless otherwise noted, you > are expressly allowed to retransmit, quote, or otherwise use > the contents for non-commercial purposes. This copyright will > expire 5 years after the author's death, or in 30 years, > whichever is longer, unless such a period is in excess of > local copyright law. > --------------------------------------------------------------- > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |