Unix Technical Forum

Avoiding a seq scan on a table.

This is a discussion on Avoiding a seq scan on a table. within the pgsql Novice forums, part of the PostgreSQL category; --> I am trying to create and index that will prevent a seq scan table. The querey that is causing ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Novice

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-17-2008, 10:51 PM
LWATCDR
 
Posts: n/a
Default Avoiding a seq scan on a table.

I am trying to create and index that will prevent a seq scan table.
The querey that is causing the seq scan is this SELECT COUNT(*) FROM
issuetracking where (issue_target='david' OR manager='david') AND
date_closed=null;
Any ideas on what
Any suggestions on what index I can add that will make this not a seq scan?

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-17-2008, 10:51 PM
LWATCDR
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

Thanks would you suggest a btree or a hash? My guess would a hash
since it uses an =.

On Jan 14, 2008 11:41 AM, Brian Hurt <bhurt@janestcapital.com> wrote:
>
> LWATCDR wrote:
>
> >I am trying to create and index that will prevent a seq scan table.
> >The querey that is causing the seq scan is this SELECT COUNT(*) FROM
> >issuetracking where (issue_target='david' OR manager='david') AND
> >date_closed=null;
> >Any ideas on what
> >Any suggestions on what index I can add that will make this not a seq scan?
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 5: don't forget to increase your free space map settings
> >
> >
> >
> >

> I would recommend making three indexes- one on issue_target, one on
> manager, and one on date_closed. Postgres can then do a trick where it
> turns the indexes into bitscan indexes (with one "bit" per page as to
> wether that page might have a row of interest or not), which it can then
> bitwise and and or combine together.
>
> Don't forget to analyze the table after making the indexes.
>
> Brian
>
>


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-17-2008, 10:51 PM
Sean Davis
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

On Jan 14, 2008 11:45 AM, LWATCDR <lwatcdr@gmail.com> wrote:

> Thanks would you suggest a btree or a hash? My guess would a hash
> since it uses an =.
>


You can pretty much ignore hash indexes in Postgres. They are, in nearly
every case (every case that I know of), slower than btree. Just make the
indexes using the default indexing scheme. Again, do not forget to analyze
the table after creating the indexes.

Sean

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-17-2008, 10:51 PM
Brian Hurt
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

LWATCDR wrote:

>Thanks would you suggest a btree or a hash? My guess would a hash
>since it uses an =.
>
>

I only use btrees, so I can't speak to that.

Brian


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-17-2008, 10:51 PM
LWATCDR
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

Really? From what I have done in writing my own code I have found
hashing to be faster than a btree but then when I wrote my own hashing
it was a specific type of key.
Anyway I put in the tree indexes and I am still getting a seq scan.

Aggregate (cost=12.12..12.13 rows=1 width=0)
-> Result (cost=0.00..12.12 rows=1 width=0)
One-Time Filter: NULL::boolean
-> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)
Filter: (((issue_target)::text = 'david'::text) OR
((manager)::text = 'david'::text))


On Jan 14, 2008 11:54 AM, Sean Davis <sdavis2@mail.nih.gov> wrote:
>
>
>
> On Jan 14, 2008 11:45 AM, LWATCDR <lwatcdr@gmail.com> wrote:
> > Thanks would you suggest a btree or a hash? My guess would a hash
> > since it uses an =.
> >

>
> You can pretty much ignore hash indexes in Postgres. They are, in nearly
> every case (every case that I know of), slower than btree. Just make the
> indexes using the default indexing scheme. Again, do not forget to analyze
> the table after creating the indexes.
>
> Sean
>
>
>


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-17-2008, 10:51 PM
Daniel T. Staal
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.


On Mon, January 14, 2008 12:14 pm, LWATCDR wrote:
> Really? From what I have done in writing my own code I have found
> hashing to be faster than a btree but then when I wrote my own hashing
> it was a specific type of key.
> Anyway I put in the tree indexes and I am still getting a seq scan.
>
> Aggregate (cost=12.12..12.13 rows=1 width=0)
> -> Result (cost=0.00..12.12 rows=1 width=0)
> One-Time Filter: NULL::boolean
> -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)
> Filter: (((issue_target)::text = 'david'::text) OR
> ((manager)::text = 'david'::text))


Based on that cost, a sequence scan is probably the fastest yet: It's such
a small dataset that fetching the index and working with it before going
back and fetching the data is just overkill.

When you add a few dozen more rows or so, it'll switch to using the index.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author. Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes. This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-17-2008, 10:51 PM
Sean Davis
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

On Jan 14, 2008 12:14 PM, LWATCDR <lwatcdr@gmail.com> wrote:

> Really? From what I have done in writing my own code I have found
> hashing to be faster than a btree but then when I wrote my own hashing
> it was a specific type of key.
> Anyway I put in the tree indexes and I am still getting a seq scan.
>
> Aggregate (cost=12.12..12.13 rows=1 width=0)
> -> Result (cost=0.00..12.12 rows=1 width=0)
> One-Time Filter: NULL::boolean
> -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)
> Filter: (((issue_target)::text = 'david'::text) OR
> ((manager)::text = 'david'::text))



The Postgres planner will choose what it thinks is the fastest plan. In
this case, your table has only 1 row (?), so sequential scan will be
fastest. You will want to load data into your table before doing
benchmarking.

Sean

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-17-2008, 10:51 PM
Alan Hodgson
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

On Monday 14 January 2008, LWATCDR <lwatcdr@gmail.com> wrote:
> Really? From what I have done in writing my own code I have found
> hashing to be faster than a btree but then when I wrote my own hashing
> it was a specific type of key.
> Anyway I put in the tree indexes and I am still getting a seq scan.
>


> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)


The planner will always choose a seq scan when your table size is very
small.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-17-2008, 10:51 PM
Brian Hurt
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

LWATCDR wrote:

>Really? From what I have done in writing my own code I have found
>hashing to be faster than a btree but then when I wrote my own hashing
>it was a specific type of key.
>Anyway I put in the tree indexes and I am still getting a seq scan.
>
>Aggregate (cost=12.12..12.13 rows=1 width=0)
> -> Result (cost=0.00..12.12 rows=1 width=0)
> One-Time Filter: NULL::boolean
> -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)
> Filter: (((issue_target)::text = 'david'::text) OR
>((manager)::text = 'david'::text))
>
>
>
>

For very small tables, Postgres will skip reading the indexes, because
it's not worth it. Postgres thinks it's only going to have to read 12
pages or so. At which point it'll likely have to read all the pages
anyways, so why also read the index?

Brian


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-17-2008, 10:51 PM
LWATCDR
 
Posts: n/a
Default Re: Avoiding a seq scan on a table.

that is very odd since that table has 141 records in it.

here is a different query that I ran.
SELECT COUNT(*) FROM rma where terminatedate is NULL;
This returns a value of 254 for the count but this is what I get from explain.

Aggregate (cost=219.77..219.78 rows=1 width=0)
-> Seq Scan on rma (cost=0.00..219.11 rows=264 width=0)
Filter: (terminatedate IS NULL)
This says that rows =1 but returns 254 rows of data?
The table contains over 7000 rows.


On Jan 14, 2008 12:22 PM, Daniel T. Staal <DStaal@usa.net> wrote:
>
> On Mon, January 14, 2008 12:14 pm, LWATCDR wrote:
> > Really? From what I have done in writing my own code I have found
> > hashing to be faster than a btree but then when I wrote my own hashing
> > it was a specific type of key.
> > Anyway I put in the tree indexes and I am still getting a seq scan.
> >
> > Aggregate (cost=12.12..12.13 rows=1 width=0)
> > -> Result (cost=0.00..12.12 rows=1 width=0)
> > One-Time Filter: NULL::boolean
> > -> Seq Scan on issuetracking (cost=0.00..12.12 rows=1 width=0)
> > Filter: (((issue_target)::text = 'david'::text) OR
> > ((manager)::text = 'david'::text))

>
> Based on that cost, a sequence scan is probably the fastest yet: It's such
> a small dataset that fetching the index and working with it before going
> back and fetching the data is just overkill.
>
> When you add a few dozen more rows or so, it'll switch to using the index.
>
> Daniel T. Staal
>
> ---------------------------------------------------------------
> This email copyright the author. Unless otherwise noted, you
> are expressly allowed to retransmit, quote, or otherwise use
> the contents for non-commercial purposes. This copyright will
> expire 5 years after the author's death, or in 30 years,
> whichever is longer, unless such a period is in excess of
> local copyright law.
> ---------------------------------------------------------------
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:35 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com