Unix Technical Forum

PostgreSQL using the wrong Index

This is a discussion on PostgreSQL using the wrong Index within the Pgsql Performance forums, part of the PostgreSQL category; --> We have two index's like so l1_historical=# \d "N_intra_time_idx" Index "N_intra_time_idx" Column | Type --------+----------------------------- time | timestamp without ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 11:53 AM
Alex Stapleton
 
Posts: n/a
Default PostgreSQL using the wrong Index

We have two index's like so

l1_historical=# \d "N_intra_time_idx"
Index "N_intra_time_idx"
Column | Type
--------+-----------------------------
time | timestamp without time zone
btree


l1_historical=# \d "N_intra_pkey"
Index "N_intra_pkey"
Column | Type
--------+-----------------------------
symbol | text
time | timestamp without time zone
unique btree (primary key)

and on queries like this

select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
order by time desc limit 1;

PostgreSQL takes a very long time to complete, as it effectively
scans the entire table, backwards. And the table is huge, about 450
million rows. (btw, there are no triggers or any other exciting
things like that on our tables in this db.)

but on things where the symbol does exist in the table, it's more or
less fine, and nice and fast.

Whilst the option the planner has taken might be faster most of the
time, the worst case scenario is unacceptable for obvious reasons.
I've googled for trying to force the use of a specific index, but
can't find anything relevant. Does anyone have any suggestions on
getting it to use an index which hopefully will have better worst
case performance?

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 11:53 AM
Alex Stapleton
 
Posts: n/a
Default Re: PostgreSQL using the wrong Index

Oh, we are running 7.4.2 btw. And our random_page_cost = 1

On 13 Jun 2005, at 14:02, Alex Stapleton wrote:

> We have two index's like so
>
> l1_historical=# \d "N_intra_time_idx"
> Index "N_intra_time_idx"
> Column | Type
> --------+-----------------------------
> time | timestamp without time zone
> btree
>
>
> l1_historical=# \d "N_intra_pkey"
> Index "N_intra_pkey"
> Column | Type
> --------+-----------------------------
> symbol | text
> time | timestamp without time zone
> unique btree (primary key)
>
> and on queries like this
>
> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
> order by time desc limit 1;
>
> PostgreSQL takes a very long time to complete, as it effectively
> scans the entire table, backwards. And the table is huge, about 450
> million rows. (btw, there are no triggers or any other exciting
> things like that on our tables in this db.)
>
> but on things where the symbol does exist in the table, it's more
> or less fine, and nice and fast.
>
> Whilst the option the planner has taken might be faster most of the
> time, the worst case scenario is unacceptable for obvious reasons.
> I've googled for trying to force the use of a specific index, but
> can't find anything relevant. Does anyone have any suggestions on
> getting it to use an index which hopefully will have better worst
> case performance?
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>
>



---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 11:53 AM
Tom Lane
 
Posts: n/a
Default Re: PostgreSQL using the wrong Index

Alex Stapleton <alexs@advfn.com> writes:
> l1_historical=# \d "N_intra_pkey"
> Index "N_intra_pkey"
> Column | Type
> --------+-----------------------------
> symbol | text
> time | timestamp without time zone
> unique btree (primary key)


> and on queries like this


> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
> order by time desc limit 1;


This was just covered in excruciating detail yesterday ...

You need to write
order by symbol desc, time desc limit 1
to get the planner to recognize the connection to the sort order
of this index. Since you're only selecting one value of symbol,
the actual output doesn't change.

> Oh, we are running 7.4.2 btw. And our random_page_cost = 1


I'll bet lunch that that is a bad selection of random_page_cost,
unless your database is so small that it all fits in RAM.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 11:54 AM
John A Meinel
 
Posts: n/a
Default Re: PostgreSQL using the wrong Index

Alex Stapleton wrote:

> Oh, we are running 7.4.2 btw. And our random_page_cost = 1
>

Which is only correct if your entire db fits into memory. Also, try
updating to a later 7.4 version if at all possible.

> On 13 Jun 2005, at 14:02, Alex Stapleton wrote:
>
>> We have two index's like so
>>
>> l1_historical=# \d "N_intra_time_idx"
>> Index "N_intra_time_idx"
>> Column | Type
>> --------+-----------------------------
>> time | timestamp without time zone
>> btree
>>

Just so you are aware, writing this as: "We have an index on
N_intra(time) and one on N_Intra(symbol, time)" is a lot more succinct.

>>
>> l1_historical=# \d "N_intra_pkey"
>> Index "N_intra_pkey"
>> Column | Type
>> --------+-----------------------------
>> symbol | text
>> time | timestamp without time zone
>> unique btree (primary key)
>>
>> and on queries like this
>>
>> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
>> order by time desc limit 1;
>>
>> PostgreSQL takes a very long time to complete, as it effectively
>> scans the entire table, backwards. And the table is huge, about 450
>> million rows. (btw, there are no triggers or any other exciting
>> things like that on our tables in this db.)
>>
>> but on things where the symbol does exist in the table, it's more or
>> less fine, and nice and fast.

>

What happens if you do:
SELECT * FROM "N_intra" WHERE symbol='doesnt exist' ORDER BY symbol,
time DESC LIMIT 1;

Yes, symbol is constant, but it frequently helps the planner realize it
can use an index scan if you include all terms in the index in the ORDER
BY clause.

>>
>> Whilst the option the planner has taken might be faster most of the
>> time, the worst case scenario is unacceptable for obvious reasons.
>> I've googled for trying to force the use of a specific index, but
>> can't find anything relevant. Does anyone have any suggestions on
>> getting it to use an index which hopefully will have better worst
>> case performance?

>

Try the above first. You could also create a new index on symbol
CREATE INDEX "N_intra_symbol_idx" ON "N_intra"(symbol);

Then the WHERE clause should use the symbol index, which means it can
know quickly that an entry doesn't exist. I'm not sure how many entries
you have per symbol, though, so this might cause problems in the ORDER
BY time portion.

I'm guessing what you really want is to just do the ORDER BY symbol, time.

John
=:->


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD4DBQFCrZxxJdeBCYSNAAMRArsiAJ4lhNxy7mnwvZkCyQW0Jx zXRu7YPACWIYKo
LXNvDF3+FCwWeHOJn/cbcg==
=Ya71
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-18-2008, 11:54 AM
Alex Stapleton
 
Posts: n/a
Default Re: PostgreSQL using the wrong Index


On 13 Jun 2005, at 15:47, John A Meinel wrote:

> Alex Stapleton wrote:
>
>
>> Oh, we are running 7.4.2 btw. And our random_page_cost = 1
>>
>>

> Which is only correct if your entire db fits into memory. Also, try
> updating to a later 7.4 version if at all possible.
>


I am aware of this, I didn't configure this machine though
unfortuantely.

>> On 13 Jun 2005, at 14:02, Alex Stapleton wrote:
>>
>>
>>> We have two index's like so
>>>
>>> l1_historical=# \d "N_intra_time_idx"
>>> Index "N_intra_time_idx"
>>> Column | Type
>>> --------+-----------------------------
>>> time | timestamp without time zone
>>> btree
>>>
>>>

> Just so you are aware, writing this as: "We have an index on
> N_intra(time) and one on N_Intra(symbol, time)" is a lot more
> succinct.
>


Sorry, I happened to have them there in my clipboard at the time so I
just blindly pasted them in.

>>>
>>> l1_historical=# \d "N_intra_pkey"
>>> Index "N_intra_pkey"
>>> Column | Type
>>> --------+-----------------------------
>>> symbol | text
>>> time | timestamp without time zone
>>> unique btree (primary key)
>>>
>>> and on queries like this
>>>
>>> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
>>> order by time desc limit 1;
>>>
>>> PostgreSQL takes a very long time to complete, as it effectively
>>> scans the entire table, backwards. And the table is huge, about 450
>>> million rows. (btw, there are no triggers or any other exciting
>>> things like that on our tables in this db.)
>>>
>>> but on things where the symbol does exist in the table, it's
>>> more or
>>> less fine, and nice and fast.
>>>

>>
>>

> What happens if you do:
> SELECT * FROM "N_intra" WHERE symbol='doesnt exist' ORDER BY symbol,
> time DESC LIMIT 1;


Hurrah! I should of thought of this, considering i've done it in the
past Thanks a lot, that's great.

> Yes, symbol is constant, but it frequently helps the planner
> realize it
> can use an index scan if you include all terms in the index in the
> ORDER
> BY clause.




>
>>>
>>> Whilst the option the planner has taken might be faster most of the
>>> time, the worst case scenario is unacceptable for obvious reasons.
>>> I've googled for trying to force the use of a specific index, but
>>> can't find anything relevant. Does anyone have any suggestions on
>>> getting it to use an index which hopefully will have better worst
>>> case performance?
>>>

>>
>>

> Try the above first. You could also create a new index on symbol
> CREATE INDEX "N_intra_symbol_idx" ON "N_intra"(symbol);
>
> Then the WHERE clause should use the symbol index, which means it can
> know quickly that an entry doesn't exist. I'm not sure how many
> entries
> you have per symbol, though, so this might cause problems in the ORDER
> BY time portion.
>
> I'm guessing what you really want is to just do the ORDER BY
> symbol, time.
>
> John
> =:->
>
>



---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-18-2008, 11:54 AM
Wei Weng
 
Posts: n/a
Default Re: PostgreSQL using the wrong Index

Tom Lane wrote:
>
>
> This was just covered in excruciating detail yesterday ...
>
> You need to write
> order by symbol desc, time desc limit 1
> to get the planner to recognize the connection to the sort order
> of this index. Since you're only selecting one value of symbol,
> the actual output doesn't change.
>

Is this the right behavior (not a bug)? Is postgresql planning on changing
this soon?


Thanks

Wei

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:55 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com