Unix Technical Forum

Performance delay

This is a discussion on Performance delay within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi, just want to share with all of you a wierd thing that i found when i tested it. ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 10:52 AM
Hasnul Fadhly bin Hasan
 
Posts: n/a
Default Performance delay

Hi,

just want to share with all of you a wierd thing that i found when i
tested it.

i was doing a query that will call a function long2ip to convert bigint
to ips.

so the query looks something like this.

select id, long2ip(srcip), long2ip(dstip) from sometable
where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
23:59' order by id limit 30;

for your info, there are about 300k rows for that timeframe.

it cost me about 57+ secs to get the list.

which is about the same if i query
select id, long2ip(srcip), long2ip(dstip) from sometable
where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
23:59'

it will cost me about 57+ secs also.

Now if i did this
select id,long2ip(srcip), long2ip(dstip) from (
* from sometable
where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
23:59' order by id limit 30) as t;

it will cost me about 3+ secs

Anyone knows why this is the case?

Hasnul





---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 10:52 AM
Richard Huxton
 
Posts: n/a
Default Re: Performance delay

Hasnul Fadhly bin Hasan wrote:
> Hi,
>
> just want to share with all of you a wierd thing that i found when i
> tested it.
>
> i was doing a query that will call a function long2ip to convert bigint
> to ips.
>
> so the query looks something like this.
>
> select id, long2ip(srcip), long2ip(dstip) from sometable
> where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
> 23:59' order by id limit 30;
>
> for your info, there are about 300k rows for that timeframe.
>
> it cost me about 57+ secs to get the list.
>
> which is about the same if i query
> select id, long2ip(srcip), long2ip(dstip) from sometable
> where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
> 23:59'
>
> it will cost me about 57+ secs also.
>
> Now if i did this
> select id,long2ip(srcip), long2ip(dstip) from (
> * from sometable
> where timestamp between timestamp '01-10-2005' and timestamp '01-10-2005
> 23:59' order by id limit 30) as t;
>
> it will cost me about 3+ secs


The difference will be that in the final case you only make 30 calls to
long2ip() whereas in the first two you call it 300,000 times and then
throw away most of them.
Try running EXPLAIN ANALYSE ... for both - that will show how PG is
planning the query.
--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 10:52 AM
Hasnul Fadhly bin Hasan
 
Posts: n/a
Default Re: Performance delay

Hi Richard,

Thanks for the reply.. is that the case? i thought it would comply to
the where condition first..
and after that it will format the output to what we want..

Hasnul

Richard Huxton wrote:

> Hasnul Fadhly bin Hasan wrote:
>
>> Hi,
>>
>> just want to share with all of you a wierd thing that i found when i
>> tested it.
>>
>> i was doing a query that will call a function long2ip to convert
>> bigint to ips.
>>
>> so the query looks something like this.
>>
>> select id, long2ip(srcip), long2ip(dstip) from sometable
>> where timestamp between timestamp '01-10-2005' and timestamp
>> '01-10-2005 23:59' order by id limit 30;
>>
>> for your info, there are about 300k rows for that timeframe.
>>
>> it cost me about 57+ secs to get the list.
>>
>> which is about the same if i query
>> select id, long2ip(srcip), long2ip(dstip) from sometable
>> where timestamp between timestamp '01-10-2005' and timestamp
>> '01-10-2005 23:59'
>>
>> it will cost me about 57+ secs also.
>>
>> Now if i did this
>> select id,long2ip(srcip), long2ip(dstip) from (
>> * from sometable
>> where timestamp between timestamp '01-10-2005' and timestamp
>> '01-10-2005 23:59' order by id limit 30) as t;
>>
>> it will cost me about 3+ secs

>
>
> The difference will be that in the final case you only make 30 calls
> to long2ip() whereas in the first two you call it 300,000 times and
> then throw away most of them.
> Try running EXPLAIN ANALYSE ... for both - that will show how PG is
> planning the query.
> --
> Richard Huxton
> Archonet Ltd
>
>



---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 10:52 AM
PFC
 
Posts: n/a
Default MOVE command


Hello,

Here I'm implementing a session management, which has a connections table
partitioned between active and archived connections. A connection
represents a connection between a user and a chatroom.

I use partitioning for performance reasons.

The active table contains all the data for the active session : user_id,
chatroom_id, session start time, and other information.
The archive table contains just the user_id, chatroom_id, session start
and end time, for logging purposes, and for displaying on the site, which
user was logged to which chatroom and from when to when.

Thus, when a user disconnects from a chatroom, I must move one row from
the active to the archive table. This poses no problem as there is a
UNIQUE index (iser_id,chatroom_id) so I select the row FOR UPDATE, insert
it in the archive table, then delete it.

Now, when a user logs out from the site, or when his session is purged by
the auto-expiration cron job, I must also expire ALL his open chatroom
connections.
INSERT INTO archive (...) SELECT ... FROM active WHERE user_id = ...;
DELETE FROM active WHERE user_id = ...;

Now, if the user inserts a connection between the two queries above, the
thing will fail (the connection will just be deleted). I know that there
are many ways to do it right :
- LOCK the table in exclusive mode
- use an additional primary key on the active table which is not related
to the user_id and the chatroom_id, select the id's of the sessions to
expire in a temporary table, and use that
- use an extra field in the table to mark that the rows are being
processed
- use transaction isolation level SERIALIZABLE

However, all these methods somehow don't feel right, and as this is an
often encountered problem, I'd really like to have a sql command, say
MOVE, or SELECT AND DELETE, whatever, which acts like a SELECT, returning
the rows, but deleting them as well. Then I'd just do INSERT INTO archive
(...) SELECT ... AND DELETE FROM active WHERE user_id = ...;

which would have the following advantages :
- No worries about locks :
- less chance of bugs
- higher performance because locks have to be waited on, by definition
- No need to do the request twice (so, it is twice as fast !)
- Simplicity and elegance

There would be an hidden bonus, that if you acquire locks, you better
COMMIT the transaction as soon as possible to release them, whereas here,
you can happily continue in the transaction.

I think this command would make a nice cousin to the also very popular
INSERT... OR UPDATE which tries to insert a row, and if it exists, UPDATES
it instead of inserting it !

What do you think ?






---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-18-2008, 10:53 AM
Jim C. Nasby
 
Posts: n/a
Default Re: Performance delay

On Thu, Jan 13, 2005 at 07:14:10PM +0800, Hasnul Fadhly bin Hasan wrote:
> Hi Richard,
>
> Thanks for the reply.. is that the case? i thought it would comply to
> the where condition first..
> and after that it will format the output to what we want..


That is in fact exactly what it's doing. The second query is faster not
because of the where clause, but because of the limit clause. The first
query builds a list of id, long2ip(srcip), long2ip(dstip) for the
timestamp range, then it orders that list and gives you the first 30.
The second query builds a list of everything from sometable for the
timestamp range, orders it, keeps the first 30, THEN in calculates
long2ip based on that list of 30 items.

> Hasnul
>
> Richard Huxton wrote:
>
> >Hasnul Fadhly bin Hasan wrote:
> >
> >>Hi,
> >>
> >>just want to share with all of you a wierd thing that i found when i
> >>tested it.
> >>
> >>i was doing a query that will call a function long2ip to convert
> >>bigint to ips.
> >>
> >>so the query looks something like this.
> >>
> >>select id, long2ip(srcip), long2ip(dstip) from sometable
> >>where timestamp between timestamp '01-10-2005' and timestamp
> >>'01-10-2005 23:59' order by id limit 30;
> >>
> >>for your info, there are about 300k rows for that timeframe.
> >>
> >>it cost me about 57+ secs to get the list.
> >>
> >>which is about the same if i query
> >>select id, long2ip(srcip), long2ip(dstip) from sometable
> >>where timestamp between timestamp '01-10-2005' and timestamp
> >>'01-10-2005 23:59'
> >>
> >>it will cost me about 57+ secs also.
> >>
> >>Now if i did this
> >>select id,long2ip(srcip), long2ip(dstip) from (
> >>* from sometable
> >>where timestamp between timestamp '01-10-2005' and timestamp
> >>'01-10-2005 23:59' order by id limit 30) as t;
> >>
> >>it will cost me about 3+ secs

> >
> >
> >The difference will be that in the final case you only make 30 calls
> >to long2ip() whereas in the first two you call it 300,000 times and
> >then throw away most of them.
> >Try running EXPLAIN ANALYSE ... for both - that will show how PG is
> >planning the query.
> >--
> > Richard Huxton
> > Archonet Ltd
> >
> >

>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>


--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 01:54 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com