Unix Technical Forum

Re: How to keep at-most N rows per group? periodic DELETEs orconstraints or..?

This is a discussion on Re: How to keep at-most N rows per group? periodic DELETEs orconstraints or..? within the pgsql Sql forums, part of the PostgreSQL category; --> At 07:20 AM 1/9/2008, pgsql-sql-owner@postgresql.org wrote: >Date: Tue, 8 Jan 2008 17:41:18 +0000 >From: "Jamie Tufnell" <diesql@googlemail.com> >To: pgsql-sql@postgresql.org ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Sql

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 06:00 PM
Steve Midgley
 
Posts: n/a
Default Re: How to keep at-most N rows per group? periodic DELETEs orconstraints or..?

At 07:20 AM 1/9/2008, pgsql-sql-owner@postgresql.org wrote:
>Date: Tue, 8 Jan 2008 17:41:18 +0000
>From: "Jamie Tufnell" <diesql@googlemail.com>
>To: pgsql-sql@postgresql.org
>Subject: Re: How to keep at-most N rows per group? periodic DELETEs or
>constraints or..?
>Message-ID:
><b0a4f3350801080941x5b4cccc9qbf6220ab35a0bf57@mai l.gmail.com>
>
>On 1/8/08, codeWarrior <gpatnude@hotmail.com> wrote:
> > Jamie:
> >
> > I think you are probably having slowdown issues in your "DELETE

> FROM WHERE
> > NOT IN SELECT ORDER BY DESCENDING" construct -- that seems a bit

> convoluted
> > to me....

>
>Hmm so rather than NOT IN ( .. LIMIT 50) would you suggest IN ( ...
>OFFSET 50) like in Erik's example? Or something else entirely?
>
> > ALSO: It looks to me like you have a column named "timestamp' ???

> This is
> > bad practice since "timestamp" is a reserved word... You really

> ought NOT to
> > use reserved words for column names... different debate.

>
>I do realize it would be better to use something else and thanks for
>the tip This is an established database and "timestamp" has been
>used in other tables which is why I stuck to it here.. one day when
>time permits maybe I'll rename them all!
>
> > Why bother deleting records anyway ? Why not alter your query that

> tracks
> > the 50 records to LIMIT 50 ???

>
>The read query does LIMIT 50 and the reason for deleting the rest of
>the records is because they're not needed by the application and
>there's loads of them being created all the time (currently several
>million unnecessary rows) -- I imagine eventually this will slow
>things down?
>
>Do you think a regular batch process to delete rows might be more
>appropriate than a trigger in this scenario?
>
>Thanks,
>Jamie


This is kludgy but you would have some kind of random number test at
the start of the trigger - if it evals true once per every ten calls to
the trigger (say), you'd cut your delete statements execs by about 10x
and still periodically truncate every set of user rows fairly often. On
average you'd have ~55 rows per user, never less than 50 and a few
outliers with 60 or 70 rows before they get trimmed back down to 50..
Seems more reliable than a cron job, and solves your problem of an ever
growing table? You could adjust the random number test easily if you
change your mind of the balance of size of table vs. # of delete
statements down the road.

Steve



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 06:00 PM
Scott Marlowe
 
Posts: n/a
Default Re: Re: How to keep at-most N rows per group? periodic DELETEs or constraints or..?

On Jan 9, 2008 12:20 PM, Steve Midgley <public@misuse.org> wrote:
> This is kludgy but you would have some kind of random number test at
> the start of the trigger - if it evals true once per every ten calls to
> the trigger (say), you'd cut your delete statements execs by about 10x
> and still periodically truncate every set of user rows fairly often. On
> average you'd have ~55 rows per user, never less than 50 and a few
> outliers with 60 or 70 rows before they get trimmed back down to 50..
> Seems more reliable than a cron job, and solves your problem of an ever
> growing table? You could adjust the random number test easily if you
> change your mind of the balance of size of table vs. # of delete
> statements down the road.


And, if you always through a limit 50 on the end of queries that
retrieve data, you could let it grow quite a bit more than 60 or 70...
Say 200. Then you could have it so that the random chopper function
only gets kicked off every 100th or so time.

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 06:00 PM
Erik Jones
 
Posts: n/a
Default Re: Re: How to keep at-most N rows per group? periodic DELETEs or constraints or..?


On Jan 9, 2008, at 1:09 PM, Scott Marlowe wrote:

> On Jan 9, 2008 12:20 PM, Steve Midgley <public@misuse.org> wrote:
>> This is kludgy but you would have some kind of random number test at
>> the start of the trigger - if it evals true once per every ten
>> calls to
>> the trigger (say), you'd cut your delete statements execs by about
>> 10x
>> and still periodically truncate every set of user rows fairly
>> often. On
>> average you'd have ~55 rows per user, never less than 50 and a few
>> outliers with 60 or 70 rows before they get trimmed back down to 50..
>> Seems more reliable than a cron job, and solves your problem of an
>> ever
>> growing table? You could adjust the random number test easily if you
>> change your mind of the balance of size of table vs. # of delete
>> statements down the road.

>
> And, if you always through a limit 50 on the end of queries that
> retrieve data, you could let it grow quite a bit more than 60 or 70...
> Say 200. Then you could have it so that the random chopper function
> only gets kicked off every 100th or so time.


I like that idea.

Erik Jones

DBA | EmmaŽ
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 06:00 PM
Jamie Tufnell
 
Posts: n/a
Default Re: Re: How to keep at-most N rows per group? periodic DELETEs or constraints or..?

On 1/9/08, Erik Jones <erik@myemma.com> wrote:
> On Jan 9, 2008, at 1:09 PM, Scott Marlowe wrote:
> > On Jan 9, 2008 12:20 PM, Steve Midgley <public@misuse.org> wrote:
> >> This is kludgy but you would have some kind of random number test at
> >> the start of the trigger - if it evals true once per every ten
> >> calls to
> >> the trigger (say), you'd cut your delete statements execs by about
> >> 10x
> >> and still periodically truncate every set of user rows fairly
> >> often.

> >
> > And, if you always through a limit 50 on the end of queries that
> > retrieve data, you could let it grow quite a bit more than 60 or 70...
> > Say 200. Then you could have it so that the random chopper function
> > only gets kicked off every 100th or so time.

>
> I like that idea.


I do too! I'm going to have a shot at implementing this tomorrow.
Thanks for all your opinions guys :-)

Cheers,
Jamie

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:36 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com