Unix Technical Forum

Re: [HACKERS] Autovacuum loose ends

This is a discussion on Re: [HACKERS] Autovacuum loose ends within the Pgsql Patches forums, part of the PostgreSQL category; --> On Thu, Jul 14, 2005 at 10:52:56AM -0400, Tom Lane wrote: > I've applied Alvaro's latest integrated-autovacuum patch. There ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Patches

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 12:43 AM
Alvaro Herrera
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

On Thu, Jul 14, 2005 at 10:52:56AM -0400, Tom Lane wrote:
> I've applied Alvaro's latest integrated-autovacuum patch. There are
> still a number of loose ends to be dealt with before beta, though:


Ok, here's a patch that deals with some of this:

- The stat collector is modified so as to keep shared relations separate
from regular ones. Also, backends sends messages separately.
Autovacuum takes advantage of this, so it correctly identifies the
appropiate time to operate on a shared relation, irrespective of the
database where they were modified. Note however that it uses each
database's pg_autovacuum settings. This means it could be vacuumed
sooner in one database than another, but I don't think it's a problem.

- Temp tables are completely ignored.

- pg_statistic is completely ignored.

- databases with no stat entry are still ignored, except that they are
checked for Xid wraparound like any other. The "oldest" one is chosen
for vacuum in a particular autovacuum run.

- A database-wide vacuum forces a pg_database flat-file update, so that
the wraparound check actually works.

- The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
order to be able to pick naptimes smaller than 60 seconds. In order
not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
var.



Some comments:

- Now that we have a real Xid wraparound check, we could go back to
having any table with no stat entry ignored, which was the original
coding. There's no danger of wraparound, and there'd be no work done
to a table that doesn't have any activity. We have to consider what
happens at stat reset -- AFAICS there's no problem, because as soon as
the table sees some activity, it will be picked up by pgstat.
However, it would be bad if stats are reset right after some heavy
activity on a table. Maybe the only thing we need is documentation.

- datallowcon is still ignored. Now it's safe to do so, because we have
a real Xid wraparound check. Changing it requires extending the
pg_database flat-file (should be fairly easy).

- There are stat messages emitted for a database-wide vacuum, just like
any other. This means that all tables in the database would end up in
pgstat; and also all databases, including those with datallowconn = false.
This may not be good. I'm not sure what exactly to do about it. Do
we want to disallow such stats? Disable message sending (or
collecting) in some circumstances?

- I haven't done anything yet w.r.t. the custom vacuum_delay nor sleep
scale factor.

- There are still no docs.


--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Porque Kim no hacia nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 12:43 AM
Tom Lane
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> - pg_statistic is completely ignored.


.... pg_statistic still needs vacuuming, surely. It's only ANALYZE
that you can/should skip for it.

> - The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
> order to be able to pick naptimes smaller than 60 seconds. In order
> not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
> var.


Hmm, I wonder whether the minimum shouldn't be 10. Or even 60.

> - Now that we have a real Xid wraparound check, we could go back to
> having any table with no stat entry ignored, which was the original
> coding. There's no danger of wraparound, and there'd be no work done
> to a table that doesn't have any activity.


Agreed.

> We have to consider what
> happens at stat reset -- AFAICS there's no problem, because as soon as
> the table sees some activity, it will be picked up by pgstat.
> However, it would be bad if stats are reset right after some heavy
> activity on a table. Maybe the only thing we need is documentation.


What's the use-case for having the stat reset feature at all?

> - datallowcon is still ignored. Now it's safe to do so, because we have
> a real Xid wraparound check. Changing it requires extending the
> pg_database flat-file (should be fairly easy).


I think this is all right, as long as a database that shows no stats
traffic is only connected to when it needs to be vacuumed for XID wrap
prevention purposes.

> - There are stat messages emitted for a database-wide vacuum, just like
> any other. This means that all tables in the database would end up in
> pgstat; and also all databases, including those with datallowconn = false.
> This may not be good. I'm not sure what exactly to do about it. Do
> we want to disallow such stats? Disable message sending (or
> collecting) in some circumstances?


Needs thought...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 12:43 AM
Alvaro Herrera
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

On Sun, Jul 24, 2005 at 02:33:38PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > - pg_statistic is completely ignored.

>
> ... pg_statistic still needs vacuuming, surely. It's only ANALYZE
> that you can/should skip for it.


Sorry, yes, it's ignored only for analyze.

> > - The postmaster's main loop sleeps Min(60, autovacuum_naptime), in
> > order to be able to pick naptimes smaller than 60 seconds. In order
> > not to make the loop a busy-wait, I forced a minimum of 1 to that GUC
> > var.

>
> Hmm, I wonder whether the minimum shouldn't be 10. Or even 60.


It's ok with me. What do other people think?

> > We have to consider what
> > happens at stat reset -- AFAICS there's no problem, because as soon as
> > the table sees some activity, it will be picked up by pgstat.
> > However, it would be bad if stats are reset right after some heavy
> > activity on a table. Maybe the only thing we need is documentation.

>
> What's the use-case for having the stat reset feature at all?


I don't know. Maybe the people who added it can tell?


> > - There are stat messages emitted for a database-wide vacuum, just like
> > any other. This means that all tables in the database would end up in
> > pgstat; and also all databases, including those with datallowconn = false.
> > This may not be good. I'm not sure what exactly to do about it. Do
> > we want to disallow such stats? Disable message sending (or
> > collecting) in some circumstances?

>
> Needs thought...


Ok.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"I call it GNU/Linux. Except the GNU/ is silent." (Ben Reiter)

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 12:43 AM
Christopher Kings-Lynne
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

>> We have to consider what
>> happens at stat reset -- AFAICS there's no problem, because as soon as
>> the table sees some activity, it will be picked up by pgstat.
>> However, it would be bad if stats are reset right after some heavy
>> activity on a table. Maybe the only thing we need is documentation.

>
>
> What's the use-case for having the stat reset feature at all?


I believe I was the root cause of the pg_stat_reset() function. The
idea at the time was that if you decide to do a round of index
optimisation, you want to be able to search for unused indexes and
heavily seq. scanned tables.

If you reset the stats you have 'clean' data to work with. For
instance, you can get 24 hours of clean stats data.

Chris


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-18-2008, 12:43 AM
Matthew T. O'Connor
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

Alvaro Herrera wrote:

>On Sun, Jul 24, 2005 at 02:33:38PM -0400, Tom Lane wrote:
>
>
>>Hmm, I wonder whether the minimum shouldn't be 10. Or even 60.
>>
>>

>
>It's ok with me. What do other people think?
>
>


Effectiely, this is going to be the minimum amount of "down time" for
autovacuum between checking databases, right? So if the minimum is 10
seconds, and there I have six databases, then it will check each
database at most once per minute? If so, then I'm not sure what I think
if I have a few hundred databases, 10s might be too long.

>>What's the use-case for having the stat reset feature at all?
>>
>>

>
>I don't know. Maybe the people who added it can tell?
>
>


I don't know either, but this brings up another question. Stats
wraparound. The n_tup_ins/upd/del columns in the stats system are
defined as bigint, what happens when the total number of upd for example
exceeds the capacity for bigint, or overflows to negative, anyone have
any idea?

Matt


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-18-2008, 12:43 AM
Tom Lane
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

"Matthew T. O'Connor" <matthew@zeut.net> writes:
> I don't know either, but this brings up another question. Stats
> wraparound. The n_tup_ins/upd/del columns in the stats system are
> defined as bigint, what happens when the total number of upd for example
> exceeds the capacity for bigint, or overflows to negative, anyone have
> any idea?


We'll all be safely dead, for one thing ;-)

At one update per nanosecond, it'd take approximately 300 years to wrap
a 64-bit counter. Somehow I don't have a problem with the idea that
Postgres would need to be rebooted that often. We'd want to fix the
32-bit nature of XIDs long before 64-bit stats counters get to be a
real-world issue ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-18-2008, 12:44 AM
Matthew T. O'Connor
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

Tom Lane wrote:

>"Matthew T. O'Connor" <matthew@zeut.net> writes:
>
>
>>I don't know either, but this brings up another question. Stats
>>wraparound.
>>

>We'll all be safely dead, for one thing ;-)
>
>At one update per nanosecond, it'd take approximately 300 years to wrap
>a 64-bit counter. Somehow I don't have a problem with the idea that
>Postgres would need to be rebooted that often. We'd want to fix the
>32-bit nature of XIDs long before 64-bit stats counters get to be a
>real-world issue ...
>


*sigh* Sorry, I should have done a little math before I asked that
question.....

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-18-2008, 12:44 AM
Alvaro Herrera
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

On Mon, Jul 25, 2005 at 09:31:15AM +0800, Christopher Kings-Lynne wrote:
> >> We have to consider what
> >> happens at stat reset -- AFAICS there's no problem, because as soon as
> >> the table sees some activity, it will be picked up by pgstat.
> >> However, it would be bad if stats are reset right after some heavy
> >> activity on a table. Maybe the only thing we need is documentation.

> >
> >What's the use-case for having the stat reset feature at all?

>
> I believe I was the root cause of the pg_stat_reset() function. The
> idea at the time was that if you decide to do a round of index
> optimisation, you want to be able to search for unused indexes and
> heavily seq. scanned tables.
>
> If you reset the stats you have 'clean' data to work with. For
> instance, you can get 24 hours of clean stats data.


Ok, so there's a reason for having a manual stat-reset. However what's
the rationale for cleaning stats at postmaster start? In fact I think
it's actively bad because you lose any data you had before postmaster
stop/crash.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"I personally became interested in Linux while I was dating an English major
who wouldn't know an operating system if it walked up and bit him."
(Val Henson)

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-18-2008, 12:44 AM
Tom Lane
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Ok, so there's a reason for having a manual stat-reset. However what's
> the rationale for cleaning stats at postmaster start? In fact I think
> it's actively bad because you lose any data you had before postmaster
> stop/crash.


We probably *should* drop the stats file if any WAL replay activity
occurs, because the stats file could be out of sync with reality
--- this is particularly important in a PITR recovery situation,
where the stats file is likely to be WAY out of sync. (Maybe only
clobber it in PITR mode?)

I agree that the argument for doing it in a normal restart is pretty
weak.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-18-2008, 12:49 AM
Mark Wong
 
Posts: n/a
Default Re: [HACKERS] Autovacuum loose ends

I thought I'd run a couple of tests to see if it would be helpful
against CVS from Aug 3, 2005.

Here's a run with autovacuum turned off:
http://www.testing.osdl.org/projects...s/dev4-015/42/
5186.55 notpm

Autvacuum on with default settings:
http://www.testing.osdl.org/projects...s/dev4-015/38/
5462.23 notpm

Would it help more to try a series of parameter changes?

Mark

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:05 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com