Unix Technical Forum

Sidetracking pg_autovacuum

This is a discussion on Sidetracking pg_autovacuum within the pgsql Admins forums, part of the PostgreSQL category; --> (Environment: Pg 7.4.1 on Linux; yes, I know it's past time to upgrade.) (Originally, I started to post this ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Admins

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-10-2008, 01:51 AM
Jeff Boes
 
Posts: n/a
Default Sidetracking pg_autovacuum

(Environment: Pg 7.4.1 on Linux; yes, I know it's past time to upgrade.)

(Originally, I started to post this as a question about how
pg_class.reltuples could get so far out of whack compared to the actual
table size. After posting it, I found that (a) I had the mailing list
address wrong [d'oh!], and (b) I think I solved my problem. In essence,
by doing a VACUUM in a script after a sizeable DELETE, apparently I took
the affected table out of the hands of pg_autovacuum so that it never,
ever did anything with the table. Including ANALYZE. Thus, reltuples
never got updated.)

I started tracking the COUNT(*) (actual row count) versus the value in
pg_class.reltuples for a number of our larger tables. Some of the tables
see a lot of INSERTs and DELETEs over the course of a day; as much as
1/12th of the data will be deleted overnight, and new data inserted over
the course of the day. I have pg_autovacuum running, and I also do
regular VACUUMs, ANALYZEs, and even some CLUSTERs on these tables.

[N.B.: In fact, I started doing a VACUUM explicitly after the big
nightly DELETE.]

One table in particular started to act "funny", which got me looking at
the innards of it, and I found that on our production system (which has
significantly higher throughput), the ratio of (pg_class.reltuples /
count(*)) would climb to 2.0 or higher; that seemed to indicate that
pg_autovacuum wasn't doing its job, at least not when I expected it to,
so I started doing a VACUUM after the DELETEs overnight, and that made
*that* problem go away.

However, now the test system (with the lower throughput) is behaving
oddly. The ratio has *fallen* to about 0.16, meaning there are *six
times as many* actual rows in the table compared to what the optimizer
thinks. I did a hand-analyze on the table, and the situation didn't
change. Right now, row count is about 182,000 and reltuples is under 29,000.

What would cause an otherwise well-behaved table to start doing this? Is
this just a "dead spot" in the ANALYZE command? (By which I mean:
ANALYZE randomly sampling rows, but my data is not terribly random, so
it gets fooled?)

[And here's the remaining question in my puzzled mind: ANALYZE would not
change the reltuples value, but VACUUM FULL ANALYZE did. Er-wha?]


--
Jeff Boes Vox 269-226-9550 x24
Director of Software Development Fax 269-349-9076

Exfacto! Exceptional Online Content http://www.exfacto.com
Nexcerpt ...Extend Your Expertise... http://www.nexcerpt.com



---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-10-2008, 01:51 AM
Tom Lane
 
Posts: n/a
Default Re: Sidetracking pg_autovacuum

Jeff Boes <jboes@nexcerpt.com> writes:
> What would cause an otherwise well-behaved table to start doing this? Is
> this just a "dead spot" in the ANALYZE command? (By which I mean:
> ANALYZE randomly sampling rows, but my data is not terribly random, so
> it gets fooled?)


7.4's ANALYZE is indeed easily fooled by nonuniform tuple density.
8.0 uses a different sampling algorithm that we hope will do better.

> [And here's the remaining question in my puzzled mind: ANALYZE would not
> change the reltuples value, but VACUUM FULL ANALYZE did. Er-wha?]


Any variant of VACUUM records the exact number of tuples that it found
in its full table scan in reltuples. It's only a standalone ANALYZE
that has to approximate.

The case where you saw reltuples > count(*) probably came from the fact
that VACUUM counts physical tuples --- that is, the difference reflects
dead-but-not-yet-removable tuples.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-10-2008, 01:51 AM
Chris Browne
 
Posts: n/a
Default Re: Sidetracking pg_autovacuum

jboes@nexcerpt.com (Jeff Boes) writes:
> (Environment: Pg 7.4.1 on Linux; yes, I know it's past time to
> upgrade.)


If you were running 7.2, I'd be pestering you ;-).

We didn't get rid of our last 7.1 system until last December, and I
think the backend is probably still running, albeit with nothing
hooking up to it :-).

> (Originally, I started to post this as a question about how
> pg_class.reltuples could get so far out of whack compared to the
> actual table size. After posting it, I found that (a) I had the
> mailing list address wrong [d'oh!], and (b) I think I solved my
> problem. In essence, by doing a VACUUM in a script after a sizeable
> DELETE, apparently I took the affected table out of the hands of
> pg_autovacuum so that it never, ever did anything with the
> table. Including ANALYZE. Thus, reltuples never got updated.)


Oops. Change that to a VACUUM ANALYZE (in your script), and things
should work out better.

> I started tracking the COUNT(*) (actual row count) versus the value
> in pg_class.reltuples for a number of our larger tables. Some of the
> tables see a lot of INSERTs and DELETEs over the course of a day; as
> much as 1/12th of the data will be deleted overnight, and new data
> inserted over the course of the day. I have pg_autovacuum running,
> and I also do regular VACUUMs, ANALYZEs, and even some CLUSTERs on
> these tables.
>
> [N.B.: In fact, I started doing a VACUUM explicitly after the big
> nightly DELETE.]


That's not a bad idea, all in all.

> What would cause an otherwise well-behaved table to start doing
> this? Is this just a "dead spot" in the ANALYZE command? (By which I
> mean: ANALYZE randomly sampling rows, but my data is not terribly
> random, so it gets fooled?)


Yes, that doubtless would be it. There's a new sampling scheme in
version 8 that may do better.

> [And here's the remaining question in my puzzled mind: ANALYZE would
> not change the reltuples value, but VACUUM FULL ANALYZE
> did. Er-wha?]


VACUUM ANALYZE would also change the reltuples value...
--
(format nil "~S@~S" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sap.html
Rules of the Evil Overlord #78. "I will not tell my Legions of Terror
"And he must be taken alive!" The command will be: ``And try to take
him alive if it is reasonably practical.''"
<http://www.eviloverlord.com/>
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 10:29 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com