This is a discussion on Re: Vacuum Full Analyze Stalled within the pgsql Admins forums, part of the PostgreSQL category; --> We will use gdb and strace the next time we see this. I've tried to be specific about which ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| We will use gdb and strace the next time we see this. I've tried to be specific about which vacuum is running in all cases. If the posts have been confusing on that issue, I apologize. I'll try to be clear on this in future posts. To summarize past events, the case involving the constraint index was indeed a "vacuum full" of the entire database under heavy load. Autovacuum failed to keep the small, high-update table clean in that scenario, but I am not sure whether that caused the failure of the vacuum full, or was the result of it. This weekend, it seemed like the first thing which failed (and the last) were autovacuum attempts. Vacuum full was run through psql during attempts to recover performance after the failure of autovacuum caused performance to slow noticably. We didn't capture info which would tell us whether the explicit vacuum was blocked by an autovacuum process. There were a few very small single-source tests under 8.0.3, but all tests involving any significant load were under 8.1beta1 or 8.1beta2. We did not see this in any of those small tests under 8.0.3. -Kevin >>> Tom Lane <tgl@sss.pgh.pa.us> 10/03/05 3:48 PM >>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > However, I'm looking at the autovacuum code to see why it's sitting > holding locks on the small table and not vacuuming it. I see on the > pg_locks output that process 3158 (autovacuum) has got locks on the > table and index, but it apparently isn't vacuuming the table. If this > is correct, it's a bug. However I can't seem to find out why this > happens. We can see clearly from the pg_locks output that VACUUM isn't waiting for an lmgr lock, so the problem must be at a lower level. The hypothesis I'm thinking about is that VACUUM is trying to do LockBufferForCleanup() and for some reason it never finishes. There are a number of possible scenarios that could explain this: leaked buffer pin, dropped signal, etc. > Kevin, Jeff, next time this happens please attach gdb to the autovacuum > process and get a stack trace ("bt" to gdb), if at all possible, and/or > strace it to see what it's doing. Please! Also, we need to keep a little clarity about what we are dealing with. This thread has mentioned hangups in both plain vacuum (autovacuum) and VACUUM FULL. It seems very likely to me that there are different mechanisms involved --- since VACUUM FULL takes an exclusive lock on the whole table, that eliminates an entire class of possible explanations for the plain-VACUUM case, while introducing a whole new set of explanations having to do with the VACUUM being queued up behind ordinary table locks. Please be perfectly clear about which scenario each report is about. Finally, I'm wondering whether this bug is new in 8.1 or is pre-existing. Has this same application been running successfully in 8.0? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| ||||
| On Mon, Oct 03, 2005 at 04:37:17PM -0500, Kevin Grittner wrote: > We will use gdb and strace the next time we see this. > > I've tried to be specific about which vacuum is running in all cases. If > the posts have been confusing on that issue, I apologize. I'll try to be > clear on this in future posts. > > To summarize past events, the case involving the constraint index > was indeed a "vacuum full" of the entire database under heavy load. > Autovacuum failed to keep the small, high-update table clean in that > scenario, but I am not sure whether that caused the failure of the > vacuum full, or was the result of it. This weekend, it seemed like the > first thing which failed (and the last) were autovacuum attempts. > Vacuum full was run through psql during attempts to recover > performance after the failure of autovacuum caused performance > to slow noticably. We didn't capture info which would tell us whether > the explicit vacuum was blocked by an autovacuum process. Keep in mind that vacuum full is *very* aggressive for use in a production environment. It aquires exclusive locks on tables, which means everything else will grind to a complete halt while it's running. Unless you have a very specific reason to use vacuum full, you should just use plain vacuum (not related to autovacuum). If you are going to vacuum full, you should consider using the cluster command which has some added benefits. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |