Unix Technical Forum

Vacuum full - disk space eaten by WAL logfiles

04-10-2008, 01:15 AM
Lee Wu
 

Hi all,



When we do weekly "vacuum full", PG uses all space and causes PG down.



checkpoint_segments | 30
checkpoint_timeout | 300

select version();
version
-------------------------------------------------------------
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96



The error message is:

Jan 8 20:25:33 mybox postgres[1602]: [13] PANIC: ZeroFill failed to write /my/pg_xlog/xlogtemp.1602: No space left on device
Jan 8 20:25:35 mybox postgres[8213]: [163] LOG: removing transaction log file 00001AB2000000EC
Jan 8 20:25:35 mybox postgres[1602]: [14-1] LOG: statement: COPY xxxxx (domain, domain_id, customer_id, action_unspecified, action_unknown,
Jan 8 20:25:35 mybox postgres[8213]: [164] LOG: removing transaction log file 00001AB2000000ED
Jan 8 20:25:35 mybox postgres[8213]: [165] LOG: removing transaction log file 00001AB2000000EE
Jan 8 20:25:35 mybox postgres[1602]: [14-2] action_none, action_deny, action_fail, action_strip, action_tag, action_quarantine, action_clean, action_copy, action_allow,
Jan 8 20:25:35 mybox postgres[8213]: [166] LOG: removing transaction log file 00001AB2000000F0
Jan 8 20:25:35 mybox postgres[1602]: [14-3] module, period, created) FROM stdin
Jan 8 20:25:35 mybox postgres[8213]: [167] LOG: removing transaction log file 00001AB2000000F1
Jan 8 20:25:35 mybox postgres[8213]: [168] LOG: removing transaction log file 00001AB2000000F2
Jan 8 20:25:36 mybox postgres[8213]: [169] LOG: removing transaction log file 00001AB2000000F3
Jan 8 20:25:36 mybox postgres[8213]: [170] LOG: removing transaction log file 00001AB2000000F4
Jan 8 20:25:36 mybox postgres[8213]: [171] LOG: removing transaction log file 00001AB2000000F5
Jan 8 20:25:36 mybox postgres[862]: [13] LOG: server process (pid 1602) was terminated by signal 6
Jan 8 20:25:36 mybox postgres[862]: [14] LOG: terminating any other active server processes
Jan 8 20:25:36 mybox postgres[8601]: [13-1] WARNING: copy: line 1, Message from PostgreSQL backend:
Jan 8 20:25:36 mybox postgres[8601]: [13-2] ^IThe Postmaster has informed me that some other backend
Jan 8 20:25:36 mybox postgres[8601]: [13-3] ^Idied abnormally and possibly corrupted shared memory.
Jan 8 20:25:36 mybox postgres[8601]: [13-4] ^II have rolled back the current transaction and am
Jan 8 20:25:36 mybox postgres[7284]: [13-1] WARNING: Message from PostgreSQL backend:
Jan 8 20:25:36 mybox postgres[3658]: [13-1] WARNING: Message from PostgreSQL backend:



When the error happens, we see as many as 563 WAL files under pg_xlog
at one time. My understanding is that we should expect at most
61 WAL files (2 * checkpoint_segments + 1), each 16M.

Our /my/pg_xlog is a 10G partition and typically has 8G free.
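
For scale, the numbers above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the default 16 MB segment size, and it reads "10G" as 10 GiB and "8G left" as 8 GiB free, which the post does not spell out.

```python
# Back-of-the-envelope WAL space check using the numbers from this post.
SEGMENT_MB = 16              # default WAL segment size (assumed unchanged)
checkpoint_segments = 30     # setting reported above

# Normal ceiling on the number of segment files for this setting.
expected_max_files = 2 * checkpoint_segments + 1
expected_max_mb = expected_max_files * SEGMENT_MB

observed_files = 563         # count seen in pg_xlog during the failure
observed_mb = observed_files * SEGMENT_MB

partition_mb = 10 * 1024     # "10G partition", read as 10 GiB (assumption)
free_mb = 8 * 1024           # "8G left", read as 8 GiB free (assumption)

print("expected ceiling:", expected_max_files, "files,", expected_max_mb, "MB")
print("observed:", observed_files, "files,", observed_mb, "MB")
print("observed exceeds usual free space:", observed_mb > free_mb)
print("share of partition:", round(observed_mb / partition_mb, 2))
```

Under those assumptions the expected ceiling is a little under 1 GB, while 563 segments come to roughly nine times that, which is more than the space normally free on the partition.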



The table being vacuumed full is 35G.



What can I do to fix this mess??



Regards,




04-10-2008, 01:15 AM
Tom Lane
 

"Lee Wu" <Lwu@mxlogic.com> writes:
> When we do weekly "vacuum full", PG uses all space and causes PG down.


This implies that checkpoints aren't completing for some reason.
If they were, they'd be recycling WAL space.
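
One rough way to see from outside whether WAL space is being recycled is to count the segment files in pg_xlog and compare the count with 2 * checkpoint_segments + 1. The sketch below is an illustration only; the function names are hypothetical, and it assumes segment files are named with uppercase hex digits only (16 characters, as in the log above, or 24 on later releases).

```python
import os
import re

# Assumption: WAL segment file names consist solely of hex digits,
# 16 characters long as in the log above (24 on later releases).
SEGMENT_NAME = re.compile(r"^[0-9A-F]{16}(?:[0-9A-F]{8})?$")


def count_wal_segments(xlog_dir):
    """Return the number of WAL segment files found in xlog_dir."""
    return sum(1 for name in os.listdir(xlog_dir) if SEGMENT_NAME.match(name))


def over_ceiling(xlog_dir, checkpoint_segments):
    """True if xlog_dir holds more than 2 * checkpoint_segments + 1 segments."""
    return count_wal_segments(xlog_dir) > 2 * checkpoint_segments + 1
```

With the settings from the original post this would be called as over_ceiling("/my/pg_xlog", 30). Run periodically during the weekly job, a count that keeps climbing past 61 would be consistent with checkpoints not completing.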

I'm not aware of any problems in 7.3 that would block a checkpoint
indefinitely, but we have seen cases where it just took too darn long
to do the checkpoint --- implying either a ridiculously large
shared_buffers setting, or a drastic shortage of I/O bandwidth.

You might want to try strace'ing the checkpoint process to see if it
seems to be making progress or not.

Also, are you certain that this is happening during a VACUUM? The
log messages you show refer to COPY commands.
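
To check which commands were actually running at the time, the syslog output can be scanned for statement lines. The following is a minimal sketch that assumes the line layout shown in the original post and that statement logging is enabled; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   ... postgres[1602]: [14-1] LOG: statement: COPY xxxxx (domain, ...
STATEMENT = re.compile(r"postgres\[\d+\]: \[[\d-]+\] LOG: +statement: +(\w+)")


def statement_counts(lines):
    """Count logged statements by their leading keyword (COPY, VACUUM, ...)."""
    counts = Counter()
    for line in lines:
        match = STATEMENT.search(line)
        if match:
            counts[match.group(1).upper()] += 1
    return counts
```

Feeding it the lines of the log file from around the time of the PANIC would show whether any VACUUM statements appear next to the COPY commands.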

> PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96


Are you aware of the number and significance of post-7.3.2 bug fixes
in the 7.3 branch? You really ought to be on 7.3.8, if you can't afford
to migrate to 7.4 right now.

regards, tom lane

