Unix Technical Forum

12 hour table vacuums

This is a discussion on 12 hour table vacuums within the Pgsql Performance forums, part of the PostgreSQL category; --> We vacuum only a few of our tables nightly, this one is the last one because it takes longer ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 10:41 AM
Ron St-Pierre
 
Posts: n/a
Default 12 hour table vacuums

We vacuum only a few of our tables nightly, this one is the last one
because it takes longer to run. I'll probably re-index it soon, but I
would appreciate any advice on how to speed up the vacuum process (and
the db in general).

Okay, here's our system:
postgres 8.1.4
Linux version 2.4.21
Red Hat Linux 3.2.3
8 GB ram
Intel(R) Xeon(TM) CPU 3.20GHz
Raid 5
autovacuum=off
serves as the application server and database server
server is co-located in another city, hardware upgrade is not
currently an option

Here's the table information:
The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes. It
is probably our 'key' table in the database and gets called by almost
every query (usually joined to others). The table gets updated only
about 10 times a day. We were running autovacuum but it interfered with
the updates to we shut it off. We vacuum this table nightly, and it
currently takes about 12 hours to vacuum it. Not much else is running
during this period, nothing that should affect the table.

Here are the current non-default postgresql.conf settings:
max_connections = 100
shared_buffers = 50000
work_mem = 9192
maintenance_work_mem = 786432
max_fsm_pages = 70000
vacuum_cost_delay = 200
vacuum_cost_limit = 100
bgwriter_delay = 10000
fsync = on
checkpoint_segments = 64
checkpoint_timeout = 1800
effective_cache_size = 270000
random_page_cost = 2
log_destination = 'stderr'
redirect_stderr = on
client_min_messages = warning
log_min_messages = warning
stats_start_collector = off
stats_command_string = on
stats_block_level = on
stats_row_level = on
autovacuum = off
autovacuum_vacuum_threshold = 2000
deadlock_timeout = 10000
max_locks_per_transaction = 640
add_missing_from = on

As I mentioned, any insights into changing the configuration to optimize
performance are most welcome.

Thanks

Ron

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 10:41 AM
Bill Moran
 
Posts: n/a
Default Re: 12 hour table vacuums

In response to Ron St-Pierre <ron.pgsql@shaw.ca>:

> We vacuum only a few of our tables nightly, this one is the last one
> because it takes longer to run. I'll probably re-index it soon, but I
> would appreciate any advice on how to speed up the vacuum process (and
> the db in general).


I doubt anyone can provide meaningful advice without the output of
vacuum verbose.

>
> Okay, here's our system:
> postgres 8.1.4
> Linux version 2.4.21
> Red Hat Linux 3.2.3
> 8 GB ram
> Intel(R) Xeon(TM) CPU 3.20GHz
> Raid 5
> autovacuum=off
> serves as the application server and database server
> server is co-located in another city, hardware upgrade is not
> currently an option
>
> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes. It
> is probably our 'key' table in the database and gets called by almost
> every query (usually joined to others). The table gets updated only
> about 10 times a day. We were running autovacuum but it interfered with
> the updates to we shut it off. We vacuum this table nightly, and it
> currently takes about 12 hours to vacuum it. Not much else is running
> during this period, nothing that should affect the table.
>
> Here are the current non-default postgresql.conf settings:
> max_connections = 100
> shared_buffers = 50000
> work_mem = 9192
> maintenance_work_mem = 786432
> max_fsm_pages = 70000
> vacuum_cost_delay = 200
> vacuum_cost_limit = 100
> bgwriter_delay = 10000
> fsync = on
> checkpoint_segments = 64
> checkpoint_timeout = 1800
> effective_cache_size = 270000
> random_page_cost = 2
> log_destination = 'stderr'
> redirect_stderr = on
> client_min_messages = warning
> log_min_messages = warning
> stats_start_collector = off
> stats_command_string = on
> stats_block_level = on
> stats_row_level = on
> autovacuum = off
> autovacuum_vacuum_threshold = 2000
> deadlock_timeout = 10000
> max_locks_per_transaction = 640
> add_missing_from = on
>
> As I mentioned, any insights into changing the configuration to optimize
> performance are most welcome.
>
> Thanks
>
> Ron
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>
>
>
>
>



--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

************************************************** **************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
************************************************** **************

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 10:41 AM
Alvaro Herrera
 
Posts: n/a
Default Re: 12 hour table vacuums

Ron St-Pierre wrote:

> Okay, here's our system:
> postgres 8.1.4


Upgrade to 8.1.10

> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes.


60 indexes? You gotta be kidding. You really have 60 columns on which
to scan?

> vacuum_cost_delay = 200
> vacuum_cost_limit = 100


Isn't this a bit high? What happens if you cut the delay to, say, 10?
(considering you've lowered the limit to half the default)

--
Alvaro Herrera Developer, http://www.PostgreSQL.org/
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude." (Brian Kernighan)

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 10:41 AM
Tom Lane
 
Posts: n/a
Default Re: 12 hour table vacuums

Ron St-Pierre <ron.pgsql@shaw.ca> writes:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes. It
> is probably our 'key' table in the database and gets called by almost
> every query (usually joined to others). The table gets updated only
> about 10 times a day. We were running autovacuum but it interfered with
> the updates to we shut it off. We vacuum this table nightly, and it
> currently takes about 12 hours to vacuum it. Not much else is running
> during this period, nothing that should affect the table.


Here is your problem:

> vacuum_cost_delay = 200


If you are only vacuuming when nothing else is happening, you shouldn't
be using vacuum_cost_delay at all: set it to 0. In any case this value
is probably much too high. I would imagine that if you watch the
machine while the vacuum is running you'll find both CPU and I/O load
near zero ... which is nice, unless you would like the vacuum to finish
sooner.

In unrelated comments:

> maintenance_work_mem = 786432


That seems awfully high, too.

> max_fsm_pages = 70000


And this possibly too low --- are you sure you are not leaking disk
space?

> stats_start_collector = off
> stats_command_string = on
> stats_block_level = on
> stats_row_level = on


These are not self-consistent.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 10:41 AM
Csaba Nagy
 
Posts: n/a
Default Re: 12 hour table vacuums

On Tue, 2007-10-23 at 08:53 -0700, Ron St-Pierre wrote:
> [snip] We were running autovacuum but it interfered with
> the updates to we shut it off.


This is not directly related to your question, but it might be good for
your DB: you don't need to turn off autovacuum, you can exclude tables
individually from being autovacuumed by inserting the appropriate rows
in pg_autovacuum. See:

http://www.postgresql.org/docs/8.1/s...utovacuum.html

We also do have here a few big tables which we don't want autovacuum to
touch, so we disable them via pg_autovacuum. There are a few really big
ones which change rarely - those we only vacuum via a DB wide vacuum in
the weekend (which for us is a low activity period). If you say your
table is only changed rarely, you might be OK too with such a setup...

Cheers,
Csaba.



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 10:41 AM
Ron St-Pierre
 
Posts: n/a
Default Re: 12 hour table vacuums

Bill Moran wrote:
> In response to Ron St-Pierre <ron.pgsql@shaw.ca>:
>
>
>> We vacuum only a few of our tables nightly, this one is the last one
>> because it takes longer to run. I'll probably re-index it soon, but I
>> would appreciate any advice on how to speed up the vacuum process (and
>> the db in general).
>>

>
> I doubt anyone can provide meaningful advice without the output of
> vacuum verbose.
>
>

The cron job is still running
/usr/local/pgsql/bin/vacuumdb -d imperial -t stock.fdata -v -z >
/usr/local/pgsql/bin/fdata.txt
I'll post the output when it's finished.

Ron


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 10:41 AM
Gregory Stark
 
Posts: n/a
Default Re: 12 hour table vacuums

"Ron St-Pierre" <ron.pgsql@shaw.ca> writes:

> We vacuum only a few of our tables nightly, this one is the last one because it
> takes longer to run. I'll probably re-index it soon, but I would appreciate any
> advice on how to speed up the vacuum process (and the db in general).

....
> vacuum_cost_delay = 200


Well speeding up vacuum isn't really useful in itself. In fact you have vacuum
configured to run quite slowly by having vacuum_cost_delay set so high. You
have it set to sleep 200ms every few pages. If you lower that it'll run faster
but take more bandwidth away from the foreground tasks.

> Here's the table information:
> The table has 140,000 rows, 130 columns (mostly NUMERIC), 60 indexes.


For what it's worth NUMERIC columns take more space than you might expect.
Figure a minimum of 12 bytes your rows are at about 1.5k each even if the
non-numeric columns aren't large themselves. What are the other columns?

> We were running autovacuum but it interfered with the updates to we shut it
> off.


Was it just the I/O bandwidth? I'm surprised as your vacuum_cost_delay is
quite high. Manual vacuum doesn't do anything differently from autovacuum,
neither should interfere directly with updates except by taking away
I/O bandwidth.

> We vacuum this table nightly, and it currently takes about 12 hours to
> vacuum it. Not much else is running during this period, nothing that should
> affect the table.


Is this time increasing over time? If once a day isn't enough then you may be
accumulating more and more dead space over time. In which case you may be
better off running it during prime time with a large vacuum_cost_delay (like
the 200 you have configured) rather than trying to get to run fast enough to
fit in the off-peak period.

> deadlock_timeout = 10000


I would not suggest having this quite this high. Raising it from the default
is fine but having a value larger than your patience is likely to give you the
false impression that something is hung if you should ever get a deadlock.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 10:41 AM
Bill Moran
 
Posts: n/a
Default Re: 12 hour table vacuums

In response to Ron St-Pierre <ron.pgsql@shaw.ca>:

> Bill Moran wrote:
> > In response to Ron St-Pierre <ron.pgsql@shaw.ca>:
> >
> >
> >> We vacuum only a few of our tables nightly, this one is the last one
> >> because it takes longer to run. I'll probably re-index it soon, but I
> >> would appreciate any advice on how to speed up the vacuum process (and
> >> the db in general).
> >>

> >
> > I doubt anyone can provide meaningful advice without the output of
> > vacuum verbose.


Understood, however I may have spoken too soon. It appears that Tom
found an obvious issue with your config that seems likely to be the
problem.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 10:41 AM
Ron St-Pierre
 
Posts: n/a
Default Re: 12 hour table vacuums

Tom Lane wrote:
> Here is your problem:
>
>
>> vacuum_cost_delay = 200
>>

>
> If you are only vacuuming when nothing else is happening, you shouldn't
> be using vacuum_cost_delay at all: set it to 0. In any case this value
> is probably much too high. I would imagine that if you watch the
> machine while the vacuum is running you'll find both CPU and I/O load
> near zero ... which is nice, unless you would like the vacuum to finish
> sooner.
>

Yeah, I've noticed that CPU, mem and I/O load are really low when this
is running. I'll change that setting.
> In unrelated comments:
>
>
>> maintenance_work_mem = 786432
>>

>
> That seems awfully high, too.
>
>

Any thoughts on a more reasonable value?
>> max_fsm_pages = 70000
>>

>
> And this possibly too low ---

The default appears to be 20000, so I upped it to 70000. I'll try 160000
(max_fsm_relations*16).
> are you sure you are not leaking disk
> space?
>
>

What do you mean leaking disk space?
>> stats_start_collector = off
>> stats_command_string = on
>> stats_block_level = on
>> stats_row_level = on
>>

>
> These are not self-consistent.
>
> regards, tom lane
>
>



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 10:41 AM
Harald Fuchs
 
Posts: n/a
Default Re: 12 hour table vacuums

In article <471E26E4.7040804@shaw.ca>,
Ron St-Pierre <ron.pgsql@shaw.ca> writes:

>> For what it's worth NUMERIC columns take more space than you might expect.
>> Figure a minimum of 12 bytes your rows are at about 1.5k each even if the
>> non-numeric columns aren't large themselves. What are the other columns?


> The NUMERIC columns hold currency related values, with values ranging
> from a few cents to the billions, as well as a few negative numbers.


What's the required precision? If it's just cents (or maybe tenths
thereof), you could use BIGINT to store the amount in this precision.
This would give you exact values with much less space.


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 11:35 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com