Unix Technical Forum

SEO

vBulletin Search Engine Optimization


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 05:32 AM
Tom Lane
 
Posts: n/a
Default Proposal: vacuum and autovacuum parameters to control freezing

I wrote:
> It's usually going to be the case that the oldest datvacuumxid is
> template0's, meaning that it will never be possible to truncate clog
> until autovacuum decides that template0 is at risk of wraparound and
> goes and vacuums it. Shortening the freeze horizon will reduce the size
> that pg_clog occupies just *after* that happens, but we're still going
> to see pg_clog bloating up to something close to 256MB before autovacuum
> kicks in.


After further thought I see that there are actually two parameters
involved in this process:

1. the critical age (currentXID - relfrozenxid) beyond which autovacuum
will force a vacuum of a particular table to forestall XID wraparound.
(Note: as the 8.2 code stands, the system will launch autovacuums even
when autovac is nominally disabled in order to fix tables that have
exceeded a hard-wired critical age.)

2. the freeze distance vacuum (whether auto or normal) uses to determine
the new cutoff point, ie, the new relfrozenxid for the table.

We can make a few observations:

* For a table that otherwise goes unvacuumed, the interval between
forced anti-wraparound vacuums will be critical_age - freeze_distance.
Therefore, for large static tables there is value in being able to
adjust this difference to be as large as possible.

* The size of pg_clog is determined by the system-wide maximum of
critical_age + number-of-transactions-needed-to-finish-vacuuming.
Therefore, critical_age is the knob we must expose if we want to
provide user control of pg_clog growth.

* It might seem that there's no point in per-table adjustment of
critical_age, since only the system-wide maximum means anything for
resource consumption. I'm not so sure though --- for a really large
table, the time needed to finish vacuuming it could be significant,
meaning it would need a lower critical age than other tables. With the
current one-process-at-a-time autovac infrastructure, this probably
isn't very important, but we've been talking about allowing multiple
parallel autovacuums specifically to deal with the problem of some
tables being much larger than others.

So it seems to me that an argument can be made for creating two new
GUC variables and adding two columns to pg_autovacuum:

vacuum_freeze_distance: number of transactions back from current
that a VACUUM will use as the freeze cutoff point, ie, XIDs older
than that will be replaced by FrozenXID, and the cutoff point will
become the table's new relfrozenxid value. Valid range zero to
perhaps 1 billion. VACUUM FREEZE is a shorthand for doing a
vacuum with vacuum_freeze_distance = 0.

autovacuum_freeze_limit: maximum age of relfrozenxid before autovacuum
will force a vacuum for anti-wraparound purposes. Valid range perhaps
100 million to (2 billion - 100 million).

pg_autovacuum.freeze_distance: per-table vacuum_freeze_distance setting
for autovacuum to use.

pg_autovacuum.freeze_limit: per-table autovacuum_freeze_limit for
autovacuum to use.

(I'm not wedded to these names, anyone have better ideas?)

I'd propose default values of 200 million for autovacuum_freeze_limit
and half that for vacuum_freeze_distance, resulting in a maximum pg_clog
size of 50MB and forced autovacs about every 100 million transactions.

One minor point is that while the values of these variables have to have
sane relationships to each other, the GUC infrastructure doesn't really
allow us to enforce such a constraint directly (the behavior would be
too dependent on which variable got set first). I'd suggest making
vacuum just silently limit the effective freeze_distance to not more
than half of the system's autovacuum_freeze_limit, rather than trying
to enforce any relationship within GUC.

This is kind of a lot to be inventing in late beta, but if we want to
have a really credible solution to the WAL-versus-freezing problem
I think we need to do all of this.

Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 05:32 AM
Simon Riggs
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

On Sat, 2006-11-04 at 12:35 -0500, Tom Lane wrote:
> I wrote:
> > It's usually going to be the case that the oldest datvacuumxid is
> > template0's, meaning that it will never be possible to truncate clog
> > until autovacuum decides that template0 is at risk of wraparound and
> > goes and vacuums it. Shortening the freeze horizon will reduce the size
> > that pg_clog occupies just *after* that happens, but we're still going
> > to see pg_clog bloating up to something close to 256MB before autovacuum
> > kicks in.

>
> After further thought I see that there are actually two parameters
> involved in this process:
>
> 1. the critical age (currentXID - relfrozenxid) beyond which autovacuum
> will force a vacuum of a particular table to forestall XID wraparound.
> (Note: as the 8.2 code stands, the system will launch autovacuums even
> when autovac is nominally disabled in order to fix tables that have
> exceeded a hard-wired critical age.)
>
> 2. the freeze distance vacuum (whether auto or normal) uses to determine
> the new cutoff point, ie, the new relfrozenxid for the table.


I guess I don't fully understand the way you've described this, so I'd
like to put my own understanding to see if they both agree. Getting that
right is the key to understanding the proposal more fully (for me) - I
get the bit about 2 parameters...

If we take the current Xid as zero, we can go back in time to various
Xids using a timeline. That timeline can be divided into various Eras,
as with Geologic time (Jurassic, Triassic, Ice Ages etc). With
PostgreSQL, we used to have 5 eras:

1. In Progress Transactions 0 - OldestXmin
<---- limit is OldestXmin
2. Completed, normal Xids, status in clog
<---- limit is oldest Xid in clog (no specific name in code)
3. Completed, normal Xids, status marked on tuple
<---- limit is Freeze distance?
4. Frozen Xids, tuples frozen early to ensure nothing passes Wrap
<---- critical age? - practical last point to forestall Wrap
<---- limit is Wraparound
5. Frozen Xids (PreHistory)

now we have agreed to have only 4 eras, IIRC:

1. In Progress Transactions 0 - OldestXmin
<---- limit is OldestXmin
2. Completed, normal Xids, status in clog
<---- limit is Freeze distance?
3. Frozen Xids, tuples frozen early to ensure nothing passes Wrap
<---- critical age? - practical last point to forestall Wrap
<---- limit is Wraparound
4. Frozen Xids (PreHistory)

Perhaps you could edit the above if needed? I'm not making a separate
proposal, just trying to get it very clear.

If we can explain this simply now, then we stand a chance of other
people understanding it as well and setting these parameters correctly.
Otherwise we'll be doing the explanation hundreds of times on list/IRC.

AFICS freeze_limit and freeze_distance are both expressed in number of
Xids before current, so the "units" are the same for both. In that case,
maybe slightly more differentiated names would be appropriate.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 05:32 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

Tom Lane wrote:

> * It might seem that there's no point in per-table adjustment of
> critical_age, since only the system-wide maximum means anything for
> resource consumption. I'm not so sure though --- for a really large
> table, the time needed to finish vacuuming it could be significant,
> meaning it would need a lower critical age than other tables. With the
> current one-process-at-a-time autovac infrastructure, this probably
> isn't very important, but we've been talking about allowing multiple
> parallel autovacuums specifically to deal with the problem of some
> tables being much larger than others.


I think a global critical_age parameter is just fine. If you have one
huge table that takes a long time to vacuum, just adjust critical_age so
that there's enough time for the huge table vacuum to finish before
wrap-around. That means that other smaller tables are vacuumed more
frequently than would otherwise be necessary, but that's not a big deal
if the other tables really are much smaller.

> pg_autovacuum.freeze_distance: per-table vacuum_freeze_distance setting
> for autovacuum to use.


Shouldn't this be used for manual vacuums as well?

> I'd propose default values of 200 million for autovacuum_freeze_limit
> and half that for vacuum_freeze_distance, resulting in a maximum pg_clog
> size of 50MB and forced autovacs about every 100 million transactions.


Sounds fine to me.

> One minor point is that while the values of these variables have to have
> sane relationships to each other, the GUC infrastructure doesn't really
> allow us to enforce such a constraint directly (the behavior would be
> too dependent on which variable got set first). I'd suggest making
> vacuum just silently limit the effective freeze_distance to not more
> than half of the system's autovacuum_freeze_limit, rather than trying
> to enforce any relationship within GUC.


Makes sense.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 05:32 AM
Tom Lane
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

"Simon Riggs" <simon@2ndquadrant.com> writes:
> now we have agreed to have only 4 eras, IIRC:


> 1. In Progress Transactions 0 - OldestXmin
> <---- limit is OldestXmin
> 2. Completed, normal Xids, status in clog
> <---- limit is Freeze distance?
> 3. Frozen Xids, tuples frozen early to ensure nothing passes Wrap
> <---- critical age? - practical last point to forestall Wrap
> <---- limit is Wraparound
> 4. Frozen Xids (PreHistory)


You've got the eras right but not the boundaries. The limit on era 2 is
the current system-wide minimum relfrozenxid (or equivalently, the
current system-wide minimum datfrozenxid), which is where we have
truncated clog. That will normally be a little more than
autovacuum_freeze_limit --- the difference corresponding to the
"reaction time" needed to fire up autovac and get through all the tables
that have exceeded autovacuum_freeze_limit. The freeze_distance has to
be considerably *less* than this, else we'll constantly be firing new
autovac cycles each of which will freeze just a few more tuples.

> Perhaps you could edit the above if needed?


1. In Progress Transactions 0 - OldestXmin
<---- limit is OldestXmin
2. Completed, normal Xids, status in clog
<---- limit is length of clog (a bit more than freeze_limit)
<---- Xids older than freeze_distance will be frozen whenever
next visited by VACUUM, but there is no forcing function
until they exceed freeze_limit
3. Frozen Xids, tuples frozen early to ensure nothing passes Wrap
<---- limit is Wraparound
4. Frozen Xids (PreHistory)

We also have that the time between forced autovacuums of unchanging
tuples is approximately freeze_limit - freeze_distance.

> AFICS freeze_limit and freeze_distance are both expressed in number of
> Xids before current, so the "units" are the same for both.


Check. Actually, as coded there's a bit of difference: freeze_distance
is subtracted from OldestXmin whereas freeze_limit is subtracted from
ReadNextTransactionId(). Normally OldestXmin should be so much smaller
than these parameters that it won't matter, but vacuum.c does contain
logic to do something reasonable if not.

> In that case,
> maybe slightly more differentiated names would be appropriate.


Got a suggestion? I think the names *should* be clearly related, but
as I said, I'm by no means wedded to these particular ones.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 05:32 AM
Tom Lane
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> I think a global critical_age parameter is just fine. If you have one
> huge table that takes a long time to vacuum, just adjust critical_age so
> that there's enough time for the huge table vacuum to finish before
> wrap-around. That means that other smaller tables are vacuumed more
> frequently than would otherwise be necessary, but that's not a big deal
> if the other tables really are much smaller.


Well, that's what you have to do right now, but as soon as we support
multiple autovac processes it'll be useful to do the other; so I figured
we might as well add the pg_autovacuum column while we're at it.

>> pg_autovacuum.freeze_distance: per-table vacuum_freeze_distance setting
>> for autovacuum to use.


> Shouldn't this be used for manual vacuums as well?


I thought about that but it seemed pretty dubious --- manual vacuums
don't look at pg_autovacuum for anything else, eg not for the cost
variables.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 05:32 AM
Tom Lane
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

I wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> Perhaps you could edit the above if needed?


> <---- Xids older than freeze_distance will be frozen whenever
> next visited by VACUUM, but there is no forcing function
> until they exceed freeze_limit


>> In that case,
>> maybe slightly more differentiated names would be appropriate.


> Got a suggestion? I think the names *should* be clearly related, but
> as I said, I'm by no means wedded to these particular ones.


After re-reading the above, it strikes me that maybe names based around
"freeze_min" and "freeze_max" would be useful?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 05:32 AM
Simon Riggs
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

On Sun, 2006-11-05 at 12:01 -0500, Tom Lane wrote:

> After re-reading the above, it strikes me that maybe names based around
> "freeze_min" and "freeze_max" would be useful?


Works for me. They are clearly related, yet different and allow a
straightforward explanation of their need and use.

e.g.

vacuum_freeze_min The latest TransactionId that will be "frozen" during
a VACUUM is calculated by CurrentTransactionId - vacuum_freeze_min.

vacuum_freeze_max
The maximum age, calculated as distance from CurrentTransactionId, that
will be allowed before a autovacuum will be forced for that database
object.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-12-2008, 05:32 AM
Tom Lane
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Sun, 2006-11-05 at 12:01 -0500, Tom Lane wrote:
>> After re-reading the above, it strikes me that maybe names based around
>> "freeze_min" and "freeze_max" would be useful?


> Works for me. They are clearly related, yet different and allow a
> straightforward explanation of their need and use.


Sold, I'll rename them before committing the patch.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-12-2008, 05:32 AM
Gregory Stark
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters to control freezing

"Simon Riggs" <simon@2ndquadrant.com> writes:

> vacuum_freeze_min The latest TransactionId that will be "frozen" during
> a VACUUM is calculated by CurrentTransactionId - vacuum_freeze_min.
>
> vacuum_freeze_max
> The maximum age, calculated as distance from CurrentTransactionId, that
> will be allowed before a autovacuum will be forced for that database
> object.


I think it's clearer if "min" and "max" are considered adjectives and always
have a subject they modify. Otherwise it's unclear what they refer to.

So "vacuum_freeze_min_age" and "vacuum_freeze_max_age" instead.

That way it's unambiguous which is which. Ie, that it's minimum and maximum
age and not minimum and maximum transaction id which would be the other way
around.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-12-2008, 05:32 AM
Simon Riggs
 
Posts: n/a
Default Re: Proposal: vacuum and autovacuum parameters tocontrol freezing

On Sun, 2006-11-05 at 14:47 -0500, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>
> > vacuum_freeze_min The latest TransactionId that will be "frozen" during
> > a VACUUM is calculated by CurrentTransactionId - vacuum_freeze_min.
> >
> > vacuum_freeze_max
> > The maximum age, calculated as distance from CurrentTransactionId, that
> > will be allowed before a autovacuum will be forced for that database
> > object.

>
> I think it's clearer if "min" and "max" are considered adjectives and always
> have a subject they modify. Otherwise it's unclear what they refer to.
>
> So "vacuum_freeze_min_age" and "vacuum_freeze_max_age" instead.
>
> That way it's unambiguous which is which. Ie, that it's minimum and maximum
> age and not minimum and maximum transaction id which would be the other way
> around.


Sounds logical. Hadn't seen that you could take it both ways.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:35 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453