Unix Technical Forum

Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazyXID assignment

This is a discussion on Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazyXID assignment within the pgsql Hackers forums, part of the PostgreSQL category; --> Hi When we lazily assign XIDs, we gain another flag beside the existing MyXactMadeTempRelUpdates, MyXactMadeXLogEntry, MyLastRecPtr and smgr's pendingDeletes ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-15-2008, 09:42 PM
Florian G. Pflug
 
Posts: n/a
Default Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazyXID assignment

Hi

When we lazily assign XIDs, we gain another flag beside the existing
MyXactMadeTempRelUpdates, MyXactMadeXLogEntry, MyLastRecPtr and smgr's
pendingDeletes to tell what kind of actions a transaction performed. Adding
TransactionIsIsValid(GetCurrentTransactionIdIfAny( )) on top of that
makes things quite impenetrable - at least for me. So I'm trying to
wrap my head around that logic, and simplify it a bit if possible.
(Nowadays, async commit even adds a bit more complexity)

Currently, we write out a COMMIT record if a transaction either created
any transaction-controlled XLOG entries (MyLastRecPtr.xrecoff != 0), or
scheduled the deletion of files on commit. Afterwards, the xlog is flushed
to the end last record created by the session (ProcLastRecEnd) if the
transaction created any xlog record at all. If we either made
transaction-controlled XLOG entries, or temporary relation updates, we
update the XID status in the CLOG.

An ABORT record is currently created if a transaction either created
transaction-controlled XLOG entries or scheduled the deletion of files
on abort. If we schedules file deletions, we flush the XLOG up to the
ABORT record. If we either made transaction-controlled XLOG entries,
updated temporary relations, or scheduled deletions we update the XID
status in the CLOG.

For subtransaction commit, a COMMIT record is emitted if we either made
transaction-controlled XLOG entries, or updated temporary relations.
No XLOG flush is performed.

Subtransaction ABORTS are handled the same way as regular transaction
aborts.

For toplevel transaction commits, we defer flushing the xlog if
synchronous_commit = off, and we didn't schedule any file deletions.

Now, with lazy XID assignment I believe the following holds true
..) If we didn't assign a valid XID, we cannot have made transaction-controlled
XLOG entries (Can be checked by asserting that the current transaction id
is valid if XLOG_NO_TRAN isn't set in XLogInsert).
..) We cannot have scheduled files for deletion (on either COMMIT or ABORT)
if we don't have a valid XID, since any such deletion will happen together
with a catalog update. Note that this is already assumed to be true for
subtransactions, since they only call RecordSubTransaction{Commit|Abort}
if they have an XID assigned.

I propose to do the following in my lazy XID assignment patch - can
anyone see a hole in that?

..) Get rid of MyLastRecPtr and MyXactMadeTempRelUpdates. Those are
superseeded by TransactionIdIsValid(GetCurrentTransactionIdIfAny( )).
..) Get rid of MyXactMadeXLogEntry. Instead, just reset ProcLast
..) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting
a new toplevel transaction.

Transaction COMMIT:
Write an COMMIT record if and only if we have a valid XID.
Then, flush the XLOG to XactLastRecEnd if that is set, and
synchronous_commit=on.
Aferwards, update the CLOG if and only if we have a valid XID.

Transaction ABORT:
Write an ABORT record if and only if we have a valid XID.
Then, flush the XLOG to XactLastRecEnd if that is set, and
we scheduled on-abort deletions.
Update the CLOG if and only if we have a valid XID.

Subtransaction COMMIT:
Update the CLOG if and only if we have a valid XID.

Subtransaction ABORT:
Write an ABORT record if and only if we have a valid XID.
Then, flush the XLOG to XactLastRecEnd if that is set, and
we scheduled on-abort deletions.
Update the CLOG if and only if we have a valid XID.

I think we might go even further, and *never* flush the XLOG on abort,
since if we crash just before the abort won't log anything either. But
if we leak the leftover files in such a case, that's probably a bad idea.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-15-2008, 09:42 PM
Tom Lane
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment

"Florian G. Pflug" <fgp@phlo.org> writes:
> I propose to do the following in my lazy XID assignment patch - can
> anyone see a hole in that?


Cleaning up this area seems like a good idea. Just FYI, one reason why
there are so many LastRec pointer variables is that the WAL record
format used to include a back-link to the previous record of the same
transaction, so we needed to track that location. Since that's gone,
simplification is definitely possible. A lot of the other behavior
you're looking at "just grew" as incremental optimizations added over
time.

One comment is that at the time we make an entry into smgr's
pending-deletes list, I think we might not have acquired an XID yet
--- if I understand your patch correctly, a CREATE TABLE would acquire
an XID when it makes its first catalog insertion, and that happens
after creating the on-disk table file. So it seems like a good idea
for smgr itself to trigger acquisition of an XID before it makes a
pending-deletes entry. This ensures that you can't have a situation
where you have deletes to record and no XID; otherwise, an elog
between smgr insertion and catalog insertion would lead to just that.

> .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting
> a new toplevel transaction.


I'm not very happy with that name for the variable, because it looks
like it might refer to the last transaction-controlled record we
emitted, rather than the last record of any type. Don't have a really
good suggestion though --- CurXactLastRecEnd is the best I can do.

One thought here is that it's not clear that we really need a concept of
transaction-controlled vs not-transaction-controlled xlog records
anymore. In CVS HEAD, the *only* difference no_tran makes is whether
to set MyLastRecPtr, and you propose removing that variable. This
seems sane to me --- the reason for having the distinction at all was
Vadim's plan to implement transaction UNDO by scanning its xlog records
backwards, and that idea is as dead as a doornail. So we could simplify
matters conceptually if we got rid of any reference to such a
distinction.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-15-2008, 09:42 PM
Florian G. Pflug
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletesand lazy XID assignment

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> I propose to do the following in my lazy XID assignment patch - can
>> anyone see a hole in that?

>
> One comment is that at the time we make an entry into smgr's
> pending-deletes list, I think we might not have acquired an XID yet
> --- if I understand your patch correctly, a CREATE TABLE would acquire
> an XID when it makes its first catalog insertion, and that happens
> after creating the on-disk table file. So it seems like a good idea
> for smgr itself to trigger acquisition of an XID before it makes a
> pending-deletes entry. This ensures that you can't have a situation
> where you have deletes to record and no XID; otherwise, an elog
> between smgr insertion and catalog insertion would lead to just that.


I wonder a bit about the whole special-casing
of COMMITs/ABORTs with pending delete, though. A crash might always leave
stray file around, so there ought to be a way to clean them up anyway.
Still, for now I'll go with your suggestion, and force XID assignment
in the smgr.

>> .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting
>> a new toplevel transaction.

>
> I'm not very happy with that name for the variable, because it looks
> like it might refer to the last transaction-controlled record we
> emitted, rather than the last record of any type. Don't have a really
> good suggestion though --- CurXactLastRecEnd is the best I can do.


Hm.. don't have a good suggestion, either - the reason I want to rename
it is that ProcLastRecEnd doesn't sound like it's be reset at transaction
start.

> One thought here is that it's not clear that we really need a concept of
> transaction-controlled vs not-transaction-controlled xlog records
> anymore. In CVS HEAD, the *only* difference no_tran makes is whether
> to set MyLastRecPtr, and you propose removing that variable. This
> seems sane to me --- the reason for having the distinction at all was
> Vadim's plan to implement transaction UNDO by scanning its xlog records
> backwards, and that idea is as dead as a doornail. So we could simplify
> matters conceptually if we got rid of any reference to such a
> distinction.


I've thinking about keeping XLOG_NO_TRAN, and doing
if (!no_tran)
Assert(TransactionIdIsValid(GetCurrentTransactionI dIfAny())
in xlog.c as a safety measure. We can't make that assertion
unconditionally, I think, because nextval() won't force XID
assigment, but might do XLogInsert.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-15-2008, 09:42 PM
Tom Lane
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment

"Florian G. Pflug" <fgp@phlo.org> writes:
> Tom Lane wrote:
>> One thought here is that it's not clear that we really need a concept of
>> transaction-controlled vs not-transaction-controlled xlog records
>> anymore.


> I've thinking about keeping XLOG_NO_TRAN, and doing
> if (!no_tran)
> Assert(TransactionIdIsValid(GetCurrentTransactionI dIfAny())
> in xlog.c as a safety measure.


Why do you think this is a safety measure? All that it is checking
is whether the caller has preserved an entirely useless distinction.
The real correctness property is that you can't write your XID
into a heap tuple or XLOG record if you haven't acquired an XID,
but that seems nearly tautological.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-15-2008, 09:42 PM
Florian G. Pflug
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletesand lazy XID assignment

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Tom Lane wrote:
>>> One thought here is that it's not clear that we really need a concept of
>>> transaction-controlled vs not-transaction-controlled xlog records
>>> anymore.

>
>> I've thinking about keeping XLOG_NO_TRAN, and doing
>> if (!no_tran)
>> Assert(TransactionIdIsValid(GetCurrentTransactionI dIfAny())
>> in xlog.c as a safety measure.

>
> Why do you think this is a safety measure? All that it is checking
> is whether the caller has preserved an entirely useless distinction.
> The real correctness property is that you can't write your XID
> into a heap tuple or XLOG record if you haven't acquired an XID,
> but that seems nearly tautological.


I was confused. I wanted to protect against the case the an XID hits
the disk, but doesn't show up in any xl_xid field, and therefore might
be reused after crash recovery. But of course, to make that happen
you'd have to actually *store* the XID into the data you pass to
XLogInsert, which is kind of hard if you haven't asked for it first.

So, I now agree, XLOG_NO_TRAN should be buried.

greetings, Florian Pflug

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-15-2008, 09:43 PM
Simon Riggs
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes andlazy XID assignment

On Wed, 2007-08-29 at 19:32 +0200, Florian G. Pflug wrote:

> I propose to do the following in my lazy XID assignment patch


The lazy XID assignment seems to be the key to unlocking this whole
area.

> - can
> anyone see a hole in that?
>
> .) Get rid of MyLastRecPtr and MyXactMadeTempRelUpdates. Those are
> superseeded by TransactionIdIsValid(GetCurrentTransactionIdIfAny( )).
> .) Get rid of MyXactMadeXLogEntry. Instead, just reset ProcLast
> .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting
> a new toplevel transaction.


I followed you up to this point. Nothing bad springs immediately to
mind, but please can you explain the proposals rather than just assert
them and ask us to find the holes?

> I think we might go even further, and *never* flush the XLOG on abort,
> since if we crash just before the abort won't log anything either. But
> if we leak the leftover files in such a case, that's probably a bad idea.


That doesn't gain us much, yet we lose the ability to tell the
difference between an abort and a crash.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-15-2008, 09:43 PM
Florian G. Pflug
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletesand lazy XID assignment

Tom Lane wrote:
> One comment is that at the time we make an entry into smgr's
> pending-deletes list, I think we might not have acquired an XID yet
> --- if I understand your patch correctly, a CREATE TABLE would acquire
> an XID when it makes its first catalog insertion, and that happens
> after creating the on-disk table file. So it seems like a good idea
> for smgr itself to trigger acquisition of an XID before it makes a
> pending-deletes entry. This ensures that you can't have a situation
> where you have deletes to record and no XID; otherwise, an elog
> between smgr insertion and catalog insertion would lead to just that.


Hm.. I was just going to implement this, but I'm now wondering if
thats really worth it.

For smgrcreate, this would catch the following case:
..) CREATE something
..) smgrcreate: Creates file, and puts it onto the delete-on-abort
list
..) We elog() *before* acquiring an XID
..) RecordTransactionAbort or RecordSubTransactionAbort:
We don't write an ABORT record.
..) We crash *before* actually deleting the file

Compare the probability of that happening (The elog *and* the crash)
with the probability of
..) CREATE something
..) smgrcreate: Creates the file
..) We crash *before* we have to chance to commit or abort.

The window in which a crash causes us to leak the file seems to be much
wider in the second case, yet forcing XID assignment will not help to
preven it, unless I'm overlooking something.

In the smgrunlink case, there is no reason at all to force XID assignment,
because if we abort or crash, we don't want to unlink anyway, and if we
survive until we commit, we'll assign the XID during the inevitable catalog
update.

The only thing the forced XID assignment would buy is to be able to stick
if (TransactionIdIsValid(GetCurrentTransactionIdIfAny ()))
Assert(nrels == 0);
into the various Record{Sub|}Transction{Commit|Abort} functions

So unless I'm overlooking something, I believe for now it's best to ignore this
issued, and to do a proper fix in the long run that removes *all* possible
leakages.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-15-2008, 09:43 PM
Tom Lane
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment

"Florian G. Pflug" <fgp@phlo.org> writes:
>> One comment is that at the time we make an entry into smgr's
>> pending-deletes list, I think we might not have acquired an XID yet


> Hm.. I was just going to implement this, but I'm now wondering if
> thats really worth it.


Basically what you'd give up is the ability to Assert() that there are
no deletable files if there's no XID, which seems to me to be an
important cross-check ... although maybe making smgr do that turns
this "cross-check" into a tautology ... hmm. I guess the case that's
bothering me is where we reach commit with deletable files and no XID.
But that should probably be an error condition anyway, ie, we should
error out and turn it into an abort. On the abort side we'd consider
it OK to have files and no XID. Seems reasonable to me.

The only way we could make this more robust is if we could have
WAL-before-data rule for file *creation*, but I think that's not
possible given that we don't know what relfilenode number we will use
until we've successfully created a file. So there will always be
windows where a crash leaks unreferenced files. There's been some
debate about having crash recovery search for and delete such files, but
so far I've resisted it on the grounds that it sounds like data loss
waiting to happen --- someday it'll delete a file you wished it'd kept.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-15-2008, 09:43 PM
Florian G. Pflug
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletesand lazy XID assignment

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>>> One comment is that at the time we make an entry into smgr's
>>> pending-deletes list, I think we might not have acquired an XID yet

>
>> Hm.. I was just going to implement this, but I'm now wondering if
>> thats really worth it.

>
> Basically what you'd give up is the ability to Assert() that there are
> no deletable files if there's no XID, which seems to me to be an
> important cross-check ... although maybe making smgr do that turns
> this "cross-check" into a tautology ... hmm. I guess the case that's
> bothering me is where we reach commit with deletable files and no XID.
> But that should probably be an error condition anyway, ie, we should
> error out and turn it into an abort. On the abort side we'd consider
> it OK to have files and no XID. Seems reasonable to me.


I've done that now, and it turned out nicely. There is an Assertion
on "(nrels == 0) || xid assigned" in the COMMIT path, but
not in the ABORT path. Seems reasonable and safe.

And I'm quite tempted to not flush the XLOG at all during ABORT, and to
only force synchronous commits if one of the to-be-deleted files is
non-temporary. The last idea widens the leakage window quite a bit
though, so I maybe I should rather resist that temptation...

OTOH, it'd allow aynchronous commits for transactions that created
temporary tables.

> The only way we could make this more robust is if we could have
> WAL-before-data rule for file *creation*, but I think that's not
> possible given that we don't know what relfilenode number we will use
> until we've successfully created a file. So there will always be
> windows where a crash leaks unreferenced files. There's been some
> debate about having crash recovery search for and delete such files, but
> so far I've resisted it on the grounds that it sounds like data loss
> waiting to happen --- someday it'll delete a file you wished it'd kept.


It seems doable, but it's not pretty. One possible scheme would be to
emit a record *after* chosing a name but *before* creating the file,
and then a second record when the file is actually created successfully.

Then, during replay we could remember a list of xids and filenames,
and remove those files for which we either haven't seen a "created
successfully" record, or no COMMIT record for the creating xid.

With this scheme, I'd be natural to force XID assignment in smgrcreate,
because we'd actually depend on logging the xid there.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-15-2008, 09:43 PM
Tom Lane
 
Posts: n/a
Default Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment

"Florian G. Pflug" <fgp@phlo.org> writes:
> And I'm quite tempted to not flush the XLOG at all during ABORT, and to
> only force synchronous commits if one of the to-be-deleted files is
> non-temporary.


+1 on the first, but -1 on the second, because we'd have to track
whether deleted files are temp or not ... it's very unclear that it'd
be worth the trouble.

> OTOH, it'd allow aynchronous commits for transactions that created
> temporary tables.


It'd be for xacts that *dropped* temp tables, no? I'm not sure that
is a performance-critical path --- probably it more usually gets done
after the client's already disconnected.

>> The only way we could make this more robust is if we could have
>> WAL-before-data rule for file *creation*, but I think that's not
>> possible given that we don't know what relfilenode number we will use
>> until we've successfully created a file.


> It seems doable, but it's not pretty. One possible scheme would be to
> emit a record *after* chosing a name but *before* creating the file,


No, because the way you know the name is good is a successful
open(O_CREAT).

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:58 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com