Unix Technical Forum

Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

This is a discussion on Re: [WIP PATCH] Lazily assign xids for toplevel Transactions within the pgsql Hackers forums, part of the PostgreSQL category; --> I wrote: > "Florian G. Pflug" <fgp@phlo.org> writes: >> Yeah - I do not really like that dual-locking thing ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-15-2008, 09:39 PM
Tom Lane
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

I wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Yeah - I do not really like that dual-locking thing either. But it makes
>> prepared transaction handling much easier - if we were to only lock the
>> RID, we'd have to store the rid<->xid mapping for prepared transactions


> Hmmm .... that's a good point. Not sure how to include prepared xacts
> in the scheme.


After further thought I think we should avoid depending on PIDs as part
of a lock tag --- their behavior is too system-dependent, in particular
we have no right to assume that when a backend exits the same PID won't
be reassigned shortly after, possibly leading to confusion.

Instead, I suggest that we keep a session counter in shared memory and
have each backend assign itself a session ID at startup using that.
A 32-bit session ID in combination with a 32-bit locally assigned
transaction number should be sufficiently unique to identify a
transaction (prepared or otherwise) for the purposes of locking.
These "transient XIDs" only need to be unique as long as the transaction
exists plus shortly thereafter (to avoid race conditions when someone
waits for a transaction that actually terminated a moment before).
So wraparound of the counters isn't a problem, although we probably
want to reserve zero as an invalid value.

I think we only need a transient XID of this sort for top-level
transactions, not subtransactions.

I thought for awhile about how to avoid taking a lock on the regular XID
as well as the transient XID, but it seems pretty messy, particularly
if we don't want to assign transient XIDs to subtransactions. Probably
best not to go there, at least not in the first-cut patch.

To make CREATE INDEX CONCURRENTLY work, we'd need two things:

* GetLockConflicts would need to report the transient XIDs of the
conflicting xacts, not regular XIDs, since they might not have regular
XIDs. Then we'd wait on those locks instead of regular-XID locks.

* The second phase where we wait out transactions that can still see
old tuples doesn't work because such transactions won't necessarily
be listed in the snapshot. Instead, what we have to do is look
through the ProcArray for transactions whose advertised xmin is
less than the xmax of our reference snapshot. When we find one,
wait for it using its transient XID.

AFAICT, C.I.C. is currently the only place in the system where we
really need transient XIDs at all. Everyplace else that we need to
wait for a transaction, it's because we found its regular XID in
a tuple we want to lock or modify. So the whole thing is a bit
annoying. Maybe we could get rid of the extra overhead with some
shenanigans inside the lock manager, like not bothering to create
a data structure representing the holding of a transient-XID lock
until such time as C.I.C. actually tries to wait for it. But
again, that seems like a second-pass optimization.

Can anyone suggest better terminology than "transient XID"?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-15-2008, 09:40 PM
Florian G. Pflug
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

Tom Lane wrote:
> I wrote:
>> "Florian G. Pflug" <fgp@phlo.org> writes:
>>> Yeah - I do not really like that dual-locking thing either. But it makes
>>> prepared transaction handling much easier - if we were to only lock the
>>> RID, we'd have to store the rid<->xid mapping for prepared transactions

>
>> Hmmm .... that's a good point. Not sure how to include prepared xacts
>> in the scheme.

>
> After further thought I think we should avoid depending on PIDs as part
> of a lock tag --- their behavior is too system-dependent, in particular
> we have no right to assume that when a backend exits the same PID won't
> be reassigned shortly after, possibly leading to confusion.
>
> Instead, I suggest that we keep a session counter in shared memory and
> have each backend assign itself a session ID at startup using that.
> A 32-bit session ID in combination with a 32-bit locally assigned
> transaction number should be sufficiently unique to identify a
> transaction (prepared or otherwise) for the purposes of locking.
> These "transient XIDs" only need to be unique as long as the transaction
> exists plus shortly thereafter (to avoid race conditions when someone
> waits for a transaction that actually terminated a moment before).
> So wraparound of the counters isn't a problem, although we probably
> want to reserve zero as an invalid value.
>
> I think we only need a transient XID of this sort for top-level
> transactions, not subtransactions.


Sounds good, if we decide to go with the transient XID idea. So below
for an alternative that I just came up with.

> To make CREATE INDEX CONCURRENTLY work, we'd need two things:
>
> * GetLockConflicts would need to report the transient XIDs of the
> conflicting xacts, not regular XIDs, since they might not have regular
> XIDs. Then we'd wait on those locks instead of regular-XID locks.


Yes. This is exactly what my patch does today.

> * The second phase where we wait out transactions that can still see
> old tuples doesn't work because such transactions won't necessarily
> be listed in the snapshot. Instead, what we have to do is look
> through the ProcArray for transactions whose advertised xmin is
> less than the xmax of our reference snapshot. When we find one,
> wait for it using its transient XID.


> AFAICT, C.I.C. is currently the only place in the system where we
> really need transient XIDs at all. Everyplace else that we need to
> wait for a transaction, it's because we found its regular XID in
> a tuple we want to lock or modify. So the whole thing is a bit
> annoying. Maybe we could get rid of the extra overhead with some
> shenanigans inside the lock manager, like not bothering to create
> a data structure representing the holding of a transient-XID lock
> until such time as C.I.C. actually tries to wait for it. But
> again, that seems like a second-pass optimization.


I've given some thought to that. There are two distinct things we
need to be able to wait for

1) Until all current holders of a lock, grantmask conflicts with
a given locklevel have dropped their lock

2) Until all currently in-use snapshots have an xmin larger than
some given value (The xmax of the reference snapshot).

(1) Could be solved directly in the lock manager. We'd need some
mechanism to wake up a process whenever someone releases a
ceratin lock.

(2) Could be done by acquireing a ShareLock (with a new locktype
LOCKTYPE_XMIN) on the xmin of a transaction's serializable
snapshot when it's created.
The second waiting phase of concurrent index builds would then be
a) Find the oldest xmin in the ProcArray.
b) If that xmin is equal or greater than the xmax of our
reference snapshot, we're done.
c) Wait until the ExclusiveLock (for LOCKTYPE_TRANSACTION)
is released on that xmin. After that point, new transactions
will compute an xmin greater than the oldest one we found
in the ProcArray, because the limiting transactions has
exited, and because ReadNewTransactionId returns a value
greater than that xmin too (Otherwise, we'd have exited in (b)).
d) Wait for all current holders of LOCKTYPE_XMIN to release
their locks. (Using the machinery needed for (1)). No
new holders can show up, because new snapshots will computer
a larger xmin.
e) Goto a).

I could code (2), but I'd need help with (1) - The details of the locking
subsystems are still somewhat a mystery to me.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-15-2008, 09:40 PM
Tom Lane
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

"Florian G. Pflug" <fgp@phlo.org> writes:
> Sounds good, if we decide to go with the transient XID idea. So below
> for an alternative that I just came up with.


This proposal appears to require taking and releasing a brand-new lock
type every time a snapshot is made or destroyed. That is certainly not
going to be less overhead than the transient-XID scheme. At least in
READ COMMITTED mode, there are normally multiple snapshots taken per
transaction.

(Something worth noting here is that I expect soon, probably 8.4,
we will fix things so that what a backend advertises in MyProc->xmin
is the xmin of its oldest still-live snapshot. That means that xmin
will change intra-transaction in READ COMMITTED mode, and thus that
we would indeed need to take and release the sort of lock you are
suggesting each time.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-15-2008, 09:40 PM
Florian G. Pflug
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Sounds good, if we decide to go with the transient XID idea. So below
>> for an alternative that I just came up with.

>
> This proposal appears to require taking and releasing a brand-new lock
> type every time a snapshot is made or destroyed. That is certainly not
> going to be less overhead than the transient-XID scheme. At least in
> READ COMMITTED mode, there are normally multiple snapshots taken per
> transaction.


Only for the serializable shapsnot I'd have thought, making the overhead
bearable (it's is surely the oldest of all the xact's shapshots) but ...

> (Something worth noting here is that I expect soon, probably 8.4,
> we will fix things so that what a backend advertises in MyProc->xmin
> is the xmin of its oldest still-live snapshot. That means that xmin
> will change intra-transaction in READ COMMITTED mode, and thus that
> we would indeed need to take and release the sort of lock you are
> suggesting each time.)


.... with this in mind, my proposal looks pretty bad :-(

What do you think about solving the requirements of the *first* waiting
phase (Where we wait for current ShareLock holders) inside the lock manager?
The only real downside I can see is that I feel uneasy about messing with
that code... It seems to be subtle, and quick to anger ;-)

For the second phase, I see two options
.) Go forward with the transient XIDs / RIDs approach
.) Do something similar to the indcreatexid idea the HOT patch implements.
This essentially
puts the burden of deciding an index's usability on the using xact,
not on the one creating the index.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-15-2008, 09:40 PM
Tom Lane
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

"Florian G. Pflug" <fgp@phlo.org> writes:
> What do you think about solving the requirements of the *first* waiting
> phase (Where we wait for current ShareLock holders) inside the lock manager?
> The only real downside I can see is that I feel uneasy about messing with
> that code... It seems to be subtle, and quick to anger ;-)


It sounded pretty dubious to me. The problem is that we don't want to
wait until we could actually *get* the lock, we only want to wait out
the conflicting xacts that existed when we looked. This is important
because there might be a steady stream of new xacts acquiring
conflicting locks (ie, a steady stream of writers), and we don't want to
either block them, or have to hope for a window where there are none.
But the lock manager does not currently track who acquired a lock when,
and I think it would add a lot of usually-wasted overhead to do that.

I spent a fair amount of time yesterday trying to think of alternative
solutions, and didn't really think of much. The core reason why C.I.C.
is implemented the way it is is that it's entirely possible that there
will be a deadlock with some other process (ie, the other process is
old enough that we must wait for it, but it's blocked trying to acquire
some lock that conflicts with our ShareUpdateExclusiveLock). Thus,
waiting by trying to acquire XID locks is a good idea because it
automatically detects deadlock, and may even be able to escape the
deadlock by wait-queue rearrangement. (I'm not certain that the latter
is applicable in any situation C.I.C. would get into, but I'm not
certain it's not, either.) Schemes involving "sleep awhile and check
the ProcArray again" are right out because they fail to detect
deadlock. Everything else I could think of involved special new
lockmanager features that would have to still preserve the ability
to handle deadlocks, which didn't sound like something I want to
tackle for this.

So on the whole the extra transaction identifier seems to be the
way to go. I haven't looked at how that interacts with HOT though.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-15-2008, 09:40 PM
Florian G. Pflug
 
Posts: n/a
Default Re: [WIP PATCH] Lazily assign xids for toplevel Transactions

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> What do you think about solving the requirements of the *first* waiting
>> phase (Where we wait for current ShareLock holders) inside the lock manager?
>> The only real downside I can see is that I feel uneasy about messing with
>> that code... It seems to be subtle, and quick to anger ;-)

>
> It sounded pretty dubious to me. The problem is that we don't want to
> wait until we could actually *get* the lock, we only want to wait out
> the conflicting xacts that existed when we looked. This is important
> because there might be a steady stream of new xacts acquiring
> conflicting locks (ie, a steady stream of writers), and we don't want to
> either block them, or have to hope for a window where there are none.
> But the lock manager does not currently track who acquired a lock when,
> and I think it would add a lot of usually-wasted overhead to do that.
>
> I spent a fair amount of time yesterday trying to think of alternative
> solutions, and didn't really think of much. The core reason why C.I.C.
> is implemented the way it is is that it's entirely possible that there
> will be a deadlock with some other process (ie, the other process is
> old enough that we must wait for it, but it's blocked trying to acquire
> some lock that conflicts with our ShareUpdateExclusiveLock). Thus,
> waiting by trying to acquire XID locks is a good idea because it
> automatically detects deadlock, and may even be able to escape the
> deadlock by wait-queue rearrangement. (I'm not certain that the latter
> is applicable in any situation C.I.C. would get into, but I'm not
> certain it's not, either.) Schemes involving "sleep awhile and check
> the ProcArray again" are right out because they fail to detect
> deadlock. Everything else I could think of involved special new
> lockmanager features that would have to still preserve the ability
> to handle deadlocks, which didn't sound like something I want to
> tackle for this.
>
> So on the whole the extra transaction identifier seems to be the
> way to go. I haven't looked at how that interacts with HOT though.


Ok, I'll update the patch to use your global id plus local id
idea than, and remove TemporaryTransactionIds.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:54 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com