Unix Technical Forum

BUG #1512: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints

#1
04-10-2008, 09:37 AM
Stephen Clouse


The following bug has been logged online:

Bug reference: 1512
Logged by: Stephen Clouse
Email address: stephenc@theiqgroup.com
PostgreSQL version: 8.0.1
Operating system: Fedora Core 3
Description: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints
Details:

You need two psql sessions going to reproduce this. Start with this very
simple schema:

CREATE TABLE foo (bar NUMERIC);
INSERT INTO foo VALUES (1);

Now, start session 1:

> BEGIN;
> SELECT * FROM foo WHERE bar = 1 FOR UPDATE;


 bar
-----
   1
(1 row)

Switch to session 2:

> BEGIN;
> SAVEPOINT foo;
> SELECT * FROM foo WHERE bar = 1 FOR UPDATE;

(Abort this with Ctrl-C)
Cancel request sent
ERROR: canceling query due to user request
> ROLLBACK TO SAVEPOINT foo;


Back to session 1:

> ROLLBACK;

Session 1's backend will now die horribly and trigger a server reset.

Log shows the following as the cause of the server abort:

TRAP: FailedAssertion("!(SHMQueueEmpty(&(lock->procLocks)))", File: "lock.c", Line: 1537)


I have not achieved guru status with the PostgreSQL code yet, otherwise I'd
send a patch along with this.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#2
04-10-2008, 09:37 AM
Michael Fuhr

On Tue, Mar 01, 2005 at 02:04:30AM +0000, Stephen Clouse wrote:

> TRAP: FailedAssertion("!(SHMQueueEmpty(&(lock->procLocks)))", File: "lock.c", Line: 1537)


I can duplicate this on FreeBSD 4.11-STABLE with the latest code
from REL8_0_STABLE (--enable-cassert required during build). Here's
a stack trace:

#0 0x284997ac in kill () from /usr/lib/libc.so.4
#1 0x284db0a6 in abort () from /usr/lib/libc.so.4
#2 0x81ed93b in ExceptionalCondition () at assert.c:51
#3 0x8183191 in LockReleaseAll (lockmethodid=1, allxids=1 '\001') at lock.c:1537
#4 0x8183dbd in ProcReleaseLocks (isCommit=0) at proc.c:439
#5 0x81ffe69 in ResourceOwnerReleaseInternal (owner=0x82fefbc, phase=RESOURCE_RELEASE_LOCKS, isCommit=0 '\000', isTopLevel=1 '\001') at resowner.c:252
#6 0x81ffd12 in ResourceOwnerRelease (owner=0x82fefbc, phase=RESOURCE_RELEASE_LOCKS, isCommit=0 '\000', isTopLevel=1 '\001') at resowner.c:160
#7 0x809bedd in AbortTransaction () at xact.c:1694
#8 0x809c141 in CommitTransactionCommand () at xact.c:1906
#9 0x818ae7e in finish_xact_command () at postgres.c:1843
#10 0x8189da4 in exec_simple_query (query_string=0x836d01c "ROLLBACK;") at postgres.c:965
#11 0x818c2ab in PostgresMain (argc=4, argv=0x82fd274, username=0x82fd24c "mfuhr") at postgres.c:3007
#12 0x8163f41 in BackendRun (port=0x8313600) at postmaster.c:2816
#13 0x8163742 in BackendStartup (port=0x8313600) at postmaster.c:2452
#14 0x8161c9e in ServerLoop () at postmaster.c:1199
#15 0x81616a6 in PostmasterMain (argc=3, argv=0xbfbffc88) at postmaster.c:918
#16 0x8132b15 in main (argc=3, argv=0xbfbffc88) at main.c:268

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

#3
04-10-2008, 09:37 AM
Tom Lane

"Stephen Clouse" <stephenc@theiqgroup.com> writes:
> Description: Assertion failure (lock.c:1537) with SELECT FOR UPDATE


It looks to me like the problem is that RemoveFromWaitQueue() is too
lazy. Its comments say

* NB: this does not remove the process' proclock object, nor the lock object,
* even though their counts might now have gone to zero. That will happen
* during a subsequent LockReleaseAll call, which we expect will happen
* during transaction cleanup. (Removal of a proc from its wait queue by
* this routine can only happen if we are aborting the transaction.)

but of course LockReleaseAll is not called until ROLLBACK. I think the
scenario is:

* Query cancel in session 2 kicks the session off session 1's
transaction ID lock, but because of above it leaves a proclock
entry with count zero attached to the lock.

* Rollback in session 1 tries to remove the transaction ID lock,
and gets unhappy because there is still a proclock attached to it.
(A commit in session 1 fails the same way.)
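
To make the scenario above concrete, here is a toy Python model of it. This
is only a sketch, not PostgreSQL source: the Lock class and the functions
acquire, lazy_remove_from_wait_queue and release_by_owner are names invented
for this illustration, and the real lock manager tracks considerably more
state than a dictionary of hold counts.

```python
# Toy model of the failure scenario; not PostgreSQL code.
# Every name here is hypothetical and exists only for illustration.


class Lock:
    """A lockable object plus the per-process entries attached to it."""

    def __init__(self, tag):
        self.tag = tag
        self.proclocks = {}  # pid -> hold count (stands in for lock->procLocks)
        self.waiters = []    # pids queued behind the current holder


def acquire(lock, pid):
    """Grant the lock if no other process holds it, else queue the caller.

    Returns True if granted, False if the caller has to wait.
    """
    holders = [p for p, n in lock.proclocks.items() if n > 0 and p != pid]
    lock.proclocks.setdefault(pid, 0)  # entry exists even while waiting
    if holders:
        lock.waiters.append(pid)
        return False
    lock.proclocks[pid] += 1
    return True


def lazy_remove_from_wait_queue(lock, pid):
    """Dequeue a cancelled waiter but leave its zero-count entry attached."""
    lock.waiters.remove(pid)


def release_by_owner(lock, pid):
    """Owner drops the lock and expects nothing else to remain attached."""
    del lock.proclocks[pid]
    if lock.proclocks:
        raise AssertionError("stale proclock still attached to lock")


def reproduce():
    """Replay the two-session sequence; return the error raised, if any."""
    lock = Lock("transaction ID lock of session 1")
    acquire(lock, 1)                      # session 1 holds the lock
    acquire(lock, 2)                      # session 2 blocks behind it
    lazy_remove_from_wait_queue(lock, 2)  # query cancel in session 2
    try:
        release_by_owner(lock, 1)         # ROLLBACK in session 1
    except AssertionError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(reproduce())
```

In this model the leftover entry for session 2 has a count of zero, and it is
that entry, not any lock actually held, that trips the check when session 1
releases.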

In reality this code has been broken right along, but until 8.0 there
was only a very narrow window for failure --- session 1 would have to
try to release the lock between RemoveFromWaitQueue and LockReleaseAll
in session 2's transaction abort sequence.

ISTM we have to fix RemoveFromWaitQueue to remove the proclock object
immediately if its count has gone to zero. It should be impossible
for the lock's count to have gone to zero (that would imply no one
else holds the lock, so we couldn't be waiting on it) so an Assert
is sufficient for that part.
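
The proposed change can be sketched in a toy Python model. This is an
illustration under simplifying assumptions, not the patch itself: the Lock
class and the functions wait_for, eager_remove_from_wait_queue and
release_by_owner are invented names, and hold counts in a dictionary stand
in for the real proclock bookkeeping.

```python
# Toy model of the proposed fix; not the applied patch.
# Every name here is hypothetical and exists only for illustration.


class Lock:
    """A lockable object plus the per-process entries attached to it."""

    def __init__(self):
        self.proclocks = {}  # pid -> hold count
        self.waiters = []    # pids queued behind the current holder


def wait_for(lock, pid):
    """Queue pid behind the holder, creating its zero-count entry."""
    lock.proclocks.setdefault(pid, 0)
    lock.waiters.append(pid)


def eager_remove_from_wait_queue(lock, pid):
    """Dequeue pid and drop its entry immediately if it holds nothing."""
    lock.waiters.remove(pid)
    if lock.proclocks.get(pid) == 0:
        del lock.proclocks[pid]
    # Someone else must still hold the lock, otherwise pid would not have
    # been waiting on it, so a plain sanity check is enough here.
    if not any(count > 0 for count in lock.proclocks.values()):
        raise AssertionError("lock has no holder left")


def release_by_owner(lock, pid):
    """Owner drops the lock and expects nothing else to remain attached."""
    del lock.proclocks[pid]
    if lock.proclocks:
        raise AssertionError("stale proclock still attached to lock")


if __name__ == "__main__":
    lock = Lock()
    lock.proclocks[1] = 1                  # session 1 holds the lock
    wait_for(lock, 2)                      # session 2 blocks behind it
    eager_remove_from_wait_queue(lock, 2)  # query cancel in session 2
    release_by_owner(lock, 1)              # ROLLBACK in session 1 succeeds
    print(lock.proclocks)
```

With the waiter's entry removed at cancel time, the owner's later release
finds nothing left attached and the check passes.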

Comments?

regards, tom lane

#4
04-10-2008, 09:37 AM
Tom Lane

I wrote:
> ISTM we have to fix RemoveFromWaitQueue to remove the proclock object
> immediately if its count has gone to zero. It should be impossible
> for the lock's count to have gone to zero (that would imply no one
> else holds the lock, so we couldn't be waiting on it) so an Assert
> is sufficient for that part.


I've applied a patch along these lines; it seems to make the problem
go away.

regards, tom lane
