Unix Technical Forum

Not your father's question about deadlocks

This is a discussion on Not your father's question about deadlocks within the Pgsql General forums, part of the PostgreSQL category; --> Once upon a time, I put a question regarding deadlocks to the group, and Tom Lane immediately answered with ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql General

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-09-2008, 12:21 PM
Clarence Gardner
 
Posts: n/a
Default Not your father's question about deadlocks

Once upon a time, I put a question regarding deadlocks to the group,
and Tom Lane immediately answered with this:

>The guy waiting on the tuple-specific lock is second in
>line to actually mung the tuple. Whoever is first in line behind the
>current tenant will be blocked trying to acquire ShareLock on the
>current tenant's transaction ID.
>
>What you appear to have is a situation where two transactions are trying
>to lock or update the same two rows in different orders. Without a lot
>more info about your application logic, I couldn't guess why this seems
>to be associated with having more than two transaction interested in the
>same tuple.
>
>Note that the guy looking for ShareLock on the tuple is evidently either
>doing SELECT FOR SHARE on this tuple, or trying to install a new tuple
>referencing this one as a foreign key (which does SELECT FOR SHARE under
>the hood). But he's blocked by someone who's done either SELECT FOR
>UPDATE or an actual UPDATE on that tuple.


I'm still creeping up on the problem. I configured postgres to resolve
a deadlock only after a month, and now I seem to have a complete
snapshot of one of them. The details are below. The part that doesn't
fit with what Tom says is that there are only two transactions involved.
The first and third "guys" are the same guy!

The first one, I'll call it X71, is the one doing the insert. The
LedgerDetail table references various other tables; the one of interest
is the buys table, and it is inserting a record referencing buy #955. It
had done several other inserts before this, some of them also referencing
buy #955. (In fact, the immediately preceding statement was one such.)
This is evidenced by the RowShareLock that it holds on the buys table.

The other one, X78, is executing a stored procedure which has only one
reference to buys, this statement:
update buys set budget=budget+remainder where id=buynum;

So the timeline is this
X71 X78
----------------------- ----------------------
insert buy955 reference
insert buy955 reference
insert buy955 reference
update buy955 (blocks)
insert buy955 reference (blocks)

Here is the state of the system. I have edited it somewhat. The pg_locks
table actually shows five waiters; the three that I've deleted started
much later than when the deadlock happened, so I've deleted all the pg_locks
rows that belonged to their transactions. I've also deleted all pg_locks
rows for X71 and X78 that referred to relations that the other transaction
had no lock on. And I've deleted empty or constant columns, and the relation
oid column.
================================================== ====================
27478 | insert into LedgerDetail
(ledger,strategy,advId,campaignId,buyId,creativeId ,siteId,providerType,adspotId,checkbookId,count,dr ,cr,dr_usd,cr_usd,time)
values('imp',',61144,',533,676,955,4076,2,5,656,31 275,14,0,110936,0,142142,1163530800)
| 2006-11-14 19:22:43.760703+00 | 2006-11-14 19:22:29.342683+00
99056 | FETCH 1 FROM "PgSQL_0CDD618C"
| 2006-11-14 19:22:43.748025+00 | 2006-11-13 19:21:40.038948+00

2006-11-14 11:22:43 PST[99056]LOG: statement: DECLARE "PgSQL_0CDD618C" CURSOR FOR
select repayBuyBudget(955, 31275, fromusd(ecc2usd(7790), 'EUR'))

relation | locktype | page | tuple | transactionid | transaction | pid | mode | granted
---------------------------+---------------+-------+-------+---------------+--------------+-------+------------------+---------
buys | relation | | | | 101953371 | 27478 | AccessShareLock | t
buys | relation | | | | 101953371 | 27478 | RowShareLock | t
buys | tuple | 38 | 11 | | 101953371 | 27478 | ShareLock | f
buys_pkey | relation | | | | 101953371 | 27478 | AccessShareLock | t

| transactionid | | | 101953371 | 101954678 | 99056 | ShareLock | f
buys | relation | | | | 101954678 | 99056 | AccessShareLock | t
buys | relation | | | | 101954678 | 99056 | RowExclusiveLock | t
buys | tuple | 38 | 11 | | 101954678 | 99056 | ExclusiveLock | t
buys_pkey | relation | | | | 101954678 | 99056 | AccessShareLock | t
buys_pkey | relation | | | | 101954678 | 99056 | RowExclusiveLock | t
================================================== ====================

That scenario seems quite simple, but I can't reproduce the deadlock with
this seemingly-identical sequence. The "Sn" prompts show the sequence of
statements done from psql in two terminal sessions.
S1 => create table t1(f1 int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE
S2 => create table t2(f1 int references t1);
CREATE TABLE
S3 => insert into t1 values(1);
INSERT 0 1
S4 => begin;
BEGIN
S7 => update t1 set f1=1; (blocks)

S5 => begin;
BEGIN
S6 => insert into t2 values(1);
INSERT 0 1
S8 => insert into t2 values(1);
INSERT 0 1

S8 is the statement that seems analagous to the one that blocks in the
real system, but it doesn't here.

Apart from that test being different somehow that I can't figure out,
situation on the real system seems to suggest that I need to either
lock all the reference tables at the beginning of X71 (uncomfortable),
or have it commit after each insert (unacceptable).

Thanks in advance for your help.


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-09-2008, 12:24 PM
Tom Lane
 
Posts: n/a
Default Re: [HACKERS] Not your father's question about deadlocks

"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> On 11/17/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> we need a special case when we are already a member of the MultiXact:
>> fall through without trying to reacquire the tuple lock.


> Small implementation detail: Also keep a count of how many times the same
> session requested the same lock, and do not release the lock until he
> requests same number of releases.


No need for that, because there isn't any heap_unlock_tuple.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 12:50 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com