Unix Technical Forum

Recovery failed on a backup with "lock AccessShareLock on object 16477/244169/0 is already held"

#1
John Smith, 07-01-2008, 07:09 AM

Hi,

I hit an issue running PG 8.2.3 with the continuous archiving feature
where I was unable to recover from the backup. I was wondering if
this may be related to bug #3245?

These are the steps that occurred before I saw this problem:

1. Prepare transaction.
2. A base backup of the database was taken to a warm standby system.
3. Commit prepared. The commit prepared never finished as it hit a PANIC:

2008-06-17 23:53:53.206 Local time zone must be set--see zic manual
page PANIC: failed to re-find shared lock object
2008-06-17 23:53:53.207 Local time zone must be set--see zic manual
page STATEMENT: commit prepared '148969' ;


I believe this panic is probably bug #3245 based on the description of
that bug - http://archives.postgresql.org/pgsql...4/msg00075.php

At this point I attempted to do a recovery using the continuous
archive backup on the warm standby system. Instead of recovering
correctly, it hit this FATAL error saying that an AccessShareLock was
already held.

2008-06-18 00:05:34.045 Local time zone must be set--see zic manual
page LOG: database system was interrupted at 2008-06-17 23:53:16
Local time zone must be set--see zic manual page
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual
page LOG: checkpoint record is at 70/E600DC18
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual
page LOG: redo record is at 70/E600DC18; undo record is at 0/0;
shutdown FALSE
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual
page LOG: next transaction ID: 0/1099178; next OID: 413234
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual
page LOG: next MultiXactId: 1; next MultiXactOffset: 0
2008-06-18 00:05:34.077 Local time zone must be set--see zic manual
page LOG: database system was not properly shut down; automatic
recovery in progress
2008-06-18 00:05:34.105 Local time zone must be set--see zic manual
page LOG: redo starts at 70/E600DC68
2008-06-18 00:05:34.106 Local time zone must be set--see zic manual
page LOG: could not open file "pg_xlog/0000000100000070000000E7" (log
file 112, segment 231): No such file or directory
2008-06-18 00:05:34.106 Local time zone must be set--see zic manual
page LOG: redo done at 70/E600DC68
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099169
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099156
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099157
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099161
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099164
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099162
2008-06-18 00:05:34.293 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099166
2008-06-18 00:05:34.294 Local time zone must be set--see zic manual
page LOG: recovering prepared transaction 1099131
2008-06-18 00:05:34.298 Local time zone must be set--see zic manual
page FATAL: lock AccessShareLock on object 16477/244169/0 is already
held
2008-06-18 00:05:34.299 Local time zone must be set--see zic manual
page LOG: startup process (PID 17377) exited with exit code 1
2008-06-18 00:05:34.299 Local time zone must be set--see zic manual
page LOG: aborting startup due to startup process failure
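
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original report) that pulls out the prepared transaction IDs replayed during recovery and the lock named in the FATAL line. The function name and the returned dictionary layout are hypothetical, and the regular expressions assume the message wording shown in the log above. The three numbers after "on object" are returned as-is, without interpretation.

```python
import re


def summarize_recovery_log(text: str) -> dict:
    """Extract prepared-transaction IDs and the FATAL lock from a startup log."""
    # The log lines in the post are hard-wrapped, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(text.split())

    xids = [
        int(x)
        for x in re.findall(r"recovering prepared transaction (\d+)", flat)
    ]

    lock = None
    m = re.search(
        r"FATAL: lock (\w+) on object (\d+)/(\d+)/(\d+) is already held", flat
    )
    if m:
        lock = {
            "mode": m.group(1),
            "fields": (int(m.group(2)), int(m.group(3)), int(m.group(4))),
        }

    return {"prepared_xids": xids, "fatal_lock": lock}
```

Feeding it the startup log above should list the eight "recovering prepared transaction" IDs in the order they appear, which makes it easier to see which transaction was being replayed last before the failure.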


Is this FATAL error seen on recovery a different bug or is it just a
direct result of bug #3245?

Unfortunately I do not have a way to deterministically reproduce this
problem but I have seen it 3 times so far.

thanks,

John

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2
Tom Lane, 07-01-2008, 07:09 AM

"John Smith" <sodgodofall@gmail.com> writes:
> 2008-06-17 23:53:53.206 Local time zone must be set--see zic manual
> page PANIC: failed to re-find shared lock object
> 2008-06-17 23:53:53.207 Local time zone must be set--see zic manual
> page STATEMENT: commit prepared '148969' ;


> I believe this panic is probably bug #3245 based on the description of
> that bug - http://archives.postgresql.org/pgsql...4/msg00075.php


Yeah, looks like it to me too.

> At this point I attempted to do a recovery using the continuous
> archive backup on the warm standby system. Instead of recovering
> correctly, it hit this FATAL error saying that an AccessShareLock was
> already held.
> 2008-06-18 00:05:34.298 Local time zone must be set--see zic manual
> page FATAL: lock AccessShareLock on object 16477/244169/0 is already
> held
> 2008-06-18 00:05:34.299 Local time zone must be set--see zic manual
> page LOG: startup process (PID 17377) exited with exit code 1
> 2008-06-18 00:05:34.299 Local time zone must be set--see zic manual
> page LOG: aborting startup due to startup process failure


> Is this FATAL error seen on recovery a different bug or is it just a
> direct result of bug #3245?


It probably is the same bug. The underlying cause of that bug is
explained here:
http://archives.postgresql.org/pgsql...4/msg00129.php
I think what you are seeing is just a variant case caused by the same
lock being written out to the twophase file twice. In any case there's
probably little point in digging further until you've updated to a
version with that fix --- if you still see the problem afterward,
we can look closer.

BTW, what's with the bizarre "Local time zone must be set--see zic
manual" where the timezone should be? Are you intentionally selecting
the "Factory" zone?

regards, tom lane

#3
John Smith, 07-01-2008, 07:09 AM

Thanks for the quick reply, Tom. I'll be updating my PG version to one
with a fix for bug #3245, so hopefully we won't see this anymore.

> BTW, what's with the bizarre "Local time zone must be set--see zic
> manual" where the timezone should be? Are you intentionally selecting
> the "Factory" zone?


I don't think I've put the correct timezone file in /etc/localtime, so
it is using some default file from the Gentoo install.

John



#4
Tom Lane, 07-01-2008, 07:09 AM

"John Smith" <sodgodofall@gmail.com> writes:
>> BTW, what's with the bizarre "Local time zone must be set--see zic
>> manual" where the timezone should be? Are you intentionally selecting
>> the "Factory" zone?


> I don't think I've put the correct timezone file in /etc/localtime, so
> it is using some default file from the Gentoo install.


Ah, yes, I was able to duplicate that behavior by overwriting
/etc/localtime with /usr/share/zoneinfo/Factory. I guess the Gentoo
folks failed in their intention to annoy you enough to make you set
the zone correctly ;-)

regards, tom lane
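
To check whether a machine is in the same state, a minimal Python sketch (an editorial addition, not from the thread) can compare /etc/localtime byte for byte against the Factory zone file. The two default paths are the ones mentioned in the post above; the function name is hypothetical, and the sketch assumes both paths exist as regular files or symlinks to them.

```python
import filecmp
import os


def uses_factory_zone(
    localtime: str = "/etc/localtime",
    factory: str = "/usr/share/zoneinfo/Factory",
) -> bool:
    """Return True if `localtime` has the same contents as the Factory zone file."""
    # If either file is missing, there is nothing to compare; report False.
    if not (os.path.isfile(localtime) and os.path.isfile(factory)):
        return False
    # shallow=False forces a content comparison rather than trusting
    # os.stat() signatures (size and modification time).
    return filecmp.cmp(localtime, factory, shallow=False)


if __name__ == "__main__":
    if uses_factory_zone():
        print("/etc/localtime matches the Factory zone; set a real time zone.")
    else:
        print("/etc/localtime does not match the Factory zone file.")
```

A False result only means the contents differ or one of the files is absent; it does not by itself confirm that the configured zone is the intended one.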
