Unix Technical Forum

WAL replay of truncate fails if the table was dropped

  #1 (permalink)  
Old 04-10-2008, 10:58 AM
Heikki Linnakangas
 
WAL replay of truncate fails if the table was dropped

mdtruncate throws an error if the relation file doesn't exist. However,
that's not an error condition if the relation was dropped later.
A non-existent file should be treated the same as an already truncated
file; as it stands, WAL replay fails and we end up with an unrecoverable
database.
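
As a rough illustration of the behaviour proposed here (not the attached patch, which is not reproduced in this thread), the idea can be modelled with plain files. The function name `truncate_blocks`, the `in_recovery` flag and the 8192-byte `BLOCK_SIZE` are assumptions made for this sketch only:

```python
import os

BLOCK_SIZE = 8192  # assumed page size, for illustration only


def truncate_blocks(path, nblocks, in_recovery=False):
    """Shrink the file at `path` to `nblocks` pages.

    Returns True if the file was shortened, False if there was nothing to do.
    """
    target = nblocks * BLOCK_SIZE
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        if in_recovery:
            # The relation was dropped later in the WAL stream: treat the
            # missing file like one that is already truncated.
            return False
        raise
    if size < target:
        if in_recovery:
            # A file shorter than expected is also tolerated during replay.
            return False
        raise ValueError(f"{path} has fewer than {nblocks} blocks")
    if size == target:
        return False
    os.truncate(path, target)
    return True
```

Outside recovery the missing file still raises, so only replay gets the extra slack.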

This bug seems to be present from 8.0 onwards.

Attached is a test case to reproduce it, along with a patch for CVS
HEAD, and an adapted version of the patch for 8.0-8.2.

Thanks to my colleague Dharmendra Goyal for finding this bug and
constructing an initial test case.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



  #2 (permalink)  
Old 04-10-2008, 10:58 AM
Tom Lane
 
Re: WAL replay of truncate fails if the table was dropped

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> mdtruncate throws an error if the relation file doesn't exist.


Interesting corner case. The proposed fix seems not very consistent
with the way we handle comparable cases elsewhere, though. In general,
md.c will cut some slack when InRecovery if a relation is shorter than
expected, but not if it's not there at all. (This is, indeed, what
justifies mdtruncate's response to file-too-short...) We handle
dropped files during recovery by forced smgrcreate() in places like
XLogOpenRelation. I'm inclined to think smgr_redo should force
smgrcreate() before trying to truncate.
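
The alternative suggested in this reply, forcing creation before the truncate, might look roughly like this in a toy model. `redo_truncate` is a hypothetical name and plain files stand in for the storage manager; this is a sketch of the idea, not PostgreSQL's smgr_redo:

```python
import os

BLOCK_SIZE = 8192  # assumed page size, for illustration only


def redo_truncate(path, nblocks):
    """Replay a truncate record, creating the file first if it is gone.

    Returns the file size in bytes after replay.
    """
    # O_CREAT plays the role of the forced create: a dropped relation
    # reappears as an empty file, so the truncate has something to act on.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        target = nblocks * BLOCK_SIZE
        if os.fstat(fd).st_size > target:
            os.ftruncate(fd, target)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

With this approach the empty file is left behind until the later drop record is replayed and removes it again.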

regards, tom lane

  #3 (permalink)  
Old 04-10-2008, 10:58 AM
Heikki Linnakangas
 
Re: WAL replay of truncate fails if the table was dropped

Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> mdtruncate throws an error if the relation file doesn't exist.
>
> Interesting corner case. The proposed fix seems not very consistent
> with the way we handle comparable cases elsewhere, though. In general,
> md.c will cut some slack when InRecovery if a relation is shorter than
> expected, but not if it's not there at all. (This is, indeed, what
> justifies mdtruncate's response to file-too-short...) We handle
> dropped files during recovery by forced smgrcreate() in places like
> XLogOpenRelation. I'm inclined to think smgr_redo should force
> smgrcreate() before trying to truncate.


I followed the example of the file-too-short case. Yeah, calling
smgrcreate would work and I can see the justification for that as well.

Interestingly, this bug isn't triggered unless there's an already empty
or uninitialized page at the end of the table. If vacuum removes the last
tuple from the page, that will be WAL-logged and replay of that calls
smgrcreate.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

  #4 (permalink)  
Old 04-10-2008, 10:58 AM
Tom Lane
 
Re: WAL replay of truncate fails if the table was dropped

Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Interestingly, this bug isn't triggered unless there's an already empty
> or uninitialized page at the end of the table. If vacuum removes the last
> tuple from the page, that will be WAL-logged and replay of that calls
> smgrcreate.


Yeah, I tried other ways to provoke the failure and came to the same
conclusion. The reproducer really is relying on the fact that vacuum's
PageInit of an uninitialized page doesn't get WAL-logged. Which is a
bit nervous-making. As far as I can think at the moment, it won't
provoke any problem because the first subsequent WAL-logged touch of
the page would be an INSERT with the INIT bit set; but it does mean
that a warm-standby slave would be out of sync with the master for an
indefinitely long period with respect to the on-disk contents of such a
page. Does that matter?

Note that we have to fix truncate replay anyway, since you could have
the same failure if a checkpoint happened just before an ordinary
vacuum's truncate. This PageInit behavior merely allows a simpler
reproducer script with no race condition involved.
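
The checkpoint scenario can be walked through with a toy replay loop. Everything here is an assumption made for illustration: the record tuples, the `replay` function and the dict standing in for the data directory are not PostgreSQL structures.

```python
class MissingRelation(Exception):
    """Raised when replay needs a relation file that is not on disk."""


def replay(records, disk, create_before_truncate):
    """Apply (kind, relation, nblocks) records to `disk`.

    `disk` maps relation name to its length in blocks.
    """
    for kind, rel, nblocks in records:
        if kind == "truncate":
            if rel not in disk:
                if not create_before_truncate:
                    # Strict behaviour: a missing file aborts recovery.
                    raise MissingRelation(rel)
                disk[rel] = 0  # forced create of an empty file
            disk[rel] = min(disk[rel], nblocks)
        elif kind == "drop":
            disk.pop(rel, None)
    return disk


# WAL written after the last checkpoint: vacuum truncates "t", then "t" is
# dropped. The crash happens after the drop removed the file, so replay
# starts with an empty disk.
wal_after_checkpoint = [("truncate", "t", 1), ("drop", "t", 0)]
```

Calling `replay(wal_after_checkpoint, {}, create_before_truncate=False)` raises `MissingRelation`, while passing `True` lets replay reach the drop record and finish with an empty disk.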

regards, tom lane

  #5 (permalink)  
Old 04-10-2008, 10:58 AM
Simon Riggs
 
Re: WAL replay of truncate fails if the table was dropped

On Fri, 2007-07-20 at 11:38 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
> > Interestingly, this bug isn't triggered unless there's an already empty
> > or uninitialized page at the end of the table. If vacuum removes the last
> > tuple from the page, that will be WAL-logged and replay of that calls
> > smgrcreate.
>
> Yeah, I tried other ways to provoke the failure and came to the same
> conclusion. The reproducer really is relying on the fact that vacuum's
> PageInit of an uninitialized page doesn't get WAL-logged. Which is a
> bit nervous-making. As far as I can think at the moment, it won't
> provoke any problem because the first subsequent WAL-logged touch of
> the page would be an INSERT with the INIT bit set; but it does mean
> that a warm-standby slave would be out of sync with the master for an
> indefinitely long period with respect to the on-disk contents of such a
> page. Does that matter?


If I understand this: the primary's page would be initialised yet the
standby's would remain uninitialised? I don't think that matters because
the actual data contents are still zero.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


  #6 (permalink)  
Old 04-10-2008, 10:58 AM
Tom Lane
 
Re: WAL replay of truncate fails if the table was dropped

"Simon Riggs" <simon@2ndquadrant.com> writes:
> If I understand this: the primary would be initialised yet the standby
> would remain uninitialised? I don't think that matters because the
> actual data contents are still zero.


From a logical perspective the page is "empty" either way. The only
behavioral difference I can think of is that the initialized page is a
candidate for insertion of new tuples, whereas on the slave it would not
be a candidate until after another VACUUM. So the histories would
diverge faster once the slave comes alive. As long as the slave is just
following WAL records and not making any decisions of its own, I can't
see a failure mode; but it looks like a potential weak spot for future
extensions (particularly, trying to allow slave servers to execute
queries).
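
The difference between the two page states can be pictured with a deliberately simplified model. The 4-byte header of two offsets, the helper names and the page size are assumptions for this sketch; the real page header has more fields.

```python
import struct

PAGE_SIZE = 8192                # assumed page size
HEADER = struct.Struct("<HH")   # toy header: (lower, upper) offsets


def new_page():
    """An uninitialized page: all zero bytes."""
    return bytearray(PAGE_SIZE)


def init_page(page):
    """Write the toy header so the page advertises its free space."""
    HEADER.pack_into(page, 0, HEADER.size, PAGE_SIZE)


def is_new(page):
    """In this model a page is uninitialized when its upper offset is zero."""
    _lower, upper = HEADER.unpack_from(page, 0)
    return upper == 0


def free_space(page):
    """Bytes between the two offsets; zero for an uninitialized page."""
    lower, upper = HEADER.unpack_from(page, 0)
    return upper - lower
```

In this model neither page holds any tuples, but only the initialized one reports free space, which mirrors the point that it alone is a candidate for new insertions.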

regards, tom lane
