Unix Technical Forum

2PC-induced lockup

This is a discussion on 2PC-induced lockup within the pgsql Hackers forums, part of the PostgreSQL category; --> On Tue, 2007-07-10 at 10:41 -0400, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > The following command ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #21 (permalink)  
Old 04-15-2008, 08:57 PM
Simon Riggs
 
Posts: n/a
Default Re: 2PC-induced lockup

On Tue, 2007-07-10 at 10:41 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > The following command sequence appears to lock up the database system:
> > BEGIN;
> > LOCK pg_authid;
> > PREPARE TRANSACTION 'foo';
> > \q

>
> > After that you can't connect anymore, even in single-user mode. The
> > only way I could find is to clear out the pg_twophase directory, but
> > I'm not sure whether it is safe to do that.

>
> > Should this be prevented somehow, and is there a better recovery path?

>
> AFAICS this is just one of many ways in which a superuser can shoot
> himself in the foot; I'm not eager to try to prevent it.
>
> Right offhand, clearing pg_twophase while the system is stopped should
> be safe enough.


Safe from the perspective of the rest of the system. The prepared
transactions will clearly be lost and that might be worth millions.

I'm concerned that this advice will lead to clearing pg_twophase every
time that the system won't start properly.

I'd be much more comfortable if LOCK TABLE caused a message to the log
if it is executed on any system table. I can't really see a reason to
allow a user the ability to explicitly lock out a system table and would
prefer if that were banned completely. It's DoS if nothing else. A
simple check on LOCK TABLE won't cost much in the normal execution path.

There seems like a number of ways that unresolved prepared transactions
can cause problems. We really need to have startup mention how many
prepared transactions there are, so we have some chance of understanding
and resolving potential problems. Without such a message we might well
experience downtimes of many hours before somebody thinks to check
pg_twophase and that runs against our goal of higher availability.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #22 (permalink)  
Old 04-15-2008, 08:57 PM
Simon Riggs
 
Posts: n/a
Default Re: 2PC-induced lockup

On Wed, 2007-07-11 at 22:33 +0100, Heikki Linnakangas wrote:
> Andrew Sullivan wrote:
> > On Wed, Jul 11, 2007 at 09:26:34PM +0100, Heikki Linnakangas wrote:
> >> The xid is encoded in the filename. If you can't start up the database
> >> and look at pg_locks, you can't do much other than guess.

> >
> > So then in this sort of case, it isn't _really_ safe to delete those
> > files, because the commitment you made before crash when you accepted
> > a PREPARE TRANSACTION is going to be gone, which violates the 2PC
> > rules.

>
> Yes, though if neither you nor the transaction manager can connect, you
> don't have much choice.


True, but I'm worried that this discussion will lead, via Google, to the
impression that if you are having connection problems the best thing to
do is to delete everything in pg_twophase. There are hundreds of other
issues that might prevent connection and it would require significant
expertise to isolate this as the error. I would prefer to explicitly
avoid this kind of error, so that we can return to the idea that
removing pg_twophase is never a requirement.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #23 (permalink)  
Old 04-15-2008, 08:57 PM
Tom Lane
 
Posts: n/a
Default Re: 2PC-induced lockup

"Simon Riggs" <simon@2ndquadrant.com> writes:
> I'd be much more comfortable if LOCK TABLE caused a message to the log
> if it is executed on any system table.


Enabled by "set training_wheels = on", perhaps?

This is really pretty silly to be getting worked up about. The command
in question wouldn't have been allowed at all except to a superuser,
and there are plenty of ways to catastrophically destroy your database
when you are superuser; most of which we will never consider blocking
for the same reasons that Unix systems have never tried to block root
from doing "rm -rf /". I'd say the real design flaw in Peter's
referenced application is that they're running it as superuser.

> There seems like a number of ways that unresolved prepared transactions
> can cause problems. We really need to have startup mention how many
> prepared transactions there are, so we have some chance of understanding
> and resolving potential problems.


While I have no particular objection to such a log entry, I doubt it
will fix anything; how many people will really think to look in the
postmaster log? In any case, most of the problems I've personally run
into with prepared xacts have nothing to do with crashes and so nothing
like that would ever get emitted. (The typical way I get bitten is to
interrupt the regression tests because I changed my mind about
something, and manage to do this just while the prepared_xacts test has
some open prepared xacts.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #24 (permalink)  
Old 04-15-2008, 08:58 PM
Florian G. Pflug
 
Posts: n/a
Default Re: 2PC-induced lockup

Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> I'd be much more comfortable if LOCK TABLE caused a message to the log
>> if it is executed on any system table.

>
> Enabled by "set training_wheels = on", perhaps?
>
> This is really pretty silly to be getting worked up about. The command
> in question wouldn't have been allowed at all except to a superuser,
> and there are plenty of ways to catastrophically destroy your database
> when you are superuser; most of which we will never consider blocking
> for the same reasons that Unix systems have never tried to block root
> from doing "rm -rf /". I'd say the real design flaw in Peter's
> referenced application is that they're running it as superuser.


Yeah.. though "lock pg_auth; prepare" looks quite innocent, much more
than say "delete from pg_database" or "rm -rf whatever".
At least to the untrained eye.

I fully agree that that special-casing this particular way to shoot yourself
in the foot is not worth it - but maybe pursuing a more general solution
would be worthwile? Maybe superuser-connections could e.g. ignore
any errors that occur while reading a system table, together with
a big, fat warning, but still allow a logon? That of course depends
on the assumption that basic authentication is possible using just
the information from the flatfiles and pg_hba.conf, which I'm not
sure about.

greetings, Florian Pflug


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #25 (permalink)  
Old 04-15-2008, 08:58 PM
Simon Riggs
 
Posts: n/a
Default Re: 2PC-induced lockup

On Wed, 2007-07-11 at 18:09 -0400, Tom Lane wrote:

> > There seems like a number of ways that unresolved prepared transactions
> > can cause problems. We really need to have startup mention how many
> > prepared transactions there are, so we have some chance of understanding
> > and resolving potential problems.

>
> While I have no particular objection to such a log entry, I doubt it
> will fix anything; how many people will really think to look in the
> postmaster log?


Even if it were just you and me. From my perspective, thats enough.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #26 (permalink)  
Old 04-15-2008, 08:58 PM
Robert Treat
 
Posts: n/a
Default Re: 2PC-induced lockup

On Thursday 12 July 2007 04:19, Simon Riggs wrote:
> On Wed, 2007-07-11 at 18:09 -0400, Tom Lane wrote:
> > > There seems like a number of ways that unresolved prepared transactions
> > > can cause problems. We really need to have startup mention how many
> > > prepared transactions there are, so we have some chance of
> > > understanding and resolving potential problems.

> >
> > While I have no particular objection to such a log entry, I doubt it
> > will fix anything; how many people will really think to look in the
> > postmaster log?

>
> Even if it were just you and me. From my perspective, thats enough.


Well, Tom doesn't look at the log files, so I guess your idea is shot...

:-)
--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #27 (permalink)  
Old 04-15-2008, 08:58 PM
Michael Paesold
 
Posts: n/a
Default Re: 2PC-induced lockup

Simon Riggs wrote:
> On Wed, 2007-07-11 at 18:09 -0400, Tom Lane wrote:
>
>>> There seems like a number of ways that unresolved prepared transactions
>>> can cause problems. We really need to have startup mention how many
>>> prepared transactions there are, so we have some chance of understanding
>>> and resolving potential problems.

>> While I have no particular objection to such a log entry, I doubt it
>> will fix anything; how many people will really think to look in the
>> postmaster log?

>
> Even if it were just you and me. From my perspective, thats enough.


At least, such a message seems much more useful than the list of
historic startup messages that were removed recently. Just my two ¤-cents.

Best Regards
Michael Paesold


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #28 (permalink)  
Old 04-15-2008, 08:58 PM
Andrew Sullivan
 
Posts: n/a
Default Re: 2PC-induced lockup

On Wed, Jul 11, 2007 at 10:47:25PM +0100, Simon Riggs wrote:
> expertise to isolate this as the error. I would prefer to explicitly
> avoid this kind of error, so that we can return to the idea that
> removing pg_twophase is never a requirement.


This was pretty much my point. It's one thing to say, "If you are
completely hosed, you will lose some data." But 2PC is making some
pretty strong promises, and I sort of hate it that it's not real hard
to break things in such a way that those promises have to be broken.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #29 (permalink)  
Old 04-15-2008, 08:58 PM
Andrew Sullivan
 
Posts: n/a
Default Re: 2PC-induced lockup

On Wed, Jul 11, 2007 at 06:09:55PM -0400, Tom Lane wrote:
> This is really pretty silly to be getting worked up about. The command
> in question wouldn't have been allowed at all except to a superuser,
> and there are plenty of ways to catastrophically destroy your database
> when you are superuser; most of which we will never consider blocking


I think the problem is it is nowise obvious that LOCK [sometable];
PREPARE TRANSACTION is, as superuser, dangerous in the way that DROP
CASCADE is. Or rm -rf, for that matter. It seems to me that at the
very least, some pretty extensive documentation of the perils of
running 2PC as a superuser ought to be available, and if it's there,
it didn't leap out at me.

But the other problem I see here is that the solution hits more than
just the problematic state. If we have bad pages on disk, for
instance, we zero pages; we don't drop the table. Similarly, it
seems that all that's necessary here is an external tool to grovel
through the prepared transaction files and somehow figure out what
locks, if any, they entail. Wouldn't that be enough to then allow
you to shoot the relevant file, thereby losing only that prepared
transaction instead of all of them? (I'm not suggesting that no data
must ever be lost in this case; just that we should lose the minimum
necessary to make the system work.)

A

--
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #30 (permalink)  
Old 04-15-2008, 08:59 PM
Tom Lane
 
Posts: n/a
Default Re: 2PC-induced lockup

Andrew Sullivan <ajs@crankycanuck.ca> writes:
> But the other problem I see here is that the solution hits more than
> just the problematic state. If we have bad pages on disk, for
> instance, we zero pages; we don't drop the table. Similarly, it
> seems that all that's necessary here is an external tool to grovel
> through the prepared transaction files and somehow figure out what
> locks, if any, they entail.


Something equivalent to pg_filedump or xlogdump for 2PC state files
would be worth having for development/debugging purposes, quite aside
from any possible usefulness for getting out of this type of
predicament. Any volunteers out there to write/maintain such a utility?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:36 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com