Thread: elog(FATAL) vs shared memory (pgsql-hackers)
Jim Nasby wrote:
> On Apr 11, 2007, at 6:23 PM, Jim Nasby wrote:
>> FWIW, you might want to put some safeguards in there so that you don't
>> try to inadvertently kill the backend that's running that function...
>> unfortunately I don't think there's a built-in function to tell you
>> the PID of the backend you're connected to; if you're connecting via
>> TCP you could use inet_client_addr() and inet_client_port(), but that
>> won't work if you're using the socket to connect.
>
> *wipes egg off face*
>
> There is a pg_backend_pid() function, even if it's not documented with
> the other functions (it's in the stats function stuff for some reason).

eh. No worries - my safeguard is just a comment saying 'don't connect to the
same database you are killing the connections of' :-)

--
Stuart Bishop <stuart.bishop@canonical.com>
http://www.canonical.com/
Canonical Ltd.
http://www.ubuntu.com/

Where are we on this?

---------------------------------------------------------------------------

Tom Lane wrote:
> In this thread:
> http://archives.postgresql.org/pgsql...3/msg00145.php
> we eventually determined that the reported lockup had three components:
>
> (1) something (still not sure what --- Martin and Mark, I'd really like
> to know) was issuing random SIGTERMs to various postgres processes
> including autovacuum.
>
> (2) if a SIGTERM happens to arrive while btbulkdelete is running,
> the next CHECK_FOR_INTERRUPTS will do elog(FATAL), causing elog.c
> to do proc_exit(0), leaving the vacuum still recorded as active in
> the shared memory array maintained by _bt_start_vacuum/_bt_end_vacuum.
> The PG_TRY block in btbulkdelete doesn't get a chance to clean up.
>
> (3) eventually, either we try to re-vacuum the same index or
> accumulation of bogus active entries overflows the array.
> Either way, _bt_start_vacuum throws an error, which btbulkdelete
> PG_CATCHes, leading to _bt_end_vacuum trying to re-acquire the LWLock
> already taken by _bt_start_vacuum, meaning that the process hangs up.
> And then so does anything else that needs to take that LWLock...
>
> Point (3) is already fixed in CVS, but point (2) is a lot nastier.
> What it essentially says is that trying to clean up shared-memory
> state in a PG_TRY block is unsafe: you can't be certain you'll
> get to do it. Now this is not a big deal during normal SIGTERM or
> SIGQUIT database shutdown, because we're going to abandon the shared
> memory segment anyway. However, if we ever want to support individual
> session kill via SIGTERM, it's a problem. Even if we were not
> interested in someday considering that a supported feature, it seems
> that dealing with random SIGTERMs is needed for robustness in at least
> some environments.
>
> AFAICS, there are basically two ways we might try to approach this:
>
> Plan A: establish the rule that you mustn't try to clean up shared
> memory state in a PG_CATCH block. Anything you need to do like that
> has to be handled by an on_shmem_exit hook function, so it will be
> called during a FATAL exit. (Or maybe you can do it in PG_CATCH for
> normal ERROR cases, but you need a backing on_shmem_exit hook to
> clean up for FATAL.)
>
> Plan B: change the handling of FATAL errors so that they are thrown
> like normal errors, and the proc_exit call happens only when we get
> out to the outermost control level in postgres.c. This would mean
> that PG_CATCH blocks get a chance to clean up before the FATAL exit
> happens. The problem with that is that a non-cooperative PG_CATCH
> block might think it could "recover" from the error, and then the exit
> does not happen at all. We'd need a coding rule that PG_CATCH blocks
> *must* re-throw FATAL errors, which seems at least as ugly as Plan A.
> In particular, all three of the external-interpreter PLs are willing
> to return errors into the external interpreter, and AFAICS we'd be
> entirely at the mercy of the user-written Perl or Python or Tcl code
> whether it re-throws the error or not.
>
> So Plan B seems unacceptably fragile. Does anyone see a way to fix it,
> or perhaps a Plan C with a totally different idea? Plan A seems pretty
> ugly but it's the best I can come up with.
>
> regards, tom lane

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

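To make point (2) and Plan A above concrete, here is a minimal stand-alone sketch in plain C. It is not PostgreSQL code: setjmp/longjmp stands in for PG_TRY/PG_CATCH, a static counter stands in for the shared-memory array of active vacuums, and report(), bulk_delete(), leaked_entries() and the single exit_hook slot are all hypothetical names invented for this illustration. The "process exit" is simulated by a jump to an outer buffer so that the leftover state can be inspected afterwards.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

enum level { LVL_ERROR, LVL_FATAL };

/* Stand-in for the shared-memory array of active vacuums (assumption). */
static int active_vacuums = 0;

/* Innermost "try" target, and the target that models the process ending. */
static jmp_buf *try_target = NULL;
static jmp_buf process_end;

/* One exit-hook slot, loosely modelled on the idea of on_shmem_exit(). */
static void (*exit_hook)(void) = NULL;

/* Models _bt_end_vacuum: drop our entry from the "shared" state. */
static void end_vacuum(void)
{
    assert(active_vacuums > 0);
    active_vacuums--;
}

/*
 * Never returns. ERROR unwinds to the nearest handler. FATAL (or an
 * ERROR with no handler) runs the exit hook and then "exits" without
 * ever visiting the catch block.
 */
static void report(enum level lvl)
{
    if (lvl == LVL_ERROR && try_target != NULL)
        longjmp(*try_target, 1);
    if (exit_hook != NULL)
        exit_hook();
    longjmp(process_end, 1);
}

/* Models btbulkdelete: register, get interrupted, clean up on catch. */
static void bulk_delete(enum level lvl, bool use_hook)
{
    jmp_buf local;
    jmp_buf *saved = try_target;

    active_vacuums++;              /* like _bt_start_vacuum */
    if (use_hook)
        exit_hook = end_vacuum;    /* Plan A: backing hook for FATAL */

    if (setjmp(local) == 0) {
        try_target = &local;
        report(lvl);               /* the interrupt arrives mid-scan */
    }

    /* Catch path: only reached when report() unwound to us. */
    try_target = saved;
    exit_hook = NULL;              /* the hook must not run a second time */
    end_vacuum();
}

/* Runs one simulated backend and returns the entries it left behind. */
int leaked_entries(enum level lvl, bool use_hook)
{
    active_vacuums = 0;
    try_target = NULL;
    exit_hook = NULL;
    if (setjmp(process_end) == 0)
        bulk_delete(lvl, use_hook);
    return active_vacuums;
}
```

In this model, leaked_entries(LVL_ERROR, false) returns 0 because the catch block runs, leaked_entries(LVL_FATAL, false) returns 1 because the exit path bypasses the catch block, and leaked_entries(LVL_FATAL, true) returns 0 because the hook does the cleanup. It also shows why Plan A feels ugly: the same cleanup has to be wired up in two places, and the catch path has to disarm the hook.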
Bruce Momjian <bruce@momjian.us> writes:
> Where are we on this?

Still trying to think of a less messy solution...

>> What it essentially says is that trying to clean up shared-memory
>> state in a PG_TRY block is unsafe: you can't be certain you'll
>> get to do it.

regards, tom lane

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Where are we on this?
>
> Still trying to think of a less messy solution...

OK, put in the patches hold queue for 8.4.

---------------------------------------------------------------------------

> >> What it essentially says is that trying to clean up shared-memory
> >> state in a PG_TRY block is unsafe: you can't be certain you'll
> >> get to do it.
>
> regards, tom lane

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
|
|