Re: [GENERAL] Shutting down a warm standby database in 8.2beta3 (pgsql Hackers forum, PostgreSQL category)
Stephen Harris <lists@spuddy.org> writes:
> Doing a shutdown "immediate" isn't too clever because it actually leaves
> the recovery threads running
>
> LOG: restored log file "00000001000000010000003E" from archive
> LOG: received immediate shutdown request
> LOG: restored log file "00000001000000010000003F" from archive

Hm, that should work --- AFAICS the startup process should abort on
SIGQUIT the same as any regular backend.

[ thinks... ] Ah-hah, "man system(3)" tells the tale:

    system() ignores the SIGINT and SIGQUIT signals, and blocks the
    SIGCHLD signal, while waiting for the command to terminate. If this
    might cause the application to miss a signal that would have killed
    it, the application should examine the return value from system() and
    take whatever action is appropriate to the application if the command
    terminated due to receipt of a signal.

So the SIGQUIT went to the recovery script command and was missed by the
startup process. It looks to me like your script actually ignored the
signal, which you'll need to fix, but it also looks like we are not
checking for these cases in RestoreArchivedFile(), which we'd better fix.
As the code stands, if the recovery script is killed by a signal, we'd
take that as normal termination of the recovery and proceed to come up,
which is definitely the Wrong Thing.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
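The man page advice quoted above, to examine the return value and react when the command died from a signal, has a rough shell-level analogue. The sketch below is not PostgreSQL code. `run_and_classify` is a hypothetical helper, and it assumes the common Bourne-shell convention that a command killed by signal N is reported as exit status 128+N. In C, the corresponding check on the value returned by system() would use the WIFSIGNALED macro.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): run a command and say how it ended.
# Assumes the usual sh convention that death by signal N shows up as 128+N.
run_and_classify() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # Most likely terminated by a signal rather than a normal exit.
        echo "killed by signal $((status - 128))"
    elif [ "$status" -ne 0 ]; then
        echo "failed with exit code $status"
    else
        echo "ok"
    fi
}

# Usage: the child shell sends SIGTERM to itself.
run_and_classify sh -c 'kill -TERM $$'
```

The status alone is ambiguous, since a command may also exit deliberately with a code above 128. That is one reason a caller written in C is better placed to make this check than a wrapper script.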
On Fri, Nov 17, 2006 at 05:03:44PM -0500, Tom Lane wrote:
> Stephen Harris <lists@spuddy.org> writes:
> > Doing a shutdown "immediate" isn't too clever because it actually leaves
> > the recovery threads running
>
> > LOG: restored log file "00000001000000010000003E" from archive
> > LOG: received immediate shutdown request
> > LOG: restored log file "00000001000000010000003F" from archive
>
> Hm, that should work --- AFAICS the startup process should abort on
> SIGQUIT the same as any regular backend.
>
> [ thinks... ] Ah-hah, "man system(3)" tells the tale:
>
> system() ignores the SIGINT and SIGQUIT signals, and blocks the
> SIGCHLD signal, while waiting for the command to terminate. If this
> might cause the application to miss a signal that would have killed
> it, the application should examine the return value from system() and
> take whatever action is appropriate to the application if the command
> terminated due to receipt of a signal.
>
> So the SIGQUIT went to the recovery script command and was missed by the
> startup process. It looks to me like your script actually ignored the
> signal, which you'll need to fix, but it also looks like we are not

My script was just a ksh script and didn't do anything special with signals.
Essentially it does

    #!/bin/ksh -p

    [...variable setup...]
    while [ ! -f $wanted_file ]
    do
      if [ -f $abort_file ]
      then
        exit 1
      fi
      sleep 5
    done
    cat $wanted_file

I know signals can be deferred in scripts (a signal sent to the script during
the sleep will be deferred if a trap handler had been written for the signal)
but they _do_ get delivered.

However, it seems the signal wasn't sent at all. Once the wanted file
appeared the recovery thread from postmaster started a _new_ script for
the next log. I'll rewrite the script in perl (probably Monday when I'm
back in the office) and stick lots of signal() traps in to see if anything
does get sent to the script.

> As the code stands, if the recovery script is killed by a signal, we'd
> take that as normal termination of the recovery and proceed to come up,
> which is definitely the Wrong Thing.

Oh good; that means I'm not mad :-)

--
rgds
Stephen
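Stephen mentions adding traps to see whether any signal reaches the script. A minimal POSIX sh sketch of what that could look like follows. It is not his actual script: `wait_for_file`, `on_signal`, and the exit codes are illustrative assumptions. It also works around the deferral he describes. A trap does not run until the foreground command finishes, so the sketch runs `sleep` in the background and blocks in `wait`, which returns as soon as a trapped signal arrives.

```shell
#!/bin/sh
# Sketch only: a wait loop that logs signals and aborts with a non-zero status.
# All names here are hypothetical; the exit codes follow the 128+N convention.
sleep_pid=

on_signal() {
    echo "restore script: caught SIG$1, aborting" >&2
    # Do not leave the background sleep behind.
    [ -n "$sleep_pid" ] && kill "$sleep_pid" 2>/dev/null
    exit "$2"
}

wait_for_file() {
    wanted_file=$1
    abort_file=$2
    trap 'on_signal INT 130' INT
    trap 'on_signal QUIT 131' QUIT
    trap 'on_signal TERM 143' TERM
    while [ ! -f "$wanted_file" ]; do
        if [ -f "$abort_file" ]; then
            return 1
        fi
        # Sleep in the background so that a trapped signal interrupts
        # `wait` immediately instead of being deferred for up to 5 seconds.
        sleep 5 &
        sleep_pid=$!
        wait "$sleep_pid"
    done
    cat "$wanted_file"
}
```

This only helps if a signal is actually delivered to the script, which the observations in this thread suggest was not happening.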
"Stephen Harris" <lists@spuddy.org> writes:
> My script was just a ksh script and didn't do anything special with signals.
> Essentially it does
> #!/bin/ksh -p
>
> [...variable setup...]
> while [ ! -f $wanted_file ]
> do
> if [ -f $abort_file ]
> then
> exit 1
> fi
> sleep 5
> done
> cat $wanted_file
>
> I know signals can be deferred in scripts (a signal sent to the script during
> the sleep will be deferred if a trap handler had been written for the signal)
> but they _do_ get delivered.

Sure, but it might be getting delivered to, say, your "sleep" command. You
haven't checked the return value of sleep to handle any errors that may occur.
As it stands you have to check for errors from every single command executed
by your script.

That doesn't seem terribly practical to expect of users. As long as Postgres
is using SIGQUIT for its own communication it seems it really ought to arrange
to block the signal while the script is running so it will receive the signals
it expects once the script ends.

Alternatively perhaps Postgres really ought to be using USR1/USR2 or other
signals that library routines won't think they have any business rearranging.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Stephen Harris <lists@spuddy.org> writes:
> However, it seems the signal wasn't sent at all.

Now that I think about it, the behavior of system() is predicated on the
assumption that SIGINT and SIGQUIT originate with the tty driver and are
broadcast to all members of the session's process group --- so the
called command will get them too, and there's no need for system() to
do anything except wait to see whether the called command dies or traps
the signal.

This does not apply to signals originated by the postmaster --- it
doesn't even know that the child process is doing a system(), much less
have any way to signal the grandchild. Ugh.

Reimplementing system() seems pretty ugly, but maybe we have no choice.
It strikes me that system() has a race condition as defined anyway,
because if a signal arrives between blocking the handler and issuing the
fork(), it'll disappear into the ether; and the same at the end of the
routine.

regards, tom lane
On Fri, Nov 17, 2006 at 09:39:39PM -0500, Gregory Stark wrote:
> "Stephen Harris" <lists@spuddy.org> writes:
> > [...variable setup...]
> > while [ ! -f $wanted_file ]
> > do
> > if [ -f $abort_file ]
> > then
> > exit 1
> > fi
> > sleep 5
> > done
> > cat $wanted_file
> > I know signals can be deferred in scripts (a signal sent to the script during
> Sure, but it might be getting delivered to, say, your "sleep" command. You

No. The sleep command keeps on running. I could see that using "ps".

To the best of my knowledge, a random child process of the script wouldn't
even get a signal. All the postmaster recovery thread knows about is the
system() - i.e. "sh -c". All sh knows about is the ksh process. Neither
postmaster nor sh knows about "sleep" and so "sleep" wouldn't receive the
signal (unless it was sent to all processes in the process group).

Here's an example from Solaris 10 demonstrating lack of signal propagation.

    $ uname -sr
    SunOS 5.10
    $ echo $0
    /bin/sh
    $ cat x
    #!/bin/ksh -p

    sleep 10000
    $ ./x &
    4622
    $ kill 4622
    $
    4622 Terminated
    $ ps -ef | grep sleep
    sweh 4624 4602 0 22:13:13 pts/1 0:00 grep sleep
    sweh 4623 1 0 22:13:04 pts/1 0:00 sleep 10000

This is, in fact, what proper "job control" shells do. Doing the same
test with ksh as the command shell will kill the sleep :-)

    $ echo $0
    -ksh
    $ ./x &
    [1] 4632
    $ kill %1
    [1] + Terminated ./x &
    $ ps -ef | grep sleep
    sweh 4635 4582 0 22:15:17 pts/1 0:00 grep sleep

[ Aside: The only way I've been able to guarantee all processes and child
processes and everything to be killed is to run a subprocess with
setsid() to create a new process group and kill the whole process group.
It's a pain ]

If postmaster was sending a signal to the system() process then "sh -c"
might not signal the ksh script, anyway. The ksh script might terminate,
or it might defer until sleep had finished. Only if postmaster had
signalled a complete process group would sleep ever see the signal.

--
rgds
Stephen
On Fri, Nov 17, 2006 at 10:49:39PM -0500, Tom Lane wrote:
> Stephen Harris <lists@spuddy.org> writes:
> > However, it seems the signal wasn't sent at all.
>
> Now that I think about it, the behavior of system() is predicated on the
> assumption that SIGINT and SIGQUIT originate with the tty driver and are
> broadcast to all members of the session's process group --- so the
> This does not apply to signals originated by the postmaster --- it
> doesn't even know that the child process is doing a system(), much less
> have any way to signal the grandchild. Ugh.

Why not, after calling fork() create a new process group with setsid() and
then instead of killing the recovery thread, kill the whole process group
(-PID rather than PID)? Then every process (the recovery thread, the
system, the script, any child of the script) will all receive the signal.

--
rgds
Stephen
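The process-group idea can be tried from a shell. The sketch below is only an illustration, not how the postmaster would do it (in C that would be setsid() in the child after fork(), then kill() with a negative PID). It assumes the `setsid` utility from util-linux, which is not part of POSIX and is not present on every system. `start_in_new_group` and `kill_group` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: run a command as the leader of a new session (and therefore a
# new process group), then signal every process in that group at once.
# Assumes the util-linux `setsid` utility is installed.
start_in_new_group() {
    setsid "$@" &
    # When launched with & from a shell without job control, setsid execs in
    # place, so this PID is also the ID of the new process group.
    group_leader=$!
}

kill_group() {
    # A negative PID addresses the whole process group.
    kill -s TERM -- "-$1"
}

# Usage: the inner sleep is a grandchild that a plain `kill PID` would miss.
start_in_new_group sh -c 'sleep 30 & wait'
sleep 1
kill_group "$group_leader"
```

In a shell with job control enabled, the background job is already a process group leader, so `setsid` forks and `$!` no longer identifies the new group. The sketch therefore assumes a non-interactive script.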
Stephen Harris <lists@spuddy.org> writes:
> On Fri, Nov 17, 2006 at 10:49:39PM -0500, Tom Lane wrote:
>> This does not apply to signals originated by the postmaster --- it
>> doesn't even know that the child process is doing a system(), much less
>> have any way to signal the grandchild. Ugh.

> Why not, after calling fork() create a new process group with setsid() and
> then instead of killing the recovery thread, kill the whole process group
> (-PID rather than PID)? Then every process (the recovery thread, the
> system, the script, any child of the script) will all receive the signal.

This seems like a good answer if setsid and/or setpgrp are universally
available. I fear it won't work on Windows though :-(. Also, each
backend would become its own process group leader --- does anyone know
if adding hundreds of process groups would slow down any popular
kernels?

[ thinks for a bit... ] Another issue is that there'd be a race
condition during backend start: if the postmaster tries to kill -PID
before the backend has managed to execute setsid, it wouldn't work.

regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes:
> Sure, but it might be getting delivered to, say, your "sleep" command. You
> haven't checked the return value of sleep to handle any errors that may occur.
> As it stands you have to check for errors from every single command executed
> by your script.

The expectation is that something like SIGINT or SIGQUIT would be
delivered to both the sleep command and the shell process running the
script. So the shell should fail anyway. (Of course, a nontrivial
archive or recovery script had better be checking for failures at each
step, but this is not very relevant to the immediate problem.)

> Alternatively perhaps Postgres really ought to be using USR1/USR2 or other
> signals that library routines won't think they have any business rearranging.

The existing signal assignments were all picked for what seem to me to
be good reasons; I'm disinclined to change them. In any case, the
important point here is that we'd really like an archive or recovery
script, or for that matter any command executed via system() from a
backend, to abort when the parent backend is SIGINT'd or SIGQUIT'd.

Stephen's idea of executing setsid() at each backend start seems
interesting, but is there a way that will work on Windows?

regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> Gregory Stark <stark@enterprisedb.com> writes:
>> Sure, but it might be getting delivered to, say, your "sleep" command. You
>> haven't checked the return value of sleep to handle any errors that may occur.
>> As it stands you have to check for errors from every single command executed
>> by your script.
>
> The expectation is that something like SIGINT or SIGQUIT would be
> delivered to both the sleep command and the shell process running the
> script. So the shell should fail anyway. (Of course, a nontrivial
> archive or recovery script had better be checking for failures at each
> step, but this is not very relevant to the immediate problem.)

Hm, I tried to test that before I sent that. But I guess my test was faulty
since I was really testing what process the terminal handling delivered the
signal to:

    $ cat /tmp/test.sh
    #!/bin/sh

    echo before
    sleep 5 || echo sleep failed
    echo after

    $ sh /tmp/test.sh ; echo $?
    before
    ^\
    /tmp/test.sh: line 4: 23407 Quit sleep 5
    sleep failed
    after
    0

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes:
> Hm, I tried to test that before I sent that. But I guess my test was faulty
> since I was really testing what process the terminal handling delivered the
> signal to:

Interesting. I tried the same test on HPUX, and find that its /bin/sh
seems to ignore SIGQUIT but not SIGINT:

    $ sh /tmp/test.sh ; echo $?
    before
    -- typed ^C here
    130
    $ sh /tmp/test.sh ; echo $?
    before
    -- typed ^\ here
    /tmp/test.sh[4]: 25166 Quit(coredump)
    sleep failed
    after
    0
    $

There is nothing in the shell man page about this :-(

That seems to leave us back at square one. How can we ensure an archive
or recovery script will fail on being signaled? (Obviously we can't
prevent someone from trapping the signal, but it'd be good if the
default behavior was this way.)

regards, tom lane
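At the script level, one partial answer to that closing question is `set -e`, which makes the shell stop at the first command that returns a non-zero status, including a command that was killed by a signal. The sketch below is an illustration under that assumption, with a hypothetical `fail_fast_demo` wrapper and a self-signalling child standing in for a `sleep` that gets killed.

```shell
#!/bin/sh
# Sketch only: with `set -e`, a step killed by a signal aborts the script and
# its 128+N status becomes the script's own exit status.
fail_fast_demo() {
    sh -c '
        set -e
        echo before
        # Stand-in for a step (such as sleep) that is killed by a signal.
        sh -c "kill -TERM \$\$"
        echo after
    '
}

# Usage: "after" is never printed, and the status is non-zero.
fail_fast_demo
echo "exit status: $?"
```

This does nothing for the case where the signal never reaches the script or its children. `set -e` also does not fire for a command whose failure is handled explicitly, as in `sleep 5 || echo sleep failed` from the test script above.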