When a (application) process doesn't go away after a kill (run as root)
and not even after a kill -9, is there any other method to get rid of it
besides rebooting? OS is AIX 5.2 ML03.
Truss can't attach to the process, the process still has filehandles
open to a filesystem.

Gert
| |||
In article <cn0h8k$evp$1@news.cistron.nl>, GertK <mail@invalid.null> wrote:

All you can do is reboot! Lovely, ain't it.

> When a (application) process doesn't go away after a kill (run as root)
> and not even after a kill -9, is there any other method to get rid of it
> besides rebooting? OS is AIX 5.2 ML03.
> Truss can't attach to the process, the process still has filehandles
> open to a filesystem
>
> Gert
| |||
Mike Klein wrote:

>In article <cn0h8k$evp$1@news.cistron.nl>, GertK <mail@invalid.null>
>wrote:
>
>All you can do is reboot! Lovely, ain't it.
>

Right. It's likely waiting uninterruptibly on a kernel event. What kind
of application was it?

>>When a (application) process doesn't go away after a kill (run as root)
>>and not even after a kill -9, is there any other method to get rid of it
>>besides rebooting? OS is AIX 5.2 ML03.
>>Truss can't attach to the process, the process still has filehandles
>>open to a filesystem
>>
>>Gert

--
Remove the 'NS' from my address to reply.
| |||
On 2004-11-11, GertK <mail@invalid.null> wrote:

> When a (application) process doesn't go away after a kill (run as root)
> and not even after a kill -9, is there any other method to get rid of it
> besides rebooting? OS is AIX 5.2 ML03.

That depends entirely on what the process is (was) doing, and what sort
of process it is. Generally, when a process is not even killable with a
SIGKILL ('kill -9'), that process has issued an uninterruptible kernel
system call, and that system call is not returning. Whether you can get
rid of the process without rebooting depends on which system call the
process is hanging in.

Rebooting is the easiest and most sure way of getting rid of the process.
However, if you know what the process is (was) doing, there are some other
things you might try. For example, if the process was writing something
to a SCSI-tape, you might try to turn off the particular drive and hope
that there's a timeout in the device driver, which will in turn cause the
system call your process is hanging in to return and thus the SIGKILL to
be delivered, killing the process. Other infamous cases are processes
accessing files over an NFS mount when the NFS mount suddenly disappears
due to e.g. a network problem. In that last case however, there are mount
options you can use to prevent hanging uninterruptible processes.

However, always be careful: things like this can also occur with faulty
hardware and/or kernel and/or device driver. Fiddling with things might
make matters worse. Properly shutting down all normal processes and then
rebooting is most likely the safest path to follow.

--
Jurjen Oskam

"I often reflect that if "privileges" had been called "responsibilities" or
"duties", I would have saved thousands of hours explaining to people why
they were only gonna get them over my dead body." - Lee K. Gleason, VMS sysadmin
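For contrast with the stuck case described in the post above, here is a minimal Python sketch of the normal case, where the target process is not blocked in the kernel and SIGKILL takes effect right away. The helper name `kill_and_reap` and the 5-second timeout are made up for illustration; nothing here is specific to AIX. A process hanging in an uninterruptible system call would be expected to survive the signal until that call returns, so the wait below would time out instead.

```python
import signal
import subprocess
import sys


def kill_and_reap(timeout=5.0):
    """Start a sleeping child, send it SIGKILL, and return its exit status.

    Returns the negative signal number if the child was killed by a signal,
    or None if it was still alive when the timeout expired.
    """
    # The child only sleeps in user-visible, interruptible sleep,
    # so the kernel can deliver SIGKILL immediately.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)  # same effect as 'kill -9 <pid>'
    try:
        return child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # This is the symptom from the thread: the signal is pending,
        # but the process has not gone away.
        return None


if __name__ == "__main__":
    status = kill_and_reap()
    if status is None:
        print("child is still there after SIGKILL")
    else:
        print("child exit status:", status)
```

On POSIX systems `subprocess` reports death by signal as a negative return code, which is why the helper can distinguish "killed" from "still hanging".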
| |||
> > When a (application) process doesn't go away after a kill (run as root)
> > and not even after a kill -9, is there any other method to get rid of it
> > besides rebooting? OS is AIX 5.2 ML03.
> > Truss can't attach to the process, the process still has filehandles
> > open to a filesystem

When the process is in kernel space you cannot kill it because it is
protected; otherwise you could assert and crash the box. When a proc is
doing I/O it context switches into kernel space to do the I/O, and you
cannot kill it while this happens.

As detailed above, a reboot is the answer in this case. You should,
however, work out why this pid is stuck doing I/O:

NFS hard mount and the NFS server gone down?
Faulty disk or adapter?
Network card issues / packets dropped, etc.?

If you fix the issue, then you won't get into this situation.

Rgds
Mark Taylor
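As a starting point for working out which pid is stuck, one option is to scan a process listing for entries in uninterruptible sleep. The sketch below is only an illustration and rests on assumptions: it parses text in the layout printed by Linux `ps -eo pid,stat,comm`, where a `STAT` value beginning with `D` marks uninterruptible sleep. AIX `ps` may use different column names and state letters, so the parsing would need adapting there. The sample listing and the function name `find_uninterruptible` are invented for the example.

```python
# Invented sample in the layout of Linux 'ps -eo pid,stat,comm'.
SAMPLE_PS_OUTPUT = """\
  PID STAT COMMAND
    1 Ss   init
  412 D    backup_job
  977 R+   ps
 1033 D+   tar
"""


def find_uninterruptible(ps_text):
    """Return (pid, command) pairs whose state column starts with 'D'."""
    stuck = []
    # Skip the header line, then split each row into pid, state, command.
    for line in ps_text.splitlines()[1:]:
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # ignore blank or malformed rows
        pid, state, command = parts
        if state.startswith("D"):
            stuck.append((int(pid), command.strip()))
    return stuck


if __name__ == "__main__":
    for pid, command in find_uninterruptible(SAMPLE_PS_OUTPUT):
        print(pid, command)
```

A process that shows up here only briefly is normal, since any disk or network I/O passes through this state. One that stays there across repeated listings is the kind of candidate worth tracing back to a device, mount, or adapter.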
| ||||
Jurjen Oskam wrote:

> Rebooting is the easiest and most sure way of getting rid of the process.
> However, if you know what the process is (was) doing, there are some other
> things you might try. For example, if the process was writing something
> to a SCSI-tape, you might try to turn off the particular drive and hope
> that there's a timeout in the device driver, which will in turn cause the
> system call your process is hanging in to return and thus the SIGKILL to
> be delivered, killing the process. Other infamous cases are processes
> accessing files over an NFS mount when the NFS mount suddenly disappears
> due to e.g. a network problem. In that last case however, there are mount
> options you can use to prevent hanging uninterruptible processes.
>
> However, always be careful: things like this can also occur with faulty
> hardware and/or kernel and/or device driver. Fiddling with things might
> make matters worse. Properly shutting down all normal processes and then
> rebooting is most likely the safest path to follow.
>

Thanks. Shut down the apps, and after a reboot everything was fine again.
Cause was I/O problems on the SAN. Process was probably waiting in an
uninterruptible I/O system call that didn't return.

Rgds,
Gert