| Hi, On an AIX 4.3.2 system, we supervise some processes by using "kill -0 pid". It appears that "kill -0" sometimes returns an error indicating that the process no longer exists, although the process is in fact still running. I noticed the same problem with a utility that uses the getproc() service. Is this a known problem? Regards. |
| |||
| On Wed, 14 Jul 2004 23:40:55 -0400, Rachid Koucha <koucha@europem01.nt.com> wrote: > Hi, > > On a AIX 4.3.2 system, we supervise some processes by using "kill -0 pid". It > appears that sometimes "kill -0" returns an error indicating that the process no > more exists but actually the process still exists. I noticed the same problem > with a utility using the service getproc(). > Is it a known problem ? > The kill() syscall can return an error for two reasons: 1) the process doesn't exist (errno = ESRCH); 2) the process doesn't have the same user ID as the caller (errno = EPERM). An EPERM error is a positive confirmation that the process does exist. You have to be superuser to be sure that an error won't be an EPERM error. Villy |
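For anyone who wants to make the ESRCH/EPERM distinction explicit in a supervisor, here is a minimal sketch in C. The function name `process_exists` is hypothetical, not something from the thread, and the code assumes the standard POSIX kill() behaviour described above rather than anything specific to AIX 4.3.2.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: probe a process with signal 0.
 * Returns 1 if the process exists, 0 if it does not,
 * and -1 on any other (unexpected) error.
 */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;   /* probe succeeded: the process exists */
    if (errno == EPERM)
        return 1;   /* process exists, but we may not signal it */
    if (errno == ESRCH)
        return 0;   /* no such process */
    return -1;      /* e.g. EINVAL; not expected for signal 0 */
}
```

The shell command "kill -0 pid" only reports success or failure through its exit status, so a script cannot easily tell the two errors apart; calling kill() directly is one way around that.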
| |||
| Rachid Koucha <koucha@europem01.nt.com> wrote in message news:<40F5FCC7.5CEB3CB7@europem01.nt.com>... > Hi, > > On a AIX 4.3.2 system, we supervise some processes by using "kill -0 pid". It > appears that sometimes "kill -0" returns an error indicating that the process no > more exists but actually the process still exists. I noticed the same problem > with a utility using the service getproc(). > Is it a known problem ? > > Regards. Sounds like a zombie process. |
| ||||
| Actually, we don't call the kill subroutine directly; we use the shell command (which of course uses the subroutine). What I can say is: + The target process is not a zombie; it is alive and working. + "kill -0" is run as root. I rather suspect a critical-section problem in the kernel that makes the process appear to have vanished while other processes are being created or deleted. I suppose this because a supervisor program based on the "getproc()" subroutine has the same problem. It is very rare, and I have the impression that it shows up when many processes are being created and deleted. That is how I reproduce it, during a 24-hour endurance test. Regards. Villy Kruse wrote: > On Wed, 14 Jul 2004 23:40:55 -0400, > Rachid Koucha <koucha@europem01.nt.com> wrote: > > > Hi, > > > > On a AIX 4.3.2 system, we supervise some processes by using "kill -0 pid". It > > appears that sometimes "kill -0" returns an error indicating that the process no > > more exists but actually the process still exists. I noticed the same problem > > with a utility using the service getproc(). > > Is it a known problem ? > > > > The kill() syscall can return an error for two reasons: > 1) the process doesn't exist with errno = ESRCH > 2) the process doesn't have the same user id with errno = EPERM. An EPERM > error is a positive confirmation that the process does exist. > > You have to be superuser to be sure that an error won't be an EPERM error. > > Villy |
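The reproduction described above (probing a live process while many other processes are created and destroyed) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `count_esrch_under_churn` and the iteration count are hypothetical, and the actual endurance test used in the thread is not shown anywhere in it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stress helper: create and reap `iterations` short-lived
 * children, probing `target` with signal 0 after each fork().
 * Returns the number of probes that failed with ESRCH,
 * or -1 if fork() itself fails.
 */
int count_esrch_under_churn(pid_t target, int iterations)
{
    int misses = 0;
    int i;

    for (i = 0; i < iterations; i++) {
        pid_t child = fork();
        if (child < 0)
            return -1;
        if (child == 0)
            _exit(0);   /* child exits immediately */

        /* Probe the target while the child is starting or exiting. */
        if (kill(target, 0) != 0 && errno == ESRCH)
            misses++;

        /* Reap the child so that zombies do not accumulate. */
        while (waitpid(child, NULL, 0) < 0 && errno == EINTR)
            ;
    }
    return misses;
}
```

If the target is known to be alive for the whole run, any non-zero result would point at the kind of false "no such process" answer reported here. A long run with a large iteration count would be needed, since the problem is described as very rare.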