Re: kill -0 may fail

Actually, we don't use the kill subroutine but the shell command (which of course uses
the subroutine). What I can say is:
+ The target process is not a zombie; it is alive and working
+ The "kill -0" is launched as root
I rather think there is a critical-section problem in the kernel that makes it look
as if the process has disappeared while other processes are being created and
deleted. I suppose this because a supervisor program based on the subroutine
"getproc()" has the same problem. It is very rare, and my impression is that it
appears when many processes are being created and deleted. That is how I
reproduce it during a 24-hour endurance test.
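
If the check were done with the subroutine rather than the shell command, the
two error cases discussed in the quoted reply could be told apart by looking at
errno. A minimal sketch, assuming only the standard kill() behaviour; the names
probe_process and PROBE_* are made up for illustration and do not come from any
AIX header:

```c
#include <sys/types.h>
#include <signal.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum probe_result {
    PROBE_ALIVE,        /* kill() succeeded: the process exists */
    PROBE_ALIVE_EPERM,  /* EPERM: the process exists, but we may not signal it */
    PROBE_GONE,         /* ESRCH: the kernel found no such process */
    PROBE_ERROR         /* any other errno value */
};

/* Signal 0 sends nothing; kill() only performs the existence and
 * permission checks, so the target process is not disturbed. */
enum probe_result probe_process(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return PROBE_ALIVE;

    switch (errno) {
    case EPERM:
        return PROBE_ALIVE_EPERM;
    case ESRCH:
        return PROBE_GONE;
    default:
        return PROBE_ERROR;
    }
}
```

Run as root, such a probe should never return PROBE_ALIVE_EPERM, so a
PROBE_GONE result for a process known to be alive would point at the behaviour
described above rather than at a permission problem.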
Regards.
Villy Kruse wrote:
> On Wed, 14 Jul 2004 23:40:55 -0400,
> Rachid Koucha <koucha@europem01.nt.com> wrote:
>
> > Hi,
> >
> > On an AIX 4.3.2 system, we supervise some processes by using "kill -0 pid". It
> > appears that sometimes "kill -0" returns an error indicating that the process no
> > longer exists when actually the process still exists. I noticed the same problem
> > with a utility using the service getproc().
> > Is this a known problem?
> >
>
> The kill() syscall can return an error for two reasons:
> 1) the process doesn't exist, with errno = ESRCH
> 2) the process doesn't have the same user id, with errno = EPERM. An EPERM
> error is a positive confirmation that the process does exist.
>
> You have to be superuser to be sure that an error won't be an EPERM error.
>
> Villy