pthread_create returns EAGAIN without WLM (AIX Operating System forums, Unix Operating Systems category)
I have a multithreaded application running on 5.2.0. We're getting EAGAIN from pthread_create after a certain number of threads have been created. I'm assuming we're running into a heap constraint, and setting a larger heap (with -bmaxdata or the equivalent LDR_CNTRL environment setting) would probably correct the problem, but according to a message Gary Hook posted to c.u.a last year, it should only be possible to get EAGAIN if WLM is running.[1] And it's not, according to wlmcntrl(1) - as I'd expect, since no one should be using it on this particular test system.

Does anyone know what the real story is with pthread_create and EAGAIN?

If it matters, this is a dual-CPU system (uname -M says "7028-6C4") with ample physical and virtual memory. errpt doesn't show any errors when the pthread_create error occurs. The process is running under a non-superuser ID, but it's started by a UID-0 process (the children call setuid after the fork) which in turn was started in a shell with most resource limits set to "unlimited" via ulimit. (That's why I'm guessing this is a heap size issue - we shouldn't be hitting regular setrlimit-style limits.)

1. http://groups.google.com/groups?selm...nospammers.net

--
Michael Wojcik michael.wojcik@microfocus.com

I would never understand our engineer. But is there anything in this world that *isn't* made out of words? -- Tawada Yoko (trans. Margaret Mitsutani)
| |||
Michael Wojcik wrote:
> I have a multithreaded application running on 5.2.0. We're getting
> EAGAIN from pthread_create after a certain number of threads have
> been created.

> I'm assuming we're running into a heap constraint, and
> setting a larger heap (with -bmaxdata or the equivalent LDR_CNTRL
> environment setting) would probably correct the problem,

It *can* be the heap issue. I've seen it happen. If you know how to use 'svmon -P' and read the result, you can easily find out whether it is a heap issue. Or you can just try LDR_CNTRL to see whether you get more pthreads, or more running time, before hitting EAGAIN again, if ever.

> according to a message Gary Hook posted to c.u.a last year, it should
> only be possible to get EAGAIN if WLM is running.[1] And it's not,
> according to wlmcntrl(1) - as I'd expect, since no one should be
> using it on this particular test system.

I guess it came from the AIX doc on pthread_create():

http://www16.boulder.ibm.com/doc_lin...ead_create.htm

" EAGAIN: If WLM is running, the limit on the number of threads in the class may have been met. "

This is just not complete.

BTW, although there is a definition of PTHREAD_THREADS_MAX in AIX, I've yet to test whether it's a hard limit in the implementation.

    #ifdef _LARGE_THREADS
    #define PTHREAD_THREADS_MAX 32767
    #else
    #define PTHREAD_THREADS_MAX 512
    #endif

Tao
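The suggestion above, trying a larger heap to see whether you get more pthreads, can be checked by counting how many threads are created before pthread_create first fails. The following is a minimal sketch assuming only POSIX threads; `probe_thread_limit`, `park`, and `MAX_PROBE` are hypothetical names chosen for illustration, not anything from AIX or the original application. Note that pthread_create reports failure through its return value.

```c
#include <pthread.h>

#define MAX_PROBE 4096  /* arbitrary upper bound for the probe (assumption) */

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open = 0;

/* Each probe thread parks here until the gate opens, so it stays alive
 * and keeps holding its stack while the count is taken. */
static void *park(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Try to create up to max_threads parked threads (capped at MAX_PROBE).
 * Returns the number created. *err receives the first non-zero value
 * returned by pthread_create (for example EAGAIN), or 0 if none failed. */
static int probe_thread_limit(int max_threads, int *err)
{
    static pthread_t tids[MAX_PROBE];
    int n = 0, i;

    if (max_threads > MAX_PROBE)
        max_threads = MAX_PROBE;
    *err = 0;

    pthread_mutex_lock(&gate_lock);
    gate_open = 0;
    pthread_mutex_unlock(&gate_lock);

    while (n < max_threads) {
        int rc = pthread_create(&tids[n], NULL, park, NULL);
        if (rc != 0) {          /* error number is the return value */
            *err = rc;
            break;
        }
        n++;
    }

    /* Open the gate and reap every thread that was created. */
    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);
    for (i = 0; i < n; i++)
        pthread_join(tids[i], NULL);

    return n;
}
```

Calling `probe_thread_limit` from a small driver, once with the default settings and once with a larger heap setting, would show whether the count at which creation fails moves with the heap size.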
| |||
On Thu, 8 Apr 2004, Michael Wojcik wrote:

MW > Date: 8 Apr 2004 20:52:09 GMT
MW > From: Michael Wojcik <mwojcik@newsguy.com>
MW > Newsgroups: comp.unix.aix
MW > Subject: pthread_create returns EAGAIN without WLM
MW >
MW >
MW > I have a multithreaded application running on 5.2.0. We're getting
MW > EAGAIN from pthread_create after a certain number of threads have
MW > been created. I'm assuming we're running into a heap constraint, and
MW > setting a larger heap (with -bmaxdata or the equivalent LDR_CNTRL
MW > environment setting) would probably correct the problem, but
MW > according to a message Gary Hook posted to c.u.a last year, it should
MW > only be possible to get EAGAIN if WLM is running.[1] And it's not,
MW > according to wlmcntrl(1) - as I'd expect, since no one should be
MW > using it on this particular test system.
MW > [...]

Pthreads require VM space for their individual stacks. I suspect that at some point the pthread_create cannot allocate VM for the stack of a pthread.

Michael Thomadakis
| ||||
In article <Pine.SGI.4.56.0404131402020.207155@hellas.tamu.edu>, "Michael E. Thomadakis" <miket@hellas.tamu.edu> writes:
> On Thu, 8 Apr 2004, Michael Wojcik wrote:
>
> MW > I have a multithreaded application running on 5.2.0. We're getting
> MW > EAGAIN from pthread_create after a certain number of threads have
> MW > been created. I'm assuming we're running into a heap constraint, and
> MW > setting a larger heap (with -bmaxdata or the equivalent LDR_CNTRL
> MW > environment setting) would probably correct the problem, but
> MW > according to a message Gary Hook posted to c.u.a last year, it should
> MW > only be possible to get EAGAIN if WLM is running.[1] And it's not,
> MW > according to wlmcntrl(1) - as I'd expect, since no one should be
> MW > using it on this particular test system.
>
> Pthreads require VM space for their individual stacks. I suspect that at some
> point the pthread_create cannot allocate VM for the stack of a pthread.

Correct, and thanks for replying, but my question was specifically why pthread_create was setting errno to EAGAIN if it was (per the documentation and Gary's post) allowed to do that iff WLM was running.

The -bmaxdata / LDR_CNTRL behavior (the "large heap" feature) in fact does nothing more than increase the address space available for heap allocation, which is why I mentioned it. On AIX heap allocations are constrained by the smaller of the heap size and the process resource limit on data segment size. Hitting either should cause pthread_create to fail, but not with EAGAIN, if the documentation is correct. Apparently it is not.

(And as expected increasing both heap size and resource limit raised the point at which pthread_create failed; the EAGAIN does indeed appear to have been caused by heap allocation failure.)
On a side note, the process in question ran for a while after pthread_create started failing, and occasionally new thread creations would succeed because other threads had exited and been joined - just as you might expect - but after several cycles of this, it died with a SIGSEGV that dbx claimed was in pthread_create itself. I'm running 5200-01, and it's possible that this is a pthread bug that's fixed in 5200-02, but I couldn't find an APAR about it. A bit worrisome.

--
Michael Wojcik michael.wojcik@microfocus.com

Proverbs for Paranoids, 1: You may never get to touch the Master, but you can tickle his creatures. -- Thomas Pynchon
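The behavior described above, where creation starts succeeding again once other threads have exited and been joined, suggests a simple recovery pattern: on EAGAIN, join a finished or finishing thread and retry. The sketch below assumes only POSIX threads; `create_or_reap` and `echo_arg` are hypothetical names, and it is not the original application's code. POSIX specifies that pthread_create returns the error number directly, so the sketch tests the return value and does not consult errno.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Example thread body for illustration: returns its argument unchanged. */
static void *echo_arg(void *arg)
{
    return arg;
}

/* Try to start a thread. If pthread_create returns EAGAIN and the caller
 * supplies an older joinable thread it is willing to wait for, join that
 * thread to release its resources and retry once.
 * *reaped is set to 1 if the older thread was joined, otherwise 0.
 * Returns 0 on success, or the error number from pthread_create. */
static int create_or_reap(pthread_t *tid, void *(*fn)(void *), void *arg,
                          pthread_t *reapable, int *reaped)
{
    int rc = pthread_create(tid, NULL, fn, arg);

    *reaped = 0;
    if (rc == EAGAIN && reapable != NULL) {
        if (pthread_join(*reapable, NULL) == 0) {
            *reaped = 1;
            rc = pthread_create(tid, NULL, fn, arg);
        }
    }
    if (rc != 0)
        fprintf(stderr, "pthread_create: %s (%d)\n", strerror(rc), rc);
    return rc;
}
```

When `*reaped` comes back as 1 the caller must not join `*reapable` again. A single retry keeps the sketch short; a real server would more likely cap the number of live threads so that the failure is not reached in the first place.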