This is a discussion on Posix mutex working in Linux but not Solaris within the comp.unix.solaris forums, part of the Solaris Operating System category.
Hi

I created my_mutex of posixMutex type (basically pthread_mutex_t type) accessed by 2 threads, which execute the same section of code:

    void run() {
        my_mutex.lock();
        /*calculates sth here*/
        my_mutex.unlock();
    }

It runs perfectly on Linux with arbitrary number of threads, but on Solaris (ver. 8) it prints the following and exits without a core dump when only 2 threads are created.

    ERROR -- call to pthread_mutex_lock() -- called on locked mutex

Has anybody seen anything like this before? Thanks.

***** The lock and unlock functions are borrowed from online, which take care of recursive locking situations:

    void
    posixMutex::lock()
    {
        if (m_dontLock)
            return;

        threadID thisthreadID = getThisThreadID();

        if (isProcessLocked() && lockedBy() == thisthreadID)
        {
            m_state.m_lockCount++;
            return;
        }

        int retVal = pthread_mutex_lock(&m_mutex);

        if (retVal != 0) {
            throw exception_INTERNAL("Fail to mutex locking");
        }

        // This is the correct order to set these in to avoid a race-condition
        // with the test above the lock call...

        m_state.m_lockedBy = thisthreadID;
        m_state.m_lockCount = 1;
        m_lockingProcessID = getThisProcessID();
    }

    void
    posixMutex::unlock()
    {
        if (m_dontLock)
            return;

        if (isProcessLocked() && lockedBy() == getThisThreadID())
        {
            if (--m_state.m_lockCount > 0)
                return;
        }
        else
        {
            m_state.m_lockCount = 0;
        }

        m_lockingProcessID = 0;
        m_state.m_lockedBy = 0;

        int retVal = pthread_mutex_unlock(&m_mutex);

        if (retVal != 0) {
            throw exception_INTERNAL("fail to unlock the mutex");
        }
    }

Cheng wrote:
> Hi I created my_mutex of posixMutex type (basically pthread_mutex_t
> type) accessed by 2 threads, which execute the same section of code:
>
> void run() {
>     my_mutex.lock();
>     /*calculates sth here*/
>     my_mutex.unlock();
> }
>
> It runs perfectly on Linux with arbitrary number of threads, but on
> Solaris (ver. 8) it prints the following and exits without a core dump
> when only 2 threads are created.
>
> ERROR -- call to pthread_mutex_lock() -- called on locked mutex
>

The error says it all, you didn't create a PTHREAD_MUTEX_RECURSIVE mutex (you don't show the initialisation code) and you are attempting to re-lock a locked mutex from the thread holding the lock.

--
Ian Collins.
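For readers following along, the initialisation Ian refers to could look roughly like the sketch below. It is not the original poster's code (which was never shown); the helper names `initRecursiveMutex` and `relockFromSameThread` are hypothetical, and it assumes a platform whose pthreads implementation provides PTHREAD_MUTEX_RECURSIVE.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical helper: initialise *m as a recursive mutex.
// Returns 0 on success, or the error number from the failing pthread call.
int initRecursiveMutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);  // the attribute object is no longer needed
    return rc;
}

// Hypothetical demo: lock twice from the same thread, then unlock twice.
// Returns 0 if every call succeeded, otherwise the first error number.
int relockFromSameThread() {
    pthread_mutex_t m;
    int rc = initRecursiveMutex(&m);
    if (rc != 0) return rc;

    rc = pthread_mutex_lock(&m);
    if (rc == 0) {
        // Second lock by the owner: allowed only because the type is recursive.
        int inner = pthread_mutex_lock(&m);
        if (inner == 0) inner = pthread_mutex_unlock(&m);
        int outer = pthread_mutex_unlock(&m);
        rc = (inner != 0) ? inner : outer;
    }
    pthread_mutex_destroy(&m);
    return rc;
}
```

With a default-type mutex the second pthread_mutex_lock() call is not guaranteed to succeed, which is the situation Ian suspects here.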

On Thu, 7 Feb 2008 18:28:26 -0800 (PST) Cheng <cheng.stillsea@gmail.com> wrote:
> Hi I created my_mutex of posixMutex type (basically pthread_mutex_t
> type) accessed by 2 threads, which execute the same section of code:
>
> void run() {
>     my_mutex.lock();
>     /*calculates sth here*/
>     my_mutex.unlock();
> }
>
> It runs perfectly on Linux with arbitrary number of threads, but on
> Solaris (ver. 8) it prints the following and exits without a core dump
> when only 2 threads are created.
>
> ERROR -- call to pthread_mutex_lock() -- called on locked mutex
>
> Has anybody seen anything like this before? Thanks.

Yes, lots. You have a buggy program.

> ***** The lock and unlock functions are borrowed from online, which
> take care of recursive locking situations:

you don't have a recursive locking problem, you have a contended mutex.

> void
> posixMutex::lock()
> {
>     if (m_dontLock)
>         return;
>
>     threadID thisthreadID = getThisThreadID();
>
>     if (isProcessLocked() && lockedBy() == thisthreadID)
>     {
>         m_state.m_lockCount++;
>         return;
>     }

here, the mutex is "locked" recursively if the thread requesting a new lock is the thread that already holds the lock. fine. but if the thread requesting the lock is DIFFERENT than the thread that holds the lock, we continue on ...

>     int retVal = pthread_mutex_lock(&m_mutex);

to here, and this will fail. which is the correct action. the reason it works on Linux is that the mutex is never contended, due to thread scheduling differences.

BTW, the class seems pointless since pthreads provides recursive mutexes natively. also, the class is buggy since it doesn't test for overflow of m_state.m_lockCount (i assume m_state is a struct and m_lockCount is an integral type and not an object which will throw an exception on overflow).

-frank

In article <ebbd07ae-572d-4adc-aa24-fa0d557c153a@i7g2000prf.googlegroups.com>, Cheng <cheng.stillsea@gmail.com> writes:
> Hi I created my_mutex of posixMutex type (basically pthread_mutex_t
> type) accessed by 2 threads, which execute the same section of code:
>
> void run() {
>     my_mutex.lock();
>     /*calculates sth here*/
>     my_mutex.unlock();
> }
>
> It runs perfectly on Linux with arbitrary number of threads, but on
> Solaris (ver. 8) it prints the following and exits without a core dump
> when only 2 threads are created.

It's broken -- I'm not surprised.

> ERROR -- call to pthread_mutex_lock() -- called on locked mutex
>
> Has anybody seen anything like this before? Thanks.
>
> ***** The lock and unlock functions are borrowed from online, which
> take care of recursive locking situations:

They are horribly broken and stand no chance of working at all.

> void
> posixMutex::lock()
> {
>     if (m_dontLock)
>         return;
>
>     threadID thisthreadID = getThisThreadID();
>
>     if (isProcessLocked() && lockedBy() == thisthreadID)
>     {
>         m_state.m_lockCount++;

What? It's fiddling around inside the mutex. How did you even get this to compile on Solaris?

>         return;
>     }
>
>     int retVal = pthread_mutex_lock(&m_mutex);
>
>     if (retVal != 0) {
>         throw exception_INTERNAL("Fail to mutex locking");
>     }
>
>     // This is the correct order to set these in to avoid a race-condition
>     // with the test above the lock call...
>
>     m_state.m_lockedBy = thisthreadID;
>     m_state.m_lockCount = 1;
>     m_lockingProcessID = getThisProcessID();
> }

To roll your own recursive mutex, you will need to use a condition variable, not a plain mutex at the lower level. Alternatively, upgrade to Solaris 10 which supports recursive mutexes in the standard pthread_mutex calls.

I tend to find recursive mutexes are used as a workaround for sloppy programming -- I've never yet come across a case where they were really required, and have come across many cases where their use was incorrect, resulting in obscure and difficult to diagnose corruptions of the data they were supposedly protecting.

--
Andrew Gabriel
[email address is not usable -- followup in the newsgroup]
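One possible reading of Andrew's suggestion is sketched below: a short-lived guard mutex protects the owner and count bookkeeping, and a condition variable is what waiting threads actually sleep on. This is only an illustration under those assumptions, not Andrew's or the original poster's code; the class name `CondRecursiveMutex` and its members are hypothetical. It also refuses to increment past UINT_MAX, which addresses the overflow point raised earlier in the thread.

```cpp
#include <pthread.h>
#include <climits>
#include <cassert>

// Hypothetical recursive lock built from a guard mutex plus a condition variable.
class CondRecursiveMutex {
public:
    CondRecursiveMutex() : m_count(0), m_owned(false) {
        pthread_mutex_init(&m_guard, nullptr);
        pthread_cond_init(&m_free, nullptr);
    }
    ~CondRecursiveMutex() {
        pthread_cond_destroy(&m_free);
        pthread_mutex_destroy(&m_guard);
    }
    CondRecursiveMutex(const CondRecursiveMutex&) = delete;
    CondRecursiveMutex& operator=(const CondRecursiveMutex&) = delete;

    // Returns false only if the recursion count would overflow.
    bool lock() {
        pthread_t self = pthread_self();
        pthread_mutex_lock(&m_guard);
        if (m_owned && pthread_equal(m_owner, self)) {
            bool ok = m_count < UINT_MAX;   // overflow check on the recursion count
            if (ok) ++m_count;
            pthread_mutex_unlock(&m_guard);
            return ok;
        }
        while (m_owned)                      // loop guards against spurious wakeups
            pthread_cond_wait(&m_free, &m_guard);
        m_owned = true;
        m_owner = self;
        m_count = 1;
        pthread_mutex_unlock(&m_guard);
        return true;
    }

    // Returns false if the calling thread does not hold the lock.
    bool unlock() {
        pthread_mutex_lock(&m_guard);
        bool ok = m_owned && pthread_equal(m_owner, pthread_self());
        if (ok && --m_count == 0) {
            m_owned = false;
            pthread_cond_signal(&m_free);    // wake one waiter, if any
        }
        pthread_mutex_unlock(&m_guard);
        return ok;
    }

private:
    pthread_mutex_t m_guard;  // protects the fields below; held only briefly
    pthread_cond_t  m_free;   // signalled when the lock becomes available
    pthread_t       m_owner;  // meaningful only while m_owned is true
    unsigned        m_count;  // recursion depth of the current owner
    bool            m_owned;
};
```

Unlike the posted class, every read of the owner and count happens with the guard mutex held, so the ownership test cannot race with another thread updating those fields.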

andrew@cucumber.demon.co.uk (Andrew Gabriel) writes:
>To roll your own recursive mutex, you will need to use a condition
>variable, not a plain mutex at the lower level. Alternatively,
>upgrade to Solaris 10 which supports recursive mutexes in the
>standard pthread_mutex calls.

Early releases of pthreads also supported recursive mutexes. These were part of UNIX98 branding and included in Solaris 7.

Casper
--
Expressed in this posting are my opinions. They are in no way related to opinions held by my employer, Sun Microsystems. Statements on Sun products included here are not gospel and may be fiction rather than truth.

Frank Cusack wrote:
> On 08 Feb 2008 20:23:04 GMT Casper H.S. Dik <Casper.Dik@Sun.COM> wrote:
>> No; perhaps he just didn't link with -lpthread so all thread
>> functions (in S9 and before) are noops.
>
> annoying. what was the rationale for including noops in libc as
> opposed to letting that kind of error be detected at link time?

It never was detected, libc had empty stubs. Very confusing at first!

--
Ian Collins.

Frank Cusack wrote:
> On Sat, 09 Feb 2008 08:33:35 +1300 Ian Collins <ian-news@hotmail.com> wrote:
>> Your error appears to be caused by recursive locking. If your code does
>> this, you have to use a recursive mutex which isn't the default on Solaris.
>
> if you look at his mutex class you'll see it never acquires the mutex
> recursively.
>

That would appear to be the case, but I don't see how your explanation of contention could cause the problem. If a thread attempts to lock a mutex held by another, pthread_mutex_lock() blocks. Or am I missing something obvious?

--
Ian Collins.
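Ian's point about contention can be illustrated with a small sketch. It uses pthread_mutex_trylock() so the outcome is observable without hanging: when another thread holds the mutex, trylock reports EBUSY, whereas a plain pthread_mutex_lock() in the same position would simply wait for the holder to unlock. The names `g_mutex`, `tryLockWorker` and `probeContendedMutex` are hypothetical and exist only for this example.

```cpp
#include <pthread.h>
#include <cerrno>
#include <cassert>

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;

// Thread body: record what trylock returns while another thread holds g_mutex.
static void* tryLockWorker(void* out) {
    int rc = pthread_mutex_trylock(&g_mutex);
    if (rc == 0) pthread_mutex_unlock(&g_mutex);  // only if we unexpectedly got it
    *static_cast<int*>(out) = rc;
    return nullptr;
}

// Holds g_mutex in the calling thread while a second thread probes it.
// Returns the probe's trylock result, or -1 if the thread could not be created.
int probeContendedMutex() {
    int result = -1;
    pthread_t worker;

    pthread_mutex_lock(&g_mutex);
    if (pthread_create(&worker, nullptr, tryLockWorker, &result) == 0) {
        // The mutex stays held until the worker has finished probing it.
        pthread_join(worker, nullptr);
    }
    pthread_mutex_unlock(&g_mutex);
    return result;
}
```

In other words, contention from a second thread is reported as EBUSY only by the non-blocking call; it is not by itself an error condition for pthread_mutex_lock().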

Andrew Gabriel wrote:
> Cheng <cheng.stillsea@gmail.com> writes:
>
>> void
>> posixMutex::lock()
>> {
>>     if (m_dontLock)
>>         return;
>>
>>     threadID thisthreadID = getThisThreadID();
>>
>>     if (isProcessLocked() && lockedBy() == thisthreadID)
>>     {
>>         m_state.m_lockCount++;
>
> What? It's fiddling around inside the mutex.
> How did you even get this to compile on Solaris?
>

Look again, here he fiddles with m_state

>>     int retVal = pthread_mutex_lock(&m_mutex);

here he locks m_mutex.

--
Ian Collins.

On Sat, 09 Feb 2008 10:50:59 +1300 Ian Collins <ian-news@hotmail.com> wrote:
> Andrew Gabriel wrote:
>> Cheng <cheng.stillsea@gmail.com> writes:
>>
>>> void
>>> posixMutex::lock()
>>> {
>>>     if (m_dontLock)
>>>         return;
>>>
>>>     threadID thisthreadID = getThisThreadID();
>>>
>>>     if (isProcessLocked() && lockedBy() == thisthreadID)
>>>     {
>>>         m_state.m_lockCount++;
>>
>> What? It's fiddling around inside the mutex.
>> How did you even get this to compile on Solaris?
>>
>
> Look again, here he fiddles with m_state

yes, and? he's maintaining a recursion count just like would be done in libc/libpthread for a "native" recursive mutex.

>
>>>     int retVal = pthread_mutex_lock(&m_mutex);
>
> here he locks m_mutex.

one would presume that isProcessLocked() checks m_mutex.

-frank

Frank Cusack wrote:
> On Sat, 09 Feb 2008 10:50:59 +1300 Ian Collins <ian-news@hotmail.com> wrote:
>> Andrew Gabriel wrote:
>>> Cheng <cheng.stillsea@gmail.com> writes:
>>>
>>>> void
>>>> posixMutex::lock()
>>>> {
>>>>     if (m_dontLock)
>>>>         return;
>>>>
>>>>     threadID thisthreadID = getThisThreadID();
>>>>
>>>>     if (isProcessLocked() && lockedBy() == thisthreadID)
>>>>     {
>>>>         m_state.m_lockCount++;
>>> What? It's fiddling around inside the mutex.
>>> How did you even get this to compile on Solaris?
>>>
>> Look again, here he fiddles with m_state
>
> yes, and? he's maintaining a recursion count just like would be done
> in libc/libpthread for a "native" recursive mutex.
>

I interpreted the comment as "fiddling around inside the pthread_mutex_t object". There's nothing in the code that would prevent it compiling on Solaris. Whether it is necessary or not isn't the issue.

--
Ian Collins.