01-05-2008, 11:15 AM
Paul Pluzhnikov
Re: Spurious segv's and malloc catastrophe

jimwgramling@gmail.com writes:

> I've got an application built on a pseries machine under AIX 5.3.
> Under certain circumstances (which we have yet to identify
> precisely!), the application may hang, segv, throw a memory error, or
> even a "catastrophe in malloc" error.


All typical symptoms of heap corruption...

> Since restarting just the application isn't enough to resolve the
> problem (even rebuilding the app doesn't do it), it seems like we may
> be looking at some kind of memory corruption issue.


It is extremely likely that you are looking at heap corruption.

> Unfortunately,
> I've never been onsite when the problem crops up (fortunately it's not
> *too* frequent!), so I don't know what was going on with the machine
> (paging, filesystem full, etc.).


It wouldn't have helped anyway -- by the time you observe the
problem, the part of code that actually did the damage is likely
nowhere to be seen (not currently on the stack).

> For the thread-safe versions of the reentrant C runtime start-up
> routines (crt0_r.o, mcrt0_r.o, gcrt0_r.o)?


On AIX 5.1, crt0_r.o is a symlink to crt0.o.
Most likely the same is true on AIX 5.3.

> Any helpful suggestions will be greatly appreciated!


You need a heap debugger.

Start with this one:
http://www.redbooks.ibm.com/redbooks...tml/wwhelp.htm
and go to section 4.2.4, "The debug malloc allocator".

Your other options are (commercial) Insure++, Purify and ZeroFault.

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.