Unix Technical Forum

How to investigate crash inside LIBC ?

#1
01-05-2008, 06:11 AM
Dmitry Bond.

Hi All.

AIX 5.2 (uname -a = "AIX {OurServer} 2 5 000AC99D4C00").

I have a problem: a crash (illegal instruction) occurs somewhere inside
LIBC.A.
Of course, it most probably happens because of a bug in my program (built as
a shared lib), but how do I find where it happens?
Could you please help me with this?
The test program does a very simple thing: it loads my shared library and exits
(see the source code below).
The test program and the shared lib are compiled with GCC 3.3.3.
As you can see, the test program is really simple, so I assume that the source
of the problem is my shared lib, but how do I find the crash point?
GDB shows the following:

745 return 0;
(gdb)
746 }
(gdb) stepi
0x10002cf4 746 }
(gdb)
0x10002cf8 in main (argc=2, argv=0x2ff22934) at tcpkick.cpp:746
746 }
(gdb)
0x10002cfc 746 }
(gdb)
0x10002d00 746 }
(gdb)
0x10002d04 746 }
(gdb)
0x10002d08 746 }
(gdb)
0x100001dc in __start ()
(gdb)
0x100001e0 in __start ()
(gdb)
0x100001e4 in __start ()
(gdb)
0x100001e8 in __start ()
(gdb)
0x100001f0 in __start ()
(gdb)
0x1000f110 in exit ()
(gdb)
0x1000f114 in exit ()
(gdb)
0x1000f118 in exit ()
(gdb)
0x1000f11c in exit ()
(gdb)
0x1000f120 in exit ()
(gdb)
0x1000f124 in exit ()
(gdb)
0xd01e4124 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4128 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e412c in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4130 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4134 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4138 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e413c in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4140 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e417c in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4180 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4184 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4188 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e418c in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4190 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4194 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e4198 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e419c in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e41a0 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e41a4 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01e41a8 in exit () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef88 in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef8c in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef90 in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef94 in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef98 in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)
0xd01cef9c in _ptrgl () from /usr/lib/libc.a(shr.o)
(gdb)

Program received signal SIGILL, Illegal instruction.
0x00000000 in ?? ()
(gdb)

WBR,
Dmitry.

ps. source code of test program:
===begin===
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <dlfcn.h>

int main( int argc, char* argv[] )
{
    if (argc <= 1)
    {
        printf("\nUsage: zload lib_name [lib_name]");
        return 0;
    }
    for (int i=1; i<argc; i++)
    {
        printf("\nLoading \"%s\"... ", argv[i]);
        void* h = dlopen(argv[i], RTLD_NOW);
        if (h == NULL)
        {
            char* info = dlerror();
            printf("error (%d). Msg=\"%s\"",
                   errno, (info != NULL ? info : "(null)") );
        }
    }
    printf("\ndone.\nexiting...\n");
    return 0;
}
===end===

When run, it prints:

Loading "xddapplib.so"...
done.
exiting...
Illegal instruction (core dumped)


#2
01-05-2008, 06:11 AM
Dmitry Bond.

Btw, is it possible at all to see (for debug purposes) the sources of libc?


"Dmitry Bond." <dima_ben@ukr.net> wrote in message
news:1100625415.425707@moxa.united.net.ua...
> Hi All.

[...]
> (gdb)
> 0xd01cef98 in _ptrgl () from /usr/lib/libc.a(shr.o)
> (gdb)
> 0xd01cef9c in _ptrgl () from /usr/lib/libc.a(shr.o)
> (gdb)
>



#3
01-05-2008, 06:12 AM
Paul Pluzhnikov

"Dmitry Bond." <dima_ben@ukr.net> writes:

> AIX 5.2 (uname -a = "AIX {OurServer} 2 5 000AC99D4C00").
>
> I have a problem: a crash (illegal instruction) occurs somewhere inside
> LIBC.A.


Actually, the crash is most likely happening because libc had an
exit-handler registered with it, and that handler jumped to
location 0. You may wish to set a breakpoint on atexit(), and see
what handlers are registered at the time of dlopen().
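
To make the theory concrete, here is a minimal sketch of the mechanism. It is not libc's actual code (which is not public); `ExitHandler`, `register_exit_handler`, `run_exit_handlers` and `sample_cleanup` are hypothetical names that only mimic what atexit() and exit() do. The logging in the registration function plays the role of a breakpoint on atexit(), and the NULL check marks the spot where a real exit() would instead transfer control to address 0.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for libc's table of exit handlers.
typedef void (*ExitHandler)();

static std::vector<ExitHandler> g_handlers;

// Example handler, used only to show that valid entries still run.
static int g_cleanup_calls = 0;
void sample_cleanup() { ++g_cleanup_calls; }

// Counterpart of atexit(): remember the handler for later.
// The message is what a breakpoint on atexit() would let you observe.
int register_exit_handler(ExitHandler h)
{
    std::printf("handler #%lu registered%s\n",
                (unsigned long)g_handlers.size(),
                h == NULL ? " (NULL!)" : "");
    g_handlers.push_back(h);
    return 0;
}

// Counterpart of the loop inside exit(): call handlers in reverse
// order of registration. Returns the number of handlers called.
// A NULL entry is reported and skipped; calling it would be the
// "jump to location 0" described above.
int run_exit_handlers()
{
    int called = 0;
    while (!g_handlers.empty()) {
        ExitHandler h = g_handlers.back();
        g_handlers.pop_back();
        if (h == NULL) {
            std::fprintf(stderr, "skipping NULL exit handler\n");
            continue;
        }
        h();
        ++called;
    }
    return called;
}
```

Registering `sample_cleanup`, then NULL, then `sample_cleanup` again and calling `run_exit_handlers()` runs the two valid handlers and reports the bad entry instead of crashing.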

The problem is most likely due to incorrect building of the
shared lib. How did you build it?

> Btw, is it possible at all to see (for debug purposes) the sources of libc?


No, they are not released to the public.

BTW, please do not top-post.

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
#4
01-05-2008, 06:12 AM
Dmitry Bond.

"Paul Pluzhnikov" <ppluzhnikov-nsp@charter.net> wrote in message
news:m37jolgzin.fsf@salmon.parasoft.com...
> "Dmitry Bond." <dima_ben@ukr.net> writes:

[...]
> location 0. You may wish to set a breakpoint on atexit(), and see
> what handlers are registered at the time of dlopen().


Thank you very much for your answer.
And sorry, I forgot to thank you for your answer about limiting the set
of exported functions... :-)

> The problem is most likely due to incorrect building of the
> shared lib. How did you build it?


We are using gcc 3.3.3 and gmake 3.80 to build our binaries.
For compiling C sources we are using the command line:
gcc -c -x c -g -fexceptions -fpic -DDEBUG -D_DEBUG -D__XORA -D__UNIX -D__AIX \
    -DPROGNAME="xddapplib" -I/oracle/9201/xdk/include [...others -I skipped...] \
    unit_x1.c -o unit_x1.o

For compiling C++ sources we are using:
g++ -c -x c++ -g -fpic -DDEBUG -D_DEBUG -D__XORA -D__UNIX -D__AIX \
    -DPROGNAME="xddapplib" -I/oracle/9201/xdk/include [...others -I skipped...] \
    unit_x2.cpp -o unit_x2.o

And finally, link all that stuff with:
g++ -g -Wl,-bE:def_server.exp orautil.a libtux_stub.so -L/oracle/9201/lib32 \
    -lclntsh -shared -o xddapplib.so unit_x1.o unit_x2.o [...others *.o...]

Can you see anything wrong here?

WBR,
Dmitry.



#5
01-05-2008, 06:12 AM
Uli Link


>>The problem is most likely due to incorrect building of the
>>shared lib. How did you build it?

>
>
> We are using gcc 3.3.3 and gmake 3.80 to build our binaries.


Was the gcc bootstrapped on this OS level, incl. maintenance level?
gcc uses a kind of wrapper (libgcc) around the system libc.
Does your gcc use ld from GNU binutils or the system ld?

Just a guess for a direction.
What's in your gcc's specs file?

--


Uli

(Reply to ulrich <dot> link <domain-delimiter> epost <dot> de)
#6
01-05-2008, 06:12 AM
Paul Pluzhnikov

"Dmitry Bond." <dima_ben@ukr.net> writes:

> And finally, link all that stuff with: [...]
> Can you see anything wrong here?


No, I can't.

One thing you can do is look for "deferred resolution" symbols
(these are most likely to jump to 0), and (if there are any)
verify that a definition is available for each at runtime:

dump -Tv xddapplib.so | grep 'undef .* \.\.'
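
The runtime half of that check can also be done from a small loader program. The following is only a sketch: `symbol_resolves` and `count_unresolved` are hypothetical helper names, and the list of symbol names is assumed to come from the `dump` output above. It uses nothing beyond the standard `dlsym()` and `dlerror()` calls from `<dlfcn.h>`.

```cpp
#include <cstddef>
#include <dlfcn.h>

// Returns true if `name` resolves to a non-NULL address through `handle`.
bool symbol_resolves(void* handle, const char* name)
{
    dlerror();                          // clear any stale error state
    void* addr = dlsym(handle, name);
    const char* err = dlerror();        // non-NULL if the lookup failed
    return err == NULL && addr != NULL;
}

// Counts how many of the `n` names in `names` fail to resolve.
int count_unresolved(void* handle, const char* const* names, int n)
{
    int missing = 0;
    for (int i = 0; i < n; i++) {
        if (!symbol_resolves(handle, names[i]))
            ++missing;
    }
    return missing;
}
```

In the test program from the first post, `handle` would be the value returned by `dlopen(argv[i], RTLD_NOW)`; a non-zero result from `count_unresolved` points at symbols worth a closer look.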

Other than that, see if my 'atexit' theory leads anywhere.

Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
#7
01-05-2008, 06:12 AM
Dmitry Bond.

"Paul Pluzhnikov" <ppluzhnikov-nsp@charter.net> wrote in message
news:m3fz38bjvc.fsf@salmon.parasoft.com...
> "Dmitry Bond." <dima_ben@ukr.net> writes:

[...]
> dump -Tv xddapplib.so | grep 'undef .* \.\.'
> Other than that, see if my 'atexit' theory leads anywhere.


Thank you. :-)
You were right.
I found a bug in my shared lib: there was a NULL object pointer in the
finalization code.
The bug never showed up if some functions of the library had been called, but
it occurs if the library is unloaded without ever being used...
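
For readers hitting the same symptom, here is a minimal sketch of this kind of bug and its fix. The actual library code was not posted, so `Connection`, `lib_get_connection` and `lib_finalize` are hypothetical names; the only assumption taken from the thread is that the object is created on first use and that the finalization code touched it unconditionally. On AIX, where reads from address 0 typically do not fault, a call through such a NULL object pointer can plausibly end up as a jump to address 0 rather than an immediate segmentation fault.

```cpp
#include <cstddef>

// Hypothetical object that the library creates only when first used.
struct Connection {
    bool open;
    Connection() : open(true) {}
    void close() { open = false; }
};

// Stays NULL if the library is loaded and unloaded without being used.
static Connection* g_conn = NULL;

// Called by the library's real entry points on first use.
Connection* lib_get_connection()
{
    if (g_conn == NULL)
        g_conn = new Connection();
    return g_conn;
}

// Finalization code. The NULL check is the fix: without it, unloading
// an unused library would call close() through a NULL pointer.
// Returns true if there was something to clean up.
bool lib_finalize()
{
    if (g_conn == NULL)
        return false;
    g_conn->close();
    delete g_conn;
    g_conn = NULL;
    return true;
}
```

Calling `lib_finalize()` straight after loading is now harmless, and calling it twice is safe as well because the pointer is reset after cleanup.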

WBR,
Dmitry.


www.UnixAdminTalk.com