Unix Technical Forum

free block header corruption

04-19-2008, 07:29 PM
timothy.a.brown@marconi.com
 


I received the following message in the online.log:

18:47:48 Assert Failed: Memory free block header corruption detected in mt_shm_malloc_segid 1
18:47:48 Informix Dynamic Server Version 7.31.FC7XS
18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696)
Thread(9538, sqlexec, c0000000d5697cc0, 3)
File: mtshpool.c Line: 2649
18:47:48 Results: Unable to repair pool
18:47:48 Action: Please notify Informix Technical Support.
18:47:48 stack trace for pid 8561 written to /local/db_tmp/af.292aa913
18:48:38 See Also: /local/db_tmp/af.292aa913, shmem.292aa913.0
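When reading an online.log excerpt like the one above, it can help to regroup the lines into one record per timestamped entry and pull out the files named on the "See Also" line. The following is only a minimal sketch: it assumes every entry starts with HH:MM:SS and that any line without a timestamp continues the previous entry. The function names are hypothetical, not part of any Informix tool.

```python
import re

# Assumption: an entry begins with an HH:MM:SS timestamp.
ENTRY_START = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.*)$")


def group_entries(log_text):
    """Return a list of (timestamp, message) tuples.

    Lines without a leading timestamp are appended to the previous entry.
    """
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif entries:
            entries[-1][1] += " " + line
    return [(ts, msg) for ts, msg in entries]


def see_also_files(entries):
    """Collect the file names listed on 'See Also:' entries."""
    files = []
    for _, msg in entries:
        if msg.startswith("See Also:"):
            names = msg[len("See Also:"):].split(",")
            files.extend(name.strip() for name in names)
    return files


# Sample lines copied from the log excerpt in this post.
sample = """\
18:47:48 Assert Failed: Memory free block header corruption detected in
mt_shm_malloc_segid 1
18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696)
Thread(9538, sqlexec, c0000000d5697cc0, 3)
File: mtshpool.c Line: 2649
18:47:48 Results: Unable to repair pool
18:48:38 See Also: /local/db_tmp/af.292aa913, shmem.292aa913.0
"""

entries = group_entries(sample)
for ts, msg in entries:
    print(ts, msg)
print(see_also_files(entries))
```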

After the error, many user connections were blocked, although new
connections were still allowed. 'onstat -g ath' showed the blocked sqlexec
threads in "mutex wait nsfock". I tried to terminate the blocked sessions,
but every attempt failed. I then tried to shut down the engine with
'onmode -yuk', but it just hung. I finally had to kill the oninit process
and clean up the shared memory.

>> Question 1: What is nsfock?

>> Question 2: Is there any way to safely terminate a thread that is
blocked on nsfock?

Here's some information from the dump file:

18:47:48 Found during mt_shm_malloc_segid 1
18:47:48 Pool '8013' (0xc00000019478c028)
18:47:48 Bad free block 0xc000000194798f90
....
18:47:48 Found during recover_pool_bad_free_block 2
18:47:48 Pool '8013' (0xc00000019478c028)
18:47:48 Bad free block 0xc0000001947a1d98
....
18:47:48 Multiple block errors found
18:47:48 Informix Dynamic Server Version 7.31.FC7XS Software Serial Number ACN#xxxxxxxx
18:47:48 Assert Failed: Memory free block header corruption detected in mt_shm_malloc_segid 1
18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696)
Thread(9538, sqlexec, c0000000d5697cc0, 3)
File: mtshpool.c Line: 2649
18:47:48 Results: Unable to repair pool
18:47:48 Action: Please notify Informix Technical Support.
18:47:48 Stack for thread: 9538 sqlexec

base: 0xc00000019492e000
len: 270336
pc: 0x0000000000000000
tos: 0xc000000194932180
state: running
vp: 3

( 0) 0x40000000004eca3c legacy_hp_afstack + 0x24c [/informix/PRD/bin/oninit]
( 1) 0x40000000004ebf48 afstack + 0x68 [/informix/PRD/bin/oninit]
( 2) 0x40000000004eb32c afhandler + 0x644 [/informix/PRD/bin/oninit]
( 3) 0x40000000004eac04 affail_interface + 0x54 [/informix/PRD/bin/oninit]
( 4) 0x40000000004df5d0 recover_pool_bad_free_block + 0x220 [/informix/PRD/bin/oninit]
( 5) 0x40000000004dbcfc mt_shm_malloc_segid + 0x214 [/informix/PRD/bin/oninit]
( 6) 0x40000000004db940 mt_shm_malloc + 0x80 [/informix/PRD/bin/oninit]
( 7) 0x40000000004dd328 mt_shm_realloc + 0x208 [/informix/PRD/bin/oninit]
( 8) 0x40000000004a8638 deccvt + 0x3b0 [/informix/PRD/bin/oninit]
( 9) 0x40000000004a6ef4 dectoasc + 0x94 [/informix/PRD/bin/oninit]
(10) 0x4000000000491100 rvalstr + 0x498 [/informix/PRD/bin/oninit]
(11) 0x40000000001c85ec smi_printconst + 0xfc [/informix/PRD/bin/oninit]
(12) 0x40000000001c7f3c smi_printexpr + 0xe34 [/informix/PRD/bin/oninit]
(13) 0x40000000001c7ea8 smi_printexpr + 0xda0 [/informix/PRD/bin/oninit]
(14) 0x40000000001c6b7c smi_printfilter + 0xe4 [/informix/PRD/bin/oninit]
(15) 0x40000000001c4d58 smi_opexplain + 0x3a8 [/informix/PRD/bin/oninit]
(16) 0x40000000001c984c smi_storeexplainflags + 0x84 [/informix/PRD/bin/oninit]
(17) 0x40000000001c3eb4 prconblock + 0x364 [/informix/PRD/bin/oninit]
(18) 0x40000000003b761c pstread + 0x8ac [/informix/PRD/bin/oninit]
(19) 0x4000000000305a44 pst_rsread + 0x56c [/informix/PRD/bin/oninit]
(20) 0x40000000003066dc rsread + 0xabc [/informix/PRD/bin/oninit]
(21) 0x4000000000555b14 fmread + 0x63c [/informix/PRD/bin/oninit]
(22) 0x4000000000151984 sqisread + 0x2c [/informix/PRD/bin/oninit]
(23) 0x400000000015aa64 readidx + 0x3e4 [/informix/PRD/bin/oninit]
(24) 0x4000000000159ea8 gettupl + 0x2d8 [/informix/PRD/bin/oninit]
(25) 0x4000000000157ccc scan_next + 0x204 [/informix/PRD/bin/oninit]
(26) 0x40000000002a742c inner_next + 0x1ac [/informix/PRD/bin/oninit]
(27) 0x40000000002a6c9c join_next + 0xf4 [/informix/PRD/bin/oninit]
(28) 0x40000000002a6dc8 join_next + 0x220 [/informix/PRD/bin/oninit]
(29) 0x400000000015d32c filltemp + 0x3ac [/informix/PRD/bin/oninit]
(30) 0x40000000001578c0 scan_open + 0x2e8 [/informix/PRD/bin/oninit]
(31) 0x40000000002a82bc group_open + 0x434 [/informix/PRD/bin/oninit]
(32) 0x400000000016bfe8 sort_open + 0x90 [/informix/PRD/bin/oninit]
(33) 0x400000000015eff8 prepselect + 0x5e0 [/informix/PRD/bin/oninit]
(34) 0x400000000020ede8 open_cursor + 0x468 [/informix/PRD/bin/oninit]
(35) 0x400000000020e8b8 sq_open + 0x58 [/informix/PRD/bin/oninit]
(36) 0x4000000000221978 sqmain + 0x100 [/informix/PRD/bin/oninit]
(37) 0x40000000004c8a88 startup + 0xd8 [/informix/PRD/bin/oninit]
(38) 0x40000000004e11fc resume + 0x10c [/informix/PRD/bin/oninit]

....
===========------------- - - - - - -
/informix/PRD/bin/onstat -g ses 8013:

Informix Dynamic Server Version 7.31.FC7XS -- On-Line -- Up 2 days 19:51:06 -- 5041920 Kbytes

session                                      #RSAM    total      used
id       user     tty      pid      hostname threads  memory     memory
8013     informix -        17925    sappd3   1        450560     442504

tid      name     rstcb            flags    curstk   status
9538     sqlexec  c0000000d5697cc0 ---PR--  254960   c0000000d5697cc0running

Memory pools    count 1
name         class addr             totalsize  freesize   #allocfrag #freefrag
Changing data structure forced command termination.

....
===========------------- - - - - - -
/informix/PRD/bin/onstat -g sql 8013:

Informix Dynamic Server Version 7.31.FC7XS -- On-Line -- Up 2 days 19:51:06 -- 5041920 Kbytes

Sess  SQL            Current            Iso Lock       SQL  ISAM F.E.
Id    Stmt type      Database           Lvl Mode       ERR  ERR  Vers
8013  SELECT         sysmaster          DR  Not Wait   0    0    7.31

Current statement name : unlcur

Current SQL statement :
select sqx_sessionid, max(substr(sqx_selflag,4)), max(sqx_estcost),
max(sqx_estrows) from syssqexplain where sqx_sessionid in
(53,54,92,93,96,161,235,257,258,264,280,290,291,295,307,309,334,341,346,383,
390,394,459,483,502,511,515,575,584,587,624,639,642,694,695,707,743,750,777,
805,812,997,1133,1308,1410,3365,3441,4058,4114,4437,4456,4503,4530,4539,
4563,4705,4830,5110,6071,6071,6071,7712) and sqx_iscurrent="Y"
and sqx_ismain="Y" group by 1 order by 1

Last parsed SQL statement :
select sqx_sessionid, max(substr(sqx_selflag,4)), max(sqx_estcost),
max(sqx_estrows) from syssqexplain where sqx_sessionid in
(53,54,92,93,96,161,235,257,258,264,280,290,291,295,307,309,334,341,346,383,
390,394,459,483,502,511,515,575,584,587,624,639,642,694,695,707,743,750,777,
805,812,997,1133,1308,1410,3365,3441,4058,4114,4437,4456,4503,4530,4539,
4563,4705,4830,5110,6071,6071,6071,7712) and sqx_iscurrent="Y"
and sqx_ismain="Y" group by 1 order by 1

>> Question 3: Could there have been a stack overflow that wasn't caught?


'onstat -c' had STACKSIZE = 256 (262144)
'onstat -g ses' had curstk = 254960
'onstat -g stk' had len = 270336
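To put the three numbers side by side, here is a small arithmetic sketch. It assumes STACKSIZE is in KB (as the 262144 in parentheses suggests) and that curstk and len are both byte counts; those unit assumptions are mine, and the result only shows how much margin was left, not whether an overflow actually happened.

```python
KB = 1024

stacksize_kb = 256   # STACKSIZE from 'onstat -c'
curstk = 254960      # curstk from 'onstat -g ses' (assumed to be bytes)
stack_len = 270336   # len from 'onstat -g stk' (assumed to be bytes)

configured = stacksize_kb * KB           # configured stack size in bytes
headroom = configured - curstk           # bytes left under the configured size
used_pct = 100.0 * curstk / configured   # share of the configured stack in use
extra = stack_len - configured           # bytes allocated beyond STACKSIZE

print(f"configured stack : {configured} bytes")
print(f"headroom         : {headroom} bytes")
print(f"used             : {used_pct:.1f}%")
print(f"len - configured : {extra} bytes ({extra // KB} KB)")
```

Under those assumptions the thread had 7184 bytes of headroom (about 97.3% of the configured stack in use), and the allocated length exceeds STACKSIZE by 8192 bytes.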

>> Question 4: The query above is used by a monitoring program, and is run
frequently. Is there something wrong with the syntax?
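One thing that can be checked from the statement text alone is that session id 6071 appears three times in the IN list. That is redundant rather than a syntax error, and nothing here ties it to the assert failure. A minimal sketch of how a monitoring program might de-duplicate the ids before building the list (the variable names are hypothetical; the ids are copied from the statement above with the stray spaces removed):

```python
from collections import Counter

# Session ids from the IN (...) list in the posted statement.
session_ids = [
    53, 54, 92, 93, 96, 161, 235, 257, 258, 264, 280, 290, 291, 295, 307,
    309, 334, 341, 346, 383, 390, 394, 459, 483, 502, 511, 515, 575, 584,
    587, 624, 639, 642, 694, 695, 707, 743, 750, 777, 805, 812, 997, 1133,
    1308, 1410, 3365, 3441, 4058, 4114, 4437, 4456, 4503, 4530, 4539, 4563,
    4705, 4830, 5110, 6071, 6071, 6071, 7712,
]

counts = Counter(session_ids)
duplicates = {sid: n for sid, n in counts.items() if n > 1}
unique_ids = sorted(counts)

# Build the IN list from the unique ids only.
in_list = ",".join(str(sid) for sid in unique_ids)
print("duplicates:", duplicates)
print(f"{len(session_ids)} ids in the post, {len(unique_ids)} unique")
print(f"sqx_sessionid in ({in_list})")
```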

Comments are appreciated,
Tim


sending to informix-list