Thread: free block header corruption (Informix forums, Database Server Software category)
I received the following message in the online.log:

18:47:48 Assert Failed: Memory free block header corruption detected in mt_shm_malloc_segid 1
18:47:48 Informix Dynamic Server Version 7.31.FC7XS
18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696)
         Thread(9538, sqlexec, c0000000d5697cc0, 3)
         File: mtshpool.c Line: 2649
18:47:48 Results: Unable to repair pool
18:47:48 Action: Please notify Informix Technical Support.
18:47:48 stack trace for pid 8561 written to /local/db_tmp/af.292aa913
18:48:38 See Also: /local/db_tmp/af.292aa913, shmem.292aa913.0

After the error, many user connections were blocked, though new connections were still allowed. 'onstat -g ath' showed the blocked sqlexec threads with "mutex wait nsfock". I tried to terminate the blocked sessions, but every attempt failed. I then tried to shut down the engine with 'onmode -yuk', but it just sat there. I finally had to kill the oninit process and clean up the shared memory.

>> Question 1: What is nsfock?

>> Question 2: Is there any way to safely terminate a thread that is blocked on nsfock?

Here's some information from the dump file:

18:47:48 Found during mt_shm_malloc_segid 1
18:47:48 Pool '8013' (0xc00000019478c028)
18:47:48 Bad free block 0xc000000194798f90
....
18:47:48 Found during recover_pool_bad_free_block 2
18:47:48 Pool '8013' (0xc00000019478c028)
18:47:48 Bad free block 0xc0000001947a1d98
....
18:47:48 Multiple block errors found
18:47:48 Informix Dynamic Server Version 7.31.FC7XS Software Serial Number ACN#xxxxxxxx
18:47:48 Assert Failed: Memory free block header corruption detected in mt_shm_malloc_segid 1
18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696)
         Thread(9538, sqlexec, c0000000d5697cc0, 3)
         File: mtshpool.c Line: 2649
18:47:48 Results: Unable to repair pool
18:47:48 Action: Please notify Informix Technical Support.
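The "Who:" line of the assertion message carries the identifiers needed to look up the session and thread with onstat. As a rough sketch (not an Informix tool), the Python below pulls those fields out of such a line. The field names are my own labels, inferred by matching the values against the 'onstat -g ses' output quoted later in this post (pid 17925, host sappd3, rstcb c0000000d5697cc0) and the "vp: 3" in the stack header; the meaning of the fourth Session value is not stated anywhere in the post, so it is kept under the placeholder name session_extra.

```python
import re

# Pattern for the "Who:" portion of an assertion-failure message.
# Group names are illustrative labels, not official Informix terminology.
WHO_RE = re.compile(
    r"Session\((?P<session_id>\d+),\s*(?P<user>[^@]+)@(?P<host>[^,]+),\s*"
    r"(?P<pid>\d+),\s*(?P<session_extra>\d+)\)\s*"
    r"Thread\((?P<thread_id>\d+),\s*(?P<thread_name>\w+),\s*"
    r"(?P<rstcb>[0-9a-fA-F]+),\s*(?P<vp>\d+)\)"
)


def parse_who(text):
    """Return the Session/Thread fields as a dict, or None if not found."""
    match = WHO_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the plainly numeric identifiers; leave addresses as hex strings.
    for key in ("session_id", "pid", "thread_id", "vp"):
        fields[key] = int(fields[key])
    return fields


# The line exactly as it appears in the online.log excerpt.
line = (
    "18:47:48 Who: Session(8013, informix@sappd3, 17925, 1541177696) "
    "Thread(9538, sqlexec, c0000000d5697cc0, 3)"
)
who = parse_who(line)
print(who)
```

The session_id and thread_id values are the ones passed to commands such as 'onstat -g ses 8013' when collecting diagnostics.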
18:47:48 Stack for thread: 9538 sqlexec
 base: 0xc00000019492e000
  len: 270336
   pc: 0x0000000000000000
  tos: 0xc000000194932180
state: running
   vp: 3

( 0) 0x40000000004eca3c legacy_hp_afstack + 0x24c [/informix/PRD/bin/oninit]
( 1) 0x40000000004ebf48 afstack + 0x68 [/informix/PRD/bin/oninit]
( 2) 0x40000000004eb32c afhandler + 0x644 [/informix/PRD/bin/oninit]
( 3) 0x40000000004eac04 affail_interface + 0x54 [/informix/PRD/bin/oninit]
( 4) 0x40000000004df5d0 recover_pool_bad_free_block + 0x220 [/informix/PRD/bin/oninit]
( 5) 0x40000000004dbcfc mt_shm_malloc_segid + 0x214 [/informix/PRD/bin/oninit]
( 6) 0x40000000004db940 mt_shm_malloc + 0x80 [/informix/PRD/bin/oninit]
( 7) 0x40000000004dd328 mt_shm_realloc + 0x208 [/informix/PRD/bin/oninit]
( 8) 0x40000000004a8638 deccvt + 0x3b0 [/informix/PRD/bin/oninit]
( 9) 0x40000000004a6ef4 dectoasc + 0x94 [/informix/PRD/bin/oninit]
(10) 0x4000000000491100 rvalstr + 0x498 [/informix/PRD/bin/oninit]
(11) 0x40000000001c85ec smi_printconst + 0xfc [/informix/PRD/bin/oninit]
(12) 0x40000000001c7f3c smi_printexpr + 0xe34 [/informix/PRD/bin/oninit]
(13) 0x40000000001c7ea8 smi_printexpr + 0xda0 [/informix/PRD/bin/oninit]
(14) 0x40000000001c6b7c smi_printfilter + 0xe4 [/informix/PRD/bin/oninit]
(15) 0x40000000001c4d58 smi_opexplain + 0x3a8 [/informix/PRD/bin/oninit]
(16) 0x40000000001c984c smi_storeexplainflags + 0x84 [/informix/PRD/bin/oninit]
(17) 0x40000000001c3eb4 prconblock + 0x364 [/informix/PRD/bin/oninit]
(18) 0x40000000003b761c pstread + 0x8ac [/informix/PRD/bin/oninit]
(19) 0x4000000000305a44 pst_rsread + 0x56c [/informix/PRD/bin/oninit]
(20) 0x40000000003066dc rsread + 0xabc [/informix/PRD/bin/oninit]
(21) 0x4000000000555b14 fmread + 0x63c [/informix/PRD/bin/oninit]
(22) 0x4000000000151984 sqisread + 0x2c [/informix/PRD/bin/oninit]
(23) 0x400000000015aa64 readidx + 0x3e4 [/informix/PRD/bin/oninit]
(24) 0x4000000000159ea8 gettupl + 0x2d8 [/informix/PRD/bin/oninit]
(25) 0x4000000000157ccc scan_next + 0x204 [/informix/PRD/bin/oninit]
(26) 0x40000000002a742c inner_next + 0x1ac [/informix/PRD/bin/oninit]
(27) 0x40000000002a6c9c join_next + 0xf4 [/informix/PRD/bin/oninit]
(28) 0x40000000002a6dc8 join_next + 0x220 [/informix/PRD/bin/oninit]
(29) 0x400000000015d32c filltemp + 0x3ac [/informix/PRD/bin/oninit]
(30) 0x40000000001578c0 scan_open + 0x2e8 [/informix/PRD/bin/oninit]
(31) 0x40000000002a82bc group_open + 0x434 [/informix/PRD/bin/oninit]
(32) 0x400000000016bfe8 sort_open + 0x90 [/informix/PRD/bin/oninit]
(33) 0x400000000015eff8 prepselect + 0x5e0 [/informix/PRD/bin/oninit]
(34) 0x400000000020ede8 open_cursor + 0x468 [/informix/PRD/bin/oninit]
(35) 0x400000000020e8b8 sq_open + 0x58 [/informix/PRD/bin/oninit]
(36) 0x4000000000221978 sqmain + 0x100 [/informix/PRD/bin/oninit]
(37) 0x40000000004c8a88 startup + 0xd8 [/informix/PRD/bin/oninit]
(38) 0x40000000004e11fc resume + 0x10c [/informix/PRD/bin/oninit]
....
===========------------- - - - - - -
/informix/PRD/bin/onstat -g ses 8013:

Informix Dynamic Server Version 7.31.FC7XS -- On-Line -- Up 2 days 19:51:06 -- 5041920 Kbytes

session                                      #RSAM    total    used
id       user     tty      pid      hostname threads  memory   memory
8013     informix -        17925    sappd3   1        450560   442504

tid      name     rstcb            flags    curstk   status
9538     sqlexec  c0000000d5697cc0 ---PR--  254960   c0000000d5697cc0running

Memory pools    count 1
name     class    addr     totalsize freesize #allocfrag #freefrag
Changing data structure forced command termination.
....
===========------------- - - - - - -
/informix/PRD/bin/onstat -g sql 8013:

Informix Dynamic Server Version 7.31.FC7XS -- On-Line -- Up 2 days 19:51:06 -- 5041920 Kbytes

Sess  SQL        Current    Iso  Lock      SQL  ISAM  F.E.
Id    Stmt type  Database   Lvl  Mode      ERR  ERR   Vers
8013  SELECT     sysmaster  DR   Not Wait  0    0     7.31

Current statement name : unlcur

Current SQL statement :
  select sqx_sessionid, max(substr(sqx_selflag,4)), max(sqx_estcost), max(sqx_estrows)
  from syssqexplain
  where sqx_sessionid in (53,54,92,93,96,161,235,257,258,264,280,290,291,295,307,309,334,341,346,383,390,394,459,483,502,511,515,575,584,587,624,639,642,694,695,707,743,750,777,805,812,997,1133,1308,1410,3365,3441,4058,4114,4437,4456,4503,4530,4539,4563,4705,4830,5110,6071,6071,6071,7712)
  and sqx_iscurrent="Y" and sqx_ismain="Y"
  group by 1 order by 1

Last parsed SQL statement :
  select sqx_sessionid, max(substr(sqx_selflag,4)), max(sqx_estcost), max(sqx_estrows)
  from syssqexplain
  where sqx_sessionid in (53,54,92,93,96,161,235,257,258,264,280,290,291,295,307,309,334,341,346,383,390,394,459,483,502,511,515,575,584,587,624,639,642,694,695,707,743,750,777,805,812,997,1133,1308,1410,3365,3441,4058,4114,4437,4456,4503,4530,4539,4563,4705,4830,5110,6071,6071,6071,7712)
  and sqx_iscurrent="Y" and sqx_ismain="Y"
  group by 1 order by 1

>> Question 3: Could there have been a stack overflow that wasn't caught?
   'onstat -c' had STACKSIZE = 256 (262144)
   'onstat -g ses' had curstk = 254960
   'onstat -g stk' had len = 270336

>> Question 4: The query above is used by a monitoring program, and is run frequently. Is there something wrong with the syntax?

Comments are appreciated,
Tim

sending to informix-list
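For Question 3, the three stack figures quoted in the post can be compared directly. The sketch below is plain arithmetic on those numbers and nothing more: it assumes STACKSIZE is expressed in kilobytes (consistent with the "256 (262144)" shown by 'onstat -c') and that curstk and len are byte counts. How the server actually measures curstk, and what the extra space in len is used for, is not stated in the post, so the result can only indicate how close the thread was to its configured limit, not whether an overflow occurred.

```python
# Figures copied from the post; the variable names are illustrative.
STACKSIZE_KB = 256      # 'onstat -c' STACKSIZE
CURSTK_BYTES = 254960   # 'onstat -g ses' curstk for thread 9538
STACK_LEN_BYTES = 270336  # 'onstat -g stk' len for thread 9538


def stack_headroom(stacksize_kb, curstk, stack_len):
    """Compare reported stack usage against the configured and allocated sizes."""
    configured = stacksize_kb * 1024
    return {
        "configured_bytes": configured,
        # Bytes left before curstk reaches the configured STACKSIZE.
        "headroom_bytes": configured - curstk,
        # Share of the configured stack that curstk represents.
        "used_fraction": curstk / configured,
        # How much larger the reported segment length is than STACKSIZE.
        "len_minus_configured": stack_len - configured,
    }


report = stack_headroom(STACKSIZE_KB, CURSTK_BYTES, STACK_LEN_BYTES)
for name, value in report.items():
    print(f"{name}: {value}")
```

With the posted values, curstk is 7184 bytes short of the configured 262144, and the reported len is 8192 bytes larger than STACKSIZE. A thread running that close to its limit in a deep call chain (39 frames in the trace) seems worth raising with Technical Support alongside the af file.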