Unix Technical Forum

ER problem---CDRACK cause rootserver crash

Informix forum, Database Server Software category.


#1
04-20-2008, 10:27 AM
cristizaharioiu

Hello,

I have a problem with the ER root server: it crashes often, about
1-2 times per day. The system is UnixWare 7 with IDS 9.20.
The error is:

16:20:23 Informix Dynamic Server 2000 Version 9.20.UC3
16:20:23 Who: Session(92, informix@sco00, 0, 390625404)
Thread(80, CDRACK_1, 17462bd8, 3)
File: mtex.c Line: 408
16:20:23 Results: Exception Caught. Type: MT_EX_OS, Context: mem
16:20:23 Action: Please notify Informix Technical Support.
16:20:23 stack trace for pid 17198 written to /home2/informix/tmp/af.4380c56
16:20:23 See Also: /home2/informix/tmp/af.4380c56
.......
16:20:42 mtex.c, line 408, thread 80, proc id 17198, No Exception Handler.
16:20:42 Fatal error in ADM VP at mt.c:11029
16:20:42 Unexpected virtual processor termination, pid = 17198, exit = 0x100

16:20:43 PANIC: Attempting to bring system down
16:20:43 semctl: errno = 22


The root server keeps crashing, always in thread CDRACK_0 or CDRACK_1.
The problem appeared after I defined new replicates between 2 leaf
servers connected directly to the root server.
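
To confirm which thread is named in each failure, the message log can be tallied by thread name. The following is only a minimal sketch: it assumes the log entries have the same shape as the excerpt above, and the regular expressions and the `summarize` helper are illustrative, not part of any Informix tool.

```python
import re
from collections import Counter

# Excerpt in the same shape as the message log quoted in this post.
LOG = """\
16:20:23 Who: Session(92, informix@sco00, 0, 390625404)
Thread(80, CDRACK_1, 17462bd8, 3)
File: mtex.c Line: 408
16:20:23 Results: Exception Caught. Type: MT_EX_OS, Context: mem
16:20:23 stack trace for pid 17198 written to
/home2/informix/tmp/af.4380c56
"""

# Thread(<id>, <name>, ...) as it appears in the failure entry.
THREAD_RE = re.compile(r"Thread\((\d+),\s*(\w+),")
# Path of the af.* file named after "written to" (may wrap to the next line).
AF_RE = re.compile(r"written to\s+(\S+)")


def summarize(log_text):
    """Return (failure count per thread name, list of af file paths)."""
    threads = Counter(name for _tid, name in THREAD_RE.findall(log_text))
    af_files = AF_RE.findall(log_text)
    return threads, af_files


threads, af_files = summarize(LOG)
print(threads)
print(af_files)
```

Run over the whole log, the counter shows whether the failures really are limited to CDRACK_0 and CDRACK_1, and the list gives the af files to look at for each crash.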

The architecture is:
sco00: root server
sco01, sco40, sco42, sco43, sco44, sco45, sco46: leaf servers
connected directly to sco00.

The new replicates are primary-target: data is replicated from
sco40-sco46 to sco01. These replicates carry a lot of data, 30-50
rows (transactions) per minute. I think this volume of data causes the
crash, because I did not have this problem before I defined them.

Here are parts of the onconfig file and of the generated af file:
--onconfig
CDR_LOGBUFFERS 16384
CDR_EVALTHREADS 1,2 # evaluator threads (per-cpu-vp,additional)
CDR_DSLOCKWAIT 300 # DS lockwait timeout (seconds)
#CDR_QUEUEMEM 4096 # Maximum amount of memory for any CDR queue (Kbytes)
CDR_QUEUEMEM 16384
CDR_LOGDELTA 30 # % of log space allowed in queue memory
CDR_NUMCONNECT 100 # Expected connections per server
CDR_NIFRETRY 300 # Connection retry (seconds)
CDR_NIFCOMPRESS 5 # Link level compression (-1 never, 0 none, 9 max)
--onstat -g ath

75 1816b1f0 174614d8 2 sleeping secs: 52 3cpu CDRNsT117
77 1807fcf8 17461a98 2 cond wait CDRBlbslp 3cpu CDRBLOB_0
78 182a2178 17462058 2 cond wait CDRBlbslp 3cpu CDRBLOB_1
79 182af1a0 17462618 2 cond wait CDRAckslp 1cpu CDRACK_0
*80 182bc1a0 17462bd8 2 running 3cpu CDRACK_1
81 182c91f0 17463198 2 cond wait CDRDssleep 1cpu CDRD_0
82 182d7178 17463758 2 cond wait CDRDssleep 1cpu CDRD_1
83 182e4178 17463d18 2 cond wait netnorm 1cpu CDRNr46

onstat -g stk 80 light:

Informix Dynamic Server 2000 Version 9.20.UC3 -- On-Line -- Up 1 days 05:11:08 -- 433028 Kbytes

Stack for thread: 80 CDRACK_1
base: 0x182c0018
len: 36864
pc: 0x0856f0eb
tos: 0x182c7dd8
state: running
vp: 3

0x08863d98 (*nosymtab*)0x8863d98



What can I do to avoid this problem? Can I tune any parameters in the
onconfig file?

I am also thinking of defining sco01 as a nonroot server, with
sco40..sco46 as leaf servers connected to sco01. Is that a good idea?
My expectation is that with this architecture, replication from
sco40-46 to sco01 would no longer pass through sco00: the data would be
replicated directly and sco00 would not be involved. Is that correct?

Thank you in advance...
Cristian

#2
04-20-2008, 10:27 AM
Madison Pruet


cristizaharioiu wrote:
> Hello,
>
> I have any problems with ER rootserver; the server often
> crashes..about 1-2 times per day.the system is Unixware 7 and IDS 9.20.
> The error is :
>
> 16:20:23 Informix Dynamic Server 2000 Version 9.20.UC3
> 16:20:23 Who: Session(92, informix@sco00, 0, 390625404)
> Thread(80, CDRACK_1, 17462bd8, 3)
> File: mtex.c Line: 408
> 16:20:23 Results: Exception Caught. Type: MT_EX_OS, Context: mem
> 16:20:23 Action: Please notify Informix Technical Support.
> 16:20:23 stack trace for pid 17198 written to
> /home2/informix/tmp/af.4380c56
> 16:20:23 See Also: /home2/informix/tmp/af.4380c56
> ......
> 16:20:42 mtex.c, line 408, thread 80, proc id 17198, No Exception
> Handler.
> 16:20:42 Fatal error in ADM VP at mt.c:11029
> 16:20:42 Unexpected virtual processor termination, pid = 17198, exit =
> 0x100
>
> 16:20:43 PANIC: Attempting to bring system down
> 16:20:43 semctl: errno = 22
>
>
> Constantly the rootserver crashes by reason of CDRACK_0 or CDRACK_1.
> The problem appears after I define new replicates between 2 leaf
> servers connected directly to rootserver.
>
> The architecture is here: sco00-rootserver
> sco01,sco40,sco42,sco43,sco44,sco45,sco46
> leaf servers connected directly to sco00.
>
> I define new replicates primary target- data are replicated from
> sco40-sco46 to sco01; this replicates replicate a lot of data...30-50
> row(transaction)/ min. i think that this volume of data cause the crash
> because before i define this new replicates I haven't this problem.


I don't think this is the case. We have customers replicating in the
thousands of transactions a second.
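
For scale, the two rates can be put side by side. This is a rough back-of-the-envelope sketch: the 50 rows per minute is the upper figure from the original post, and 1000 per second is an assumed lower bound for "thousands of transactions a second".

```python
# Upper figure reported in the original post.
reported_per_min = 50
# Assumption: the lowest value that could still be called "thousands" per second.
reference_per_sec = 1000

reported_per_sec = reported_per_min / 60
ratio = reference_per_sec / reported_per_sec

print(f"{reported_per_sec:.2f} rows/s reported; "
      f"reference load is about {ratio:.0f}x higher")
```

On these assumptions the reported load is under one row per second, roughly three orders of magnitude below what other installations replicate.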

>
> Here is part of onconfig and af. file genereted:
> --onconfig
> CDR_LOGBUFFERS 16384
> CDR_EVALTHREADS 1,2 # evaluator threads
> (per-cpu-vp,additional)
> CDR_DSLOCKWAIT 300 # DS lockwait timeout (seconds)
> #CDR_QUEUEMEM 4096 # Maximum amount of memory for any CDR
> queue (Kbytes)
> CDR_QUEUEMEM 16384
> CDR_LOGDELTA 30 # % of log space allowed in queue
> memory
> CDR_NUMCONNECT 100 # Expected connections per server
> CDR_NIFRETRY 300 # Connection retry (seconds)
> CDR_NIFCOMPRESS 5 # Link level compression (-1 never, 0
> none, 9 max)
> --onstat -g ath
>
> 75 1816b1f0 174614d8 2 sleeping secs: 52 3cpu
> CDRNsT117
> 77 1807fcf8 17461a98 2 cond wait CDRBlbslp 3cpu
> CDRBLOB_0
> 78 182a2178 17462058 2 cond wait CDRBlbslp 3cpu
> CDRBLOB_1
> 79 182af1a0 17462618 2 cond wait CDRAckslp 1cpu
> CDRACK_0
> *80 182bc1a0 17462bd8 2 running 3cpu
> CDRACK_1
> 81 182c91f0 17463198 2 cond wait CDRDssleep 1cpu
> CDRD_0
> 82 182d7178 17463758 2 cond wait CDRDssleep 1cpu
> CDRD_1
> 83 182e4178 17463d18 2 cond wait netnorm 1cpu
> CDRNr46
>
> onstat -g stk 80 light:
>
> Informix Dynamic Server 2000 Version 9.20.UC3 -- On-Line -- Up 1 days
> 05:11:08 -- 433028 Kbytes
>
> Stack for thread: 80 CDRACK_1
> base: 0x182c0018
> len: 36864
> pc: 0x0856f0eb
> tos: 0x182c7dd8
> state: running
> vp: 3
>
> 0x08863d98 (*nosymtab*)0x8863d98


We are going to have to get the stack somehow. It might be worth
setting AFDEBUG so that the server hangs instead of just crashing.
That would make it possible to attach a debugger to the server while
it is in the process of crashing and get a stack.


>
>
>
> What can i do to avoid this problem ? Can I tuning any parameters on
> onconfig file?


I would try turning off compression. I don't know if that would help, but
it's worth a try.
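
Before and after the change, it may help to check which value is actually active in the onconfig file, since commented-out lines are easy to misread. A minimal sketch follows; the `active_settings` and `compression_enabled` helpers are illustrative names, and the reading of the values relies only on the comment in the posted onconfig (-1 never, 0 none, 9 max).

```python
# Lines copied from the onconfig excerpt quoted in this thread.
ONCONFIG = """\
#CDR_QUEUEMEM 4096 # Maximum amount of memory for any CDR queue (Kbytes)
CDR_QUEUEMEM 16384
CDR_NIFCOMPRESS 5 # Link level compression (-1 never, 0 none, 9 max)
"""


def active_settings(text):
    """Map parameter name -> value for lines that are not commented out."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whole-line comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def compression_enabled(settings):
    """True if CDR_NIFCOMPRESS is above 0, False if 0 or -1, None if not set."""
    value = settings.get("CDR_NIFCOMPRESS")
    if value is None:
        return None  # no assumption is made here about the server default
    return int(value) > 0


settings = active_settings(ONCONFIG)
print(settings["CDR_QUEUEMEM"], compression_enabled(settings))
```

With the posted configuration this reports compression as enabled (level 5); after the change it should report it as disabled.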

>
> Also I think to define server sco01 as nonroot server and sco40..sco46
> leaf servers connected to sco01..is it a good idea ? My expectation is
> this architecture avoid replication from sco40-46 to sco01 through
> sco00 so the data will be replicated directly and sco00 won't be
> implicated...is it correct ?


sco00 will forward the transactions to sco01. Sco01 may not participate
in the replicated tables, but it will participate in the network flow.
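
The forwarding can be pictured with a small model of the current layout. This is only a sketch, assuming ER traffic follows the defined server connections; the `path` helper is a plain breadth-first search and has nothing to do with the actual ER routing code.

```python
from collections import deque


def path(links, src, dst):
    """Breadth-first search over undirected connections; returns the hop list."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue, seen = deque([[src]]), {src}
    while queue:
        hops = queue.popleft()
        if hops[-1] == dst:
            return hops
        for nxt in sorted(graph.get(hops[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(hops + [nxt])
    return None  # no route between the two servers


# Current layout from the original post: every leaf connects directly to sco00.
leaves = ["sco01", "sco40", "sco42", "sco43", "sco44", "sco45", "sco46"]
current = [("sco00", leaf) for leaf in leaves]

print(path(current, "sco40", "sco01"))
```

In this model every leaf-to-leaf route has sco00 as its middle hop, even though sco00 holds none of the replicated tables.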

>
> Thank you in advance...
> Cristian
>



#3
04-20-2008, 10:28 AM
cristizaharioiu

Thank you Madison,

As a first step I will try turning off compression. Do I need to turn
it off on sco00 only, or on both sco00 and sco01?

#4
04-20-2008, 10:28 AM
mpruet@comcast.net


cristizaharioiu wrote:
> Thank you Madison,
>
> At the first step I would try to turn off compression ...it's necessary
> to turn off compression on sco00 or both sco00 and sco01 ?


Compression is negotiated. That means that you only have to set it on
one of the nodes.

#6
04-20-2008, 10:28 AM
caver

Cristian

In the long term, I would suggest upgrading to at least version 9.4 - I
had lots of ER crashes at 9.21, but after I was finally able to get to
9.4, ER has been much more robust. Warning - for my configuration it
took a lot of work to upgrade my root server to 9.4 - but it was worth
it.

Daniel
