Hello all!

I am trying to define a new replicate using cdr define repl, and I get the following error in online.log:

11:14:06 CDR Queuer: rqmQueueTxn (1000/0/0x2886/0x0) failed on CNTRL
11:14:06 CDR GC: operation sparse control message queue failed (error 0).

The command is:

cdr define repl -C ignore --scope=transaction --immed --ats --ris --floatcanon rnom_g_cursvalext \
"P bd_sco00@g_sco00:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
"R bd_qbank_ghiseu@g_sco40:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
"R bd_qbank_ghiseu@g_sco42:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
"R bd_qbank_ghiseu@g_sco43:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
............
.............
"R bd_qbank_ghiseu@g_sco122:informix.curs_valutar_extins" "select * from curs_valutar_extins"

When I defined the replicate, the shell (/bin/bash) returned no error; the error appears only in online.log.

We have one root server, 22 nonroot servers, 60 leaf servers and about 700 replicates already defined; everything else works OK.
We are running Unixware 7.1 and IDS 9.2.
| |||
Since this is an error while creating a sparse control message, I suspect that the problem is being encountered while trying to forward the control message to one of the leaf servers. Are you defining the replicate on the root node? Is the error occurring on the root node or on one of the non-root servers?

We may have to isolate the problem just a bit. One way would be to define the replicate with a minimal number of nodes, and then use the cdr change replicate command to add participants. That way we could narrow the problem down a bit.

Since this is 9.2x, you aren't using the smartblob to provide disk storage for the queues. The control queue has its own queue table and ALL control messages must be placed into stable storage. Therefore, it is possible that the control queue can't write the control message to disk because of some basic DB issue, possibly that the disk storage space is exhausted. Don't know for sure, just guessing.

If that is the case, then you might not see the problem by using 'cdr change replicate' because you would be generating smaller control messages, and they might fit into the disk space.

You do realize that 9.2 is a fairly old release... (enough said about that...) - ;-)

Could you please send me some information about your site? I'm guessing that you're using a three level replication topology. I try to keep track of the more interesting customer sites ---- thanks...

M.P.

"cristizaharioiu" <cristizaharioiu@gmail.com> wrote in message news:1105349651.722628.220830@c13g2000cwb.googlegroups.com...
> Hello all!
>
> I am trying to define a new replicate using cdr define repl, and I get the
> following error in online.log:
>
> 11:14:06 CDR Queuer: rqmQueueTxn (1000/0/0x2886/0x0) failed on CNTRL
> 11:14:06 CDR GC: operation sparse control message queue failed (error 0).
>
> The command is:
>
> cdr define repl -C ignore --scope=transaction --immed --ats --ris --floatcanon rnom_g_cursvalext \
> "P bd_sco00@g_sco00:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> "R bd_qbank_ghiseu@g_sco40:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> "R bd_qbank_ghiseu@g_sco42:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> "R bd_qbank_ghiseu@g_sco43:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> ...........
> ............
> "R bd_qbank_ghiseu@g_sco122:informix.curs_valutar_extins" "select * from curs_valutar_extins"
>
> When I defined the replicate, the shell (/bin/bash) returned no error; the
> error appears only in online.log.
>
> We have one root server, 22 nonroot servers, 60 leaf servers and about
> 700 replicates already defined; everything else works OK.
> We are running Unixware 7.1 and IDS 9.2.
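A minimal sketch of the isolation approach suggested in this reply: define the replicate with the primary and a single target, then add the remaining participants one at a time, so that the first node that triggers the error in online.log stands out. The script is a dry run and only prints the commands. The replicate, database and server group names come from the original post; the helper names (target, define_cmd, add_cmd) are hypothetical, and the --add option of cdr change replicate is written from memory of the Enterprise Replication command line, so it should be checked against the 9.2 documentation before use.

```shell
#!/bin/bash
# Dry run: print the cdr commands instead of executing them.
# Replicate, table and group names are taken from the original post.
REPL=rnom_g_cursvalext
TABLE=curs_valutar_extins
QUERY="select * from $TABLE"
PRIMARY="P bd_sco00@g_sco00:informix.$TABLE"

# Print the participant string for one target (R) server group.
target() {       # $1 = server group, e.g. g_sco40
    printf 'R bd_qbank_ghiseu@%s:informix.%s' "$1" "$TABLE"
}

# Print a define command with the primary and a single target only.
define_cmd() {   # $1 = first target group
    printf 'cdr define repl -C ignore --scope=transaction --immed --ats --ris --floatcanon %s "%s" "%s" "%s" "%s"\n' \
        "$REPL" "$PRIMARY" "$QUERY" "$(target "$1")" "$QUERY"
}

# Print the command that adds one more participant.
# Assumption: --add takes the replicate name, then participant and modifier.
add_cmd() {      # $1 = target group to add
    printf 'cdr change repl --add %s "%s" "%s"\n' \
        "$REPL" "$(target "$1")" "$QUERY"
}

define_cmd g_sco40
for group in g_sco42 g_sco43; do
    # When running the real command, check online.log after each step.
    add_cmd "$group"
done
```

Adding one participant per command keeps each control message small and ties any new rqmQueueTxn failure to a specific server group.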
| |||
Thank you Madison,

Indeed, we are using a three-level replication topology. We have defined one root server; it is connected to 15 nonroot servers and 7 leaf servers, and each nonroot server is connected to 5 or 6 leaf servers. In total we have one root server, 15 nonroot servers and 68 leaf servers.

Yesterday I defined the replicate on the root server with all nodes and I found the following errors in online.log:

16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a50/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a51/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
........
........
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).

After that I found that the time daemon (ntpd) was not running and that about 10 nonroot servers were not synchronized.

I tried to define the replicate with one nonroot server that was not synchronized; the same error appears in online.log:

CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
CDR GC: operation sparse control message queue failed (error 0).

When I defined the replicate with one nonroot server that IS synchronized with the root server, NO error appears in online.log, but replication does not work.

In both cases (nonroot server synchronized with the root server or not) replication does not work: on the root server the replicate is defined, but on the nonroot server the replicate does not appear...

I started ntpd and synchronized the servers; I restarted the root server; after that I tried to define the replicate again, but it still does not work...

I am able to connect from the root server to the nonroot and leaf servers using dbaccess; there is no problem with disk storage space (hdd, rootdbs, sendqdbs, recvqdbs) on the root server; we have about 700 replicates already defined that work...

Madison Pruet wrote:
> Since this is an error while creating a sparse control message, I suspect
> that the problem is being encountered while trying to forward the control
> message to one of the leaf servers. Are you defining the replicate on the
> root node? Is the error occurring on the root node or on one of the
> non-root servers?
>
> We may have to isolate the problem just a bit. One way would be to define
> the replicate with a minimal number of nodes, and then use the cdr change
> replicate command to add participants. That way we could narrow the problem
> down a bit.
>
> Since this is 9.2x, you aren't using the smartblob to provide disk storage
> for the queues. The control queue has its own queue table and ALL control
> messages must be placed into stable storage. Therefore, it is possible that
> the control queue can't write the control message to disk because of some
> basic DB issue, possibly that the disk storage space is exhausted. Don't
> know for sure, just guessing.
>
> If that is the case, then you might not see the problem by using 'cdr change
> replicate' because you would be generating smaller control messages, and
> they might fit into the disk space.
>
> You do realize that 9.2 is a fairly old release... (enough said about
> that...) - ;-)
>
> Could you please send me some information about your site? I'm guessing
> that you're using a three level replication topology. I try to keep track
> of the more interesting customer sites ---- thanks...
>
> M.P.
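With many failures logged in the same second, it can help to count how often each parenthesised key appears in the rqmQueueTxn lines. The sketch below does that for an online.log-style file. It treats the key as an opaque string and makes no assumption about what its fields mean; the function name count_cntrl_failures is hypothetical, and the sample lines are the ones quoted in this post.

```shell
#!/bin/bash
# Count failed control-queue messages per parenthesised key in an
# online.log-style file. The key is treated as an opaque string.
count_cntrl_failures() {   # $1 = path to the log file
    awk '/rqmQueueTxn .* failed on CNTRL/ {
             if (match($0, /\([^)]*\)/)) {
                 key = substr($0, RSTART + 1, RLENGTH - 2)
                 count[key]++
             }
         }
         END { for (k in count) print count[k], k }' "$1" | LC_ALL=C sort -k2
}

# Demo on the lines quoted in the post, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a50/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a51/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
16:01:19 CDR GC: operation sparse control message queue failed (error 0).
EOF
count_cntrl_failures "$log"
rm -f "$log"
```

For the sample above this prints one line per key, with a count of 1 for the 0x2a50 and 0x2a51 keys and 2 for the 0x2a52 key. On a real system the argument would be the path to the server's online.log.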
| |||
See below...

"cristizaharioiu" <cristizaharioiu@gmail.com> wrote in message news:1105427653.244482.159690@f14g2000cwb.googlegroups.com...
> Thank you Madison,
>
> Indeed, we are using a three-level replication topology. We have defined
> one root server; it is connected to 15 nonroot servers and 7 leaf servers,
> and each nonroot server is connected to 5 or 6 leaf servers. In total we
> have one root server, 15 nonroot servers and 68 leaf servers.
>
> Yesterday I defined the replicate on the root server with all nodes and I
> found the following errors in online.log:
>
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a50/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a51/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> .......
> .......
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).

Were these errors on the root node? Are there any leaf servers attached to the root node?

> After that I found that the time daemon (ntpd) was not running and that
> about 10 nonroot servers were not synchronized.

I doubt that this has anything to do with the problem. The error indicates that there is a problem putting the control message into the queue and that it has something to do with a sparse server. A sparse server means a leaf server.

> I tried to define the replicate with one nonroot server that was not
> synchronized; the same error appears in online.log:
>
> CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> CDR GC: operation sparse control message queue failed (error 0).
>
> When I defined the replicate with one nonroot server that IS synchronized
> with the root server, NO error appears in online.log, but replication does
> not work.

Have you checked in the cdr error table yet? And don't forget that you've got to start the replicate. The time difference can make a difference on 9.20 because of how time is treated. When you start the replicate, it doesn't really start until the machine clocks reach the same time. (This is true on 7.31 and 9.2... With 9.3, now means now, not the clock time of 'now' on the server where the command was issued...)

> In both cases (nonroot server synchronized with the root server or not)
> replication does not work: on the root server the replicate is defined,
> but on the nonroot server the replicate does not appear...
>
> I started ntpd and synchronized the servers; I restarted the root server;
> after that I tried to define the replicate again, but it still does not
> work...
>
> I am able to connect from the root server to the nonroot and leaf servers
> using dbaccess; there is no problem with disk storage space (hdd, rootdbs,
> sendqdbs, recvqdbs) on the root server; we have about 700 replicates
> already defined that work...
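Since this reply notes that on 9.2 a started replicate does not really begin until the machine clocks agree, it may be worth listing the servers whose clocks are still off. The sketch below assumes the clock readings have already been collected as "server epoch-seconds" pairs, with the reference server (for example the root) on the first line. How the readings are gathered is site specific, and the sample values are made up for illustration; only the server group names come from the thread. The function name report_skew and the 5-second limit are hypothetical.

```shell
#!/bin/bash
# Report servers whose clock differs from the reference by more than a limit.
# Input on stdin: one "<server> <epoch-seconds>" pair per line; the first
# line is the reference (for example the root server).
report_skew() {   # $1 = maximum tolerated difference in seconds
    awk -v max="$1" '
        NR == 1 { ref = $2; next }
        {
            diff = $2 - ref
            if (diff < 0) diff = -diff
            if (diff > max) print $1, diff
        }'
}

# Demo with made-up readings; real values would come from each server.
report_skew 5 <<'EOF'
g_sco00 1105430000
g_sco40 1105430002
g_sco42 1105430190
EOF
```

With the sample readings, only g_sco42 is reported, with a difference of 190 seconds; g_sco40 is within the limit and is not listed.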
| |||
Yes, the errors were in online.log on the root server. In syscdr:cdr_error on the root server there are no errors from the last few days, only old ones. On the other servers (nonroot or leaf) no error appears, but the replicate does not work because the replicate definition never "arrived" there... I don't know why! The servers are connected, and the old replicates work.

Indeed, the root server is connected to 7 leaf nodes and to 15 nonroot nodes; the nonroot nodes are in turn connected to 63 leaf servers, which are separate from the leaf servers connected to the root server.

Yesterday I tried to define the same replicate on one of the nonroot servers (the primary P is the root server, R is one leaf server, and the replicate was defined on the nonroot server; then I used cdr change repl to add all the participants). So far so good. I then inserted one row in the table on the root server.

Result: with one leaf server connected directly to the root server, replication works; with the other 6 leaf servers connected directly to the root server, replication does not work. The error in online.log is:

"CDR CDRDS: transaction aborted (Error in retrieving the replicate's attributes)"

Replication works on 14 nonroot servers, but on one nonroot server (and on the leaf servers connected directly to it) it does not work, even though NO ERRORS are found in online.log...

I don't understand why replication works with some servers and not with others. I don't understand why I find errors on some servers where replication does not work, while on other servers where replication does not work I can't find any errors, especially since the replicates already defined work OK.

My fear is that we will have to redefine the whole replication system. I don't like this: problems may appear, and the activity is critical. I work in a bank and I can't interrupt activity for a long period.

Anyway, this year we intend to migrate to IDS 9.4, and I hope some of the problems with the replication system will be resolved.

Thank you Madison. I hope you understand what I am trying to say...
| ||||
See below..

"cristizaharioiu" <cristizaharioiu@gmail.com> wrote in message news:1105517262.842212.62510@z14g2000cwz.googlegroups.com...
> Yes, the errors were in online.log on the root server. In
> syscdr:cdr_error on the root server there are no errors from the last few
> days, only old ones. On the other servers (nonroot or leaf) no error
> appears, but the replicate does not work because the replicate definition
> never "arrived" there... I don't know why! The servers are connected, and
> the old replicates work.

There was probably some issue with the propagation of the replication metadata.

> Indeed, the root server is connected to 7 leaf nodes and to 15 nonroot
> nodes; the nonroot nodes are in turn connected to 63 leaf servers, which
> are separate from the leaf servers connected to the root server.
>
> Yesterday I tried to define the same replicate on one of the nonroot
> servers (the primary P is the root server, R is one leaf server, and the
> replicate was defined on the nonroot server; then I used cdr change repl
> to add all the participants). So far so good. I then inserted one row in
> the table on the root server.
> Result: with one leaf server connected directly to the root server,
> replication works; with the other 6 leaf servers connected directly to the
> root server, replication does not work. The error in online.log is: "CDR
> CDRDS: transaction aborted (Error in retrieving the replicate's
> attributes)"

This means that the replicate definition failed on the target node. The source is sending stuff to the target, but the target doesn't know what to do with it or how to apply it.

> Replication works on 14 nonroot servers, but on one nonroot server (and
> on the leaf servers connected directly to it) it does not work, even
> though NO ERRORS are found in online.log...
>
> I don't understand why replication works with some servers and not with
> others. I don't understand why I find errors on some servers where
> replication does not work, while on other servers where replication does
> not work I can't find any errors, especially since the replicates already
> defined work OK.

I'd need to understand the topology better to answer that one.

> My fear is that we will have to redefine the whole replication system. I
> don't like this: problems may appear, and the activity is critical. I work
> in a bank and I can't interrupt activity for a long period.

I don't think that's necessary. We may have to do a bit of work with the syscdr database, however. Who is your support team? I might be able to give them some suggestions as to how to resolve the problem.

> Anyway, this year we intend to migrate to IDS 9.4, and I hope some of the
> problems with the replication system will be resolved.
> Thank you Madison. I hope you understand what I am trying to say...