See below...
"cristizaharioiu" <cristizaharioiu@gmail.com> wrote in message
news:1105427653.244482.159690@f14g2000cwb.googlegroups.com...
> Thank you Madison,
>
> Indeed, we are using a three-level replication topology. We have defined
> one root server, which is connected to 15 nonroot servers and 7 leaf
> servers; each nonroot server is connected to another 5 or 6 leaf
> servers. In total we have one root server, 15 nonroot servers and
> 68 leaf servers.
>
> Yesterday I defined the replicate on the root server with all nodes and
> found the following errors in online.log:
>
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a50/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a51/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
> .......
> .......
> 16:01:19 CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> 16:01:19 CDR GC: operation sparse control message queue failed (error 0).
Were these errors on the root node? Are there any leaf servers attached to
the root node?
>
>
>
> After that I found that the time daemon (ntpd) was not running and that
> about 10 nonroot servers were not synchronized.
I doubt that this has anything to do with the problem. The error
indicates a problem putting the control message into the queue, and that
it has something to do with a sparse server. A sparse server here means
a leaf server.
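To see which servers are actually logging the failure, something like the
following could be run against each server's online.log. It is only a
sketch: the function name is made up for this example, and it assumes each
log entry sits on a single line in the file, as it does in the real log
(the wrapping above comes from the news client).

```shell
# Summarize "rqmQueueTxn (...) failed on ..." entries in an online.log.
# Prints one line per distinct key (the text inside the parentheses),
# followed by the number of times that key was logged as failed.
# summarize_cntrl_failures is a hypothetical helper, not part of IDS.
summarize_cntrl_failures() {
    # $1: path to the online.log to scan
    sed -n 's/.*rqmQueueTxn (\([^)]*\)) failed on.*/\1/p' "$1" |
        sort | uniq -c | awk '{print $2, $1}'
}
```

Running it as summarize_cntrl_failures /path/to/online.log on the root
node and on a couple of the non-root nodes should show quickly whether
the failures are confined to the root.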
>
> I tried to define the replicate with one nonroot server that is not
> synchronized; the same error appears in online.log:
>
> CDR Queuer: rqmQueueTxn (1000/0/0x2a52/0x0) failed on CNTRL
> CDR GC: operation sparse control message queue failed (error 0).
>
> When I defined the replicate with one nonroot server that IS synchronized
> with the root server, NO error appears in online.log, but replication
> doesn't work.
Have you checked the cdr error table yet? Also, don't forget that you've
got to start the replicate. The clock difference can matter on 9.20
because of how time is treated: when you start the replicate, it doesn't
really start until the machine clocks reach the same time. (This is true
on 7.31 and 9.2. With 9.3, "now" means now, not the clock time of "now"
on the server where the command was issued.)
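A rough sketch of that check sequence, to be run on the root server. The
replicate name comes from your earlier post; the run helper, the DRY_RUN
variable and the function name are made up for this example. By default
it only prints the commands, so nothing is changed until you set
DRY_RUN=0. Please check the exact cdr options against the 9.2x
documentation before relying on it.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
# (run and DRY_RUN are hypothetical helpers for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_and_start_replicate REPLICATE_NAME
check_and_start_replicate() {
    repl=$1
    run cdr list replicate "$repl"   # is it defined, and in what state?
    run cdr error                    # look at the cdr error table
    run cdr start replicate "$repl"  # a defined replicate must still be started
}
```

For example, check_and_start_replicate rnom_g_cursvalext prints the three
commands in order; with DRY_RUN=0 it would execute them.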
>
> In both cases (nonroot server synchronized with the root server or
> not) replication doesn't work: on the root server the replicate is
> defined, but on the nonroot server the replicate doesn't appear.
>
> I started ntpd and synchronized the servers, then restarted the root
> server; after that I tried to define the replicate again, but it still
> doesn't work.
>
> I am able to connect from the root server to the nonroot and leaf
> servers using dbaccess; there is no problem with disk storage space
> (hdd, rootdbs, sendqdbs, recvqdbs) on the root server; we have about 700
> replicates already defined that work.
>
> Madison Pruet wrote:
> > Since this is an error while creating a sparse control message, I suspect
> > that the problem is being encountered trying to forward the control message
> > to one of the leaf servers. Are you defining the replicate on the root
> > node? Is the error occurring on the root node or on one of the non-root
> > servers?
> >
> > We may have to isolate the problem just a bit. One way would be to define
> > the replicate with a minimal number of nodes, and then use the cdr change
> > replicate command to add participants. That way we could narrow the problem
> > down a bit.
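
The isolation approach quoted above could be scripted roughly as follows.
This is a sketch under assumptions: the define options and the table are
copied from the original command, the -a (add participant) form of cdr
change replicate is as I remember it from the cdr reference and should be
checked against the 9.2x documentation, and run, DRY_RUN and
define_then_grow are made-up names. By default it only prints the
commands.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
# (run and DRY_RUN are hypothetical helpers for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# define_then_grow REPL PRIMARY FIRST_TARGET [MORE_TARGETS...]
# Defines the replicate with only two participants, then adds the
# remaining targets one at a time so a failure points at one server.
define_then_grow() {
    repl=$1 primary=$2 first=$3
    shift 3
    sel='select * from curs_valutar_extins'
    run cdr define repl -C ignore --scope=transaction --immed --ats --ris \
        --floatcanon "$repl" "P $primary" "$sel" "R $first" "$sel"
    for target in "$@"; do
        # Assumed syntax: -a adds a participant to an existing replicate.
        run cdr change replicate -a "$repl" "R $target" "$sel"
    done
}
```

Watching online.log after each printed step is run by hand (or with
DRY_RUN=0) should show which added participant first triggers the
rqmQueueTxn failure.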
> >
> > Since this is 9.2x, you aren't using the smartblob to provide disk storage
> > for the queues. The control queue has its own queue table and ALL control
> > messages must be placed into stable storage. Therefore, it is possible that
> > the control queue can't write the control message to disk because of some
> > basic DB issue, possibly that the disk storage space is exhausted. Don't know
> > for sure - just guessing.
> >
> > If that is the case, then you might not see the problem by using 'cdr change
> > replicate' because you would be generating smaller control messages, and they
> > might fit into the disk space.
> >
> > You do realize that 9.2 is a fairly old release... (enough said about
> > that...) - ;-)
> >
> >
> > Could you please send me some information about your site? I'm guessing
> > that you're using a three level replication topology. I try to keep track
> > of the more interesting customer sites ---- thanks...
> >
> > M.P.
> > "cristizaharioiu" <cristizaharioiu@gmail.com> wrote in message
> > news:1105349651.722628.220830@c13g2000cwb.googlegroups.com...
> > > hello all!
> > >
> > > I am trying to define a new replicate using cdr define repl and I get
> > > the following error in online.log:
> > >
> > > 11:14:06 CDR Queuer: rqmQueueTxn (1000/0/0x2886/0x0) failed on CNTRL
> > > 11:14:06 CDR GC: operation sparse control message queue failed (error 0).
> > >
> > > the command is:
> > >
> > > cdr define repl -C ignore --scope=transaction --immed --ats --ris \
> > > --floatcanon rnom_g_cursvalext \
> > > "P bd_sco00@g_sco00:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> > > "R bd_qbank_ghiseu@g_sco40:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> > > "R bd_qbank_ghiseu@g_sco42:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> > > "R bd_qbank_ghiseu@g_sco43:informix.curs_valutar_extins" "select * from curs_valutar_extins" \
> > > ...........
> > > ............
> > > "R bd_qbank_ghiseu@g_sco122:informix.curs_valutar_extins" "select * from curs_valutar_extins"
> > >
> > > When I define the replicate from the shell (/bin/bash), no error is
> > > returned; the error appears only in online.log.
> > >
> > > We have one root server, 22 nonroot servers, 60 leaf servers and about
> > > 700 replicates already defined; all work OK.
> > > We are running Unixware 7.1 and IDS 9.2.
> > >
>