Re: Extremly long checkpoints: How to find the reason and solve the
Informix forums, Database Server Software category
I don't have the DSS query right now. As a matter of fact, I don't even work at the same company where I had that problem. At that time, the decision was to migrate to Enterprise Replication because of this problem, and primarily because of the flexibility in replication granularity that ER allows.

For the record, from the Administration Guide for IDS 7.3 (4354.pdf), Chapter 25 "What is High Availability Data Replication" (page 578), "Checkpoints Between Database Servers":

"Checkpoints between database servers in a High-Availability Data-Replication pair are synchronous, regardless of the value of DRINTERVAL. A checkpoint on the primary database server completes only after it completes on the secondary database server."

HDR systems are not optimized for using the secondary server for DSS. That sounds logical, because HDR is oriented towards high-availability systems, that is, OLTP systems. At that time the company wanted the primary server for the production system and the secondary server for up-to-date reports.

I learned this the hard way. I hope this message helps people who are evaluating HDR.

Anyway, thanks for your help.

Regards

----- Original Message -----
From: "TBP" <TBP@Nospam.Nothere.Co.Uk>
To: <informix-list@iiug.org>
Sent: Wednesday, May 19, 2004 2:41 AM
Subject: Re: Extremly long checkpoints: How to find the reason and solve the

> Francisco Roldan wrote:
> > Are you replicating to another server?
> > Some time ago I got very long checkpoints on the primary server of
> > an Informix server pair with High Availability Data Replication (HDR).
> >
> > I found out that the reason was an extremely complex query
> > executed on the secondary server (standby server) to generate
> > a report (DSS reports in an OLTP system, not a good idea!).
> >
> > Checkpoints for HDR systems are always synchronous.
> > It doesn't matter if you configure the system to be asynchronous (I don't
> > remember the name of the parameter in the onconfig file); the only thing
> > that really becomes asynchronous is the transactions (no two-phase commit
> > protocol). The onconfig parameter should be named TwoFaseCommit instead of
> > the name that I don't remember.
> > The primary server always waits for an acknowledgement message from the
> > other servers before finishing its own checkpoint.
> >
> > If you are not replicating, ignore this message. I just wanted to share
> > my frustrating experience with HDR.
> > Enterprise Replication (ER) would solve the problem.
> >
> > Regards
> >
> > > snip ...
>
> DRINTERVAL -1 (synchronous) or DRINTERVAL > 0 (asynchronous)
>
> There appears to be some activity which the checkpoint is dependent on
> (not the checkpoint itself) which is synchronous; you can see this when
> the checkpoint completes on the primary but hasn't started / completed
> on the secondary when DRINTERVAL > 0. Something to do with flushing the
> physical log buffer on the secondary, and threads in critical sections.
>
> What was the "extremely complex query executed in the secondary server
> (Stand By Server) for generating a report (DSS Reports in OLTP System ,
> not a good idea !! )"? (I thought that was a nice way to split out DSS
> from the OLTP primary by putting DSS on the secondary.)
>
> Did you log a Tech Support case?
>
> Were there a lot of writes involved on the secondary to temp tables?
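For anyone evaluating HDR, the rule quoted from the Administration Guide ("a checkpoint on the primary completes only after it completes on the secondary") can be pictured with a toy model. This is only a sketch in Python, not Informix code and not how the server is implemented: the class, the field names and the numbers are all hypothetical, and it assumes both servers begin the checkpoint at the same moment.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    # All fields are hypothetical illustration values, in seconds.
    primary_work_s: float       # time the primary needs for its own checkpoint work
    secondary_work_s: float     # time the secondary needs once it can start
    secondary_blocked_s: float  # delay before the secondary can start (for example, busy with a long report query)


def primary_checkpoint_duration(cp: Checkpoint) -> float:
    """Time until the primary's checkpoint completes in this toy model.

    The primary waits for the secondary to finish, so the slower of the
    two sides decides the total duration.
    """
    secondary_done_s = cp.secondary_blocked_s + cp.secondary_work_s
    return max(cp.primary_work_s, secondary_done_s)


# An idle secondary does not slow the primary down.
idle = Checkpoint(primary_work_s=2.0, secondary_work_s=1.0, secondary_blocked_s=0.0)

# A secondary held up for a minute stretches the primary's checkpoint too.
busy = Checkpoint(primary_work_s=2.0, secondary_work_s=1.0, secondary_blocked_s=60.0)

print(primary_checkpoint_duration(idle))
print(primary_checkpoint_duration(busy))
```

In this model the value of DRINTERVAL does not appear at all, which matches the manual's statement that checkpoints are synchronous regardless of that setting.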
Francisco Roldan wrote:
> HDR Systems are not optimized for having the secondary server for
> DSS . Sounds logic because HDR is oriented for High Availability Systems,
> => OLTP Systems. In that time the company wanted the primary server
> for the Production System, and the Secondary Server for up to date Reports
> .
> I learned this by the hard way ....
> I hope this message helps people who are evaluating HDR .
>

I've been working in an environment where the customer uses the secondary mainly for data extraction (DSS queries and jobs, and extraction for a DW). There were, and still are, problems with HDR which can cause the situation you mentioned. Some are related to heavy load on the secondary, and one is (I think) caused by a recently identified bug related to critical sections and checkpoints being held on the secondary. These situations MUST be considered BUGS and not "design orientation". We currently have an open case under investigation by IBM technical support.

I'd also like to point out one fact: around 9.30.UC1 there was a version which forced the same ONCONFIG parameters on primary and secondary for things like BUFFERS, CPUVPs etc. This would prevent a configuration where the resources differ between primary and secondary. It was eventually considered a bug and corrected. This clearly shows that you can use the secondary for different purposes than the primary.

The normal and desirable behaviour, in case the secondary has too much work, is for replication to stop and eventually recover. DRPINGTIMEOUT also has this objective. Of course, if you intend your HDR environment to be mainly a standby database, you won't want this to happen, but in that case you shouldn't load the secondary too much...

Regards,
Fernando Nunes