Unix Technical Forum

RE: HDR unusable with 9.40

#1
04-20-2008, 07:25 AM
Alexey Sonkin


Ajay,

Is it fixed in 9.40uc5?
When is it supposed to be released?

Are there other known limitations in the 9.40 HDR
implementation (like the requirement for the secondary to be online
during an index build) compared to 7.30, 9.2, 9.3?

P.S. I've just repeated the experiment once again (replication on the
secondary was restored from a level-0 backup) with exactly the same result:

1. All indexes were dropped successfully.
2. The first index was created on the primary.
3. Replication of that index started on the secondary; at the same time,
creation of the next index started on the primary.
4. Within a few seconds, both servers went into an infinite checkpoint.

No CPU activity and no disk activity on either machine.

------------------------------------------
Alexey Sonkin


-----Original Message-----
From: ajaykg68@yahoo.com [mailto:ajaykg68@yahoo.com]

Yes. This is a known bug and is fixed in a later release of 9.40.

-Ajay Gupta

"Madison Pruet" <mpruet@comcast.net> wrote in message
news:<uyOSc.292803$Oq2.232844@attbi_s52>...
> Is there a known bug for this problem???
>
> "Ajay Gupta" <ajaykg68@yahoo.com> wrote in message
> news:26ad1e92.0408120908.57d4d1bb@posting.google.com...
> > If you are transferring a big index to secondary, most of the buffers on
> > secondary are becoming dirty which may be source of the problem.
> >
> > Workaround for above problem is to increase the buffers on secondary
> > and force a checkpoint (onmode -c) between two create index on primary.
> >
> > Message 'index will be unusable on secondary' is just a warning.
> > If transfer of index from primary to secondary is aborted, you
> > get this message. Index on primary is fine and usable. A query
> > on secondary will not be able to use the index.
> >
> > -Ajay Gupta
> >
> > Alexey Sonkin <alexeis@grandvirtual.com> wrote in message
> > news:<cfee79$18l$1@news.xmission.com>...
> > > Hi, everybody,
> > >
> > > I'm in a deep frustration about HDR quality and implementation in
> > > 9.40uc4
> > >
> > > In fact, 9.40 HDR is totally unusable: by creating a set of indexes
> > > over a big table,
> > > one can easily get primary into 'forever blocked in checkpoint' and
> > > make secondary completely unusable (should be restored from archive)
> > >
> > > I was trying to create a set of 10 indexes over a rather big table
> > > in ANSI database on primary in HDR-replicated pair.
> > > The first index in a set of 10 indexes was created successfully.
> > >
> > > Soon after the second index creation was started, the primary server
> > > became blocked in a checkpoint. For about a minute after that, the
> > > secondary was trying to write something to a disk, then became silent.
> > > Both servers were blocked in a checkpoint.
> > >
> > > After several hours, I've stopped secondary.
> > > Primary became operational immediately.
> > >
> > > After the secondary was started again, index synchronization
> > > (of both indexes) was started (with proper message in 'online.log')
> > > from scratch.
> > > The first index was synchronized successfully, then second index
> > > sync..
> > > and both servers became blocked in checkpoint again.
> > >
> > > I repeated the experiment of starting/stopping secondary several
> > > times...
> > > And every time both servers were blocked at the synchronization
> > > of the second index.
> > >
> > > That's it. Secondary became totally unusable.
> > >
> > > I made an attempt to create index keeping secondary offline...
> > > One more interesting result: message appeared in 'online.log',
> > > indicating, that 'index will be unusable on secondary'.
> > > That is, 9.40, unlike 7.x and 9.21, doesn't allow to create
> > > indexes on primary with secondary offline.
> > >
> > > HDR is really, really bad in 9.40
> > >
> > > I'm going to file a level-1 bug to IBM/PA
> > >
> > > ------------------------------------------
> > > Alexey Sonkin
> > > Senior Database Administrator
> > >
> > >
> > >
> > > sending to informix-list
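
Ajay's workaround quoted above (force a checkpoint with onmode -c between
two CREATE INDEX statements on the primary) could be scripted roughly as
follows. This is only a sketch under assumptions: the database name mydb
and the files idx01.sql and idx02.sql (one CREATE INDEX statement each) are
made-up placeholders, the run and build_indexes helpers are not part of
Informix, and the script defaults to a dry run that only prints the
commands. Raising the buffer count on the secondary is a configuration
change and is not shown.

```shell
#!/bin/sh
# Sketch: build indexes one at a time on the primary and force a
# checkpoint after each one. Names below are hypothetical placeholders.

# DRY_RUN=1 (the default) prints commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: build_indexes <database> <sql_file>...
# Each SQL file is assumed to hold a single CREATE INDEX statement.
build_indexes() {
    db=$1
    shift
    for sql_file in "$@"; do
        run dbaccess "$db" "$sql_file" || return 1
        # Force a checkpoint before the next index build starts.
        run onmode -c || return 1
    done
}

# Dry-run demonstration with placeholder names.
build_indexes mydb idx01.sql idx02.sql
```

Setting DRY_RUN=0 would execute the commands for real; try that only on a
test pair first, since the thread does not confirm that this avoids the
hang in every case.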

#2
04-20-2008, 07:26 AM
Esteban Casuscelli

Hi Alexey,

this is another silly problem that I found in 9.40.UC2, and it also
reproduces in 9.30.FC5.
I opened a tech support case with IBM UK and sent them a test case.
After more than 10 days of waiting, they answered that they had
reproduced the problem under 9.30.UC2 but not under 9.30.UC3, so they
refused to open a new bug and didn't give me a bug number.
This only happens if there is no activity on the primary server; also
see my workaround below.
I didn't have time to test it under 9.40.UC3 or later.

Hope this helps.

You will see all the info I sent to IBM below:

New secondary crashes after fail over procedure

While testing HDR fail over by following the steps from the Informix
admin manual, the new secondary crashes a few seconds after it changes
its state to secondary server. If you try to bring it back up using
oninit, it crashes again with the same error.
I found a workaround for this problem (see below), but if you don't
want to use the workaround, the only way to re-create HDR is to apply a
new physical restore.

I tested the same procedure under 9.30.FC5 on Solaris 8 and 9.40.UC2
on Linux Red Hat 9 (kernel 2.4.20-8).


The procedure I followed was:

1. Take primary instance (prdhdr) offline. Go first to
quiescent mode and then offline.
2. Execute hdrmkpri.sh prhdr on secondary server (secondary).
3. Execute hdrmksec.sh sechdr on primary server (primary).
4. Bring new primary server back up using oninit -v.

which is the same as

Instance A (currently Primary)           Instance B (currently Secondary)
------------------------------           --------------------------------
1] onmode -ky                            (server should be up)
                                         2] hdrmkpri.sh <primary_server_name>
3] hdrmksec.sh <secondary_server_name>
(now a Secondary server)                 4] oninit (now a Primary server)


Secondary Server online message log

09:04:01 Loading Module <BUILTINNULL>
09:04:06 Dynamically allocated new virtual shared memory segment (size 8192KB)
09:04:06 IBM Informix Dynamic Server Version 9.40.UC2 Software Serial Number AAA#B000000
09:04:06 IBM Informix Dynamic Server Initialized -- Shared Memory Initialized.

09:04:06 DR: Reservation of the last logical log for log backup turned on
09:04:06 Data replication type and state information reset. To start DR, use
         the 'onmode -d' command and wait for the pair to be operational,
         before shutting down the database server

09:04:06 Physical Recovery Started at Page (1:1784).
09:04:06 Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
09:04:06 Dataskip is now OFF for all dbspaces
09:04:06 Restartable Restore has been ENABLED
09:04:06 Recovery Mode
09:04:08 DR: Reservation of the last logical log for log backup turned off
09:04:08 DR: new type = secondary, primary server name = net940
09:04:08 DR: Trying to connect to primary server = net940
09:04:08 DR: Cannot connect to primary server
09:04:08 DR: Turned off on secondary server

[informix@appst archive]$ sh: line 1: /usr/informix/etc/alarmprogram.sh: No such file or directory
sh: line 1: /usr/informix/etc/alarmprogram.sh: No such file or directory

[informix@appst archive]$ onstat -m
shared memory not initialized for INFORMIXSERVER 'shmsec940'

Message Log File: /informix/940/online.log
Thread(19, dr_secapply, 4620d478, 1)
File: rshdr.c Line: 5497
09:05:01 Results: Dynamic Server must abort
09:05:01 Action: Reinitialize shared memory
09:05:01 stack trace for pid 4789 written to /tmp/af.3fb8b2d
09:05:01 See Also: /tmp/af.3fb8b2d, shmem.3fb8b2d.0
09:05:01 Process exited with return code 127: /bin/sh /bin/sh -c /usr/informix/etc/alarmprogram.sh 3 15 "Data Replication failure." "DR: Turned off on secondary server
09:05:05 rshdr.c, line 5497, thread 19, proc id 4789, DR: Log Record Apply Thread Exited Abnormally. Internal Error.
         A restart of the database server shall be required to correct this problem.
..
09:05:05 invoke_alarm(): /bin/sh -c '/usr/informix/etc/alarmprogram.sh 5 6 "Internal Subsystem failure: 'MT'" "rshdr.c, line 5497, thread 19, proc id 4789, DR: Log Record Apply Thread Exited Abnormally. Internal Error.
         A restart of the database server shall be required to correct this problem.
.." '
09:05:05 invoke_alarm(): mt_exec failed, status 32512, errno 0
09:05:05 The Master Daemon Died
09:05:05 invoke_alarm(): /bin/sh -c '/usr/informix/etc/alarmprogram.sh 5 6 "Internal Subsystem failure: 'MT'" "The Master Daemon Died" '
09:05:05 invoke_alarm(): mt_exec failed, status 32512, errno 0
09:05:05 PANIC: Attempting to bring system down
[informix@appst archive]$
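
A side note on the alarm lines in the log above: "return code 127" and
"status 32512" look like the same failure reported two ways. 127 is the
exit code a shell conventionally returns when it cannot find the command,
which matches the "No such file or directory" messages for
/usr/informix/etc/alarmprogram.sh, and 32512 is 127 shifted left by 8
bits, which is how a raw wait status usually carries an exit code. So the
alarm errors are probably a missing alarm script on this test box rather
than part of the HDR crash itself. A minimal check of the arithmetic,
assuming the usual encoding (the helper name decode_wait_status is made
up):

```shell
#!/bin/sh
# Decode a raw wait status, as logged by invoke_alarm(), into an exit code.
# Assumes the common encoding where the exit code sits in bits 8-15.
decode_wait_status() {
    echo $(( ($1 >> 8) & 255 ))
}

# Status value taken from the online.log excerpt above.
decode_wait_status 32512
```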




Primary server online message log


Message Log File: /informix/940/online.log
09:05:02 DR: Failure recovery error (2)
09:05:02 Process exited with return code 127: /bin/sh /bin/sh -c /usr/informix/etc/alarmprogram.sh 3 15 "Data Replication failure." "DR: Local and Remote server type a
09:05:04 Physical Recovery Started at Page (1:1784).
09:05:04 Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
09:05:04 DR: Turned off on primary server
09:05:04 Logical Recovery Started.
09:05:04 10 recovery worker threads will be started.
09:05:04 Process exited with return code 127: /bin/sh /bin/sh -c /usr/informix/etc/alarmprogram.sh 3 15 "Data Replication failure." "DR: Turned off on primary server"
09:05:04 DR: Cannot connect to secondary server
09:05:04 Process exited with return code 127: /bin/sh /bin/sh -c /usr/informix/etc/alarmprogram.sh 3 15 "Data Replication failure." "DR: Cannot connect to secondary se
09:05:08 Logical Recovery has reached the transaction cleanup phase.
09:05:08 Logical Recovery Complete.
         0 Committed, 0 Rolled Back, 0 Open, 0 Bad Locks

09:05:09 Dataskip is now OFF for all dbspaces
09:05:09 Checkpoint Completed: duration was 0 seconds.
09:05:09 Checkpoint loguniq 155, logpos 0x9018, timestamp: 10928966

09:05:09 Maximum server connections 0
09:05:09 On-Line Mode

[informix@st archive]$ onstat -g dri

IBM Informix Dynamic Server Version 9.40.UC2 -- On-Line (Prim) -- Up 00:00:26 -- 49744 Kbytes

Data Replication:
  Type       State      Paired server      Last DR CKPT (id/pg)
  primary    off        netsec940          155 / 7

DRINTERVAL 30
DRTIMEOUT 30
DRLOSTFOUND /usr/informix/etc/dr.lostfound



By the way, I found a workaround for this. If you follow this procedure,
the secondary server and HDR will stay up and running.

1. Take primary instance (prdhdr) offline. Go first to quiescent
mode and then offline.
2. Execute hdrmkpri.sh prhdr on secondary server (secondary).
3. Bring the instance back up on secondary using oninit -v.
4. Execute onmode -l on sechdr to switch to the next logical log file.
5. Execute onmode -c on sechdr to write a new checkpoint record.
6. Take sechdr offline. Go first to quiescent mode and then offline.
7. Execute hdrmksec.sh sechdr on primary server (primary).
8. Check with onstat -l where the last checkpoint record is on
primary.
9. Bring new primary server back up using oninit -v.
10. Execute onstat -m and onstat -g dri to make sure that HDR is up
and running.
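
The ten steps above can be laid out as a per-host checklist. The sketch
below only prints the commands; it does not run anything. Several things
in it are assumptions and not taken from the post: the labels A (original
primary) and B (original secondary), the use of onmode -sy followed by
onmode -ky for "quiescent, then offline", and the choice of host for the
onstat checks in steps 8 and 10, which the post does not spell out. The
function name hdr_swap_checklist is made up, and the server-name arguments
are passed through exactly as the caller gives them.

```shell
#!/bin/sh
# Print the workaround as a per-host checklist. Nothing is executed.
# A = original primary, B = original secondary (labels are assumptions).
# $1 = argument for hdrmksec.sh on A, $2 = argument for hdrmkpri.sh on B.
hdr_swap_checklist() {
    a_arg=$1
    b_arg=$2
    cat <<EOF
A: onmode -sy
A: onmode -ky
B: hdrmkpri.sh $b_arg
B: oninit -v
B: onmode -l
B: onmode -c
B: onmode -sy
B: onmode -ky
A: hdrmksec.sh $a_arg
A: onstat -l
B: oninit -v
A,B: onstat -m
A,B: onstat -g dri
EOF
}

# Arguments as written in steps 7 and 2 of the post.
hdr_swap_checklist sechdr prhdr
```

Printing a checklist instead of executing the commands is deliberate here:
the steps alternate between two machines, so each line still has to be run
by hand (or over a remote shell) on the host named in its prefix.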





