HACMP 4.5 Node_Up Problem (AIX Operating System forums, Unix Operating Systems category)
Hi,

I am new to HACMP 4.5 and have just set up the simplest possible configuration to get a feel for it. I created a two-node cluster with one cascading resource group. I also added three adapters to the cluster: two boot adapters (one per node) and a service IP that is not assigned to any node. A heartbeat network is also set up over a serial cable.

When I start the primary node, everything looks fine and the boot adapter gets the service IP as an alias. Then I start the cluster daemon on the second node. After a while, it appears that the cluster daemon does not treat the second node's startup as successful. After about 300 seconds, I saw the following messages in hacmp.out:

*************************************************
Jul 21 15:47:20 EVENT START: config_too_long 360 /usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp
:config_too_long[64] [[ high = high ]]
:config_too_long[64] version=1.11
:config_too_long[65]
:config_too_long[65] cl_get_path HA_DIR=es
:config_too_long[67] NUM_SECS=360
:config_too_long[68] EVENT=/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp
:config_too_long[70] HOUR=3600
:config_too_long[71] THRESHOLD=5
:config_too_long[72] SLEEP_INTERVAL=1
:config_too_long[78] PERIOD=30
:config_too_long[81] set -u
:config_too_long[86] LOOPCNT=0
:config_too_long[87] MESSAGECNT=0
:config_too_long[88]
:config_too_long[88] cllsclstr -c
:config_too_long[88] grep -v cname
:config_too_long[88] cut -d : -f2 CLUSTER=HKT01CL01
:config_too_long[89] TIME=360
:config_too_long[90] sleep_cntr=0
:config_too_long[95] [ -x /usr/lpp/ssp/bin/spget_syspar ]
WARNING: Cluster HKT01CL01 has been running recovery program '/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp' for 360 seconds. Please check cluster status.
WARNING: Cluster HKT01CL01 has been running recovery program '/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp' for 390 seconds. Please check cluster status.
WARNING: Cluster HKT01CL01 has been running recovery program '/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp' for 420 seconds. Please check cluster status.
WARNING: Cluster HKT01CL01 has been running recovery program '/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp' for 450 seconds. Please check cluster status.
*************************************************

It looks like there is some problem with the node_up event on the second node, and so node_up.rp is still being run. Is that correct? If so, what are the possible causes?

By the way, when I try to use smitty to stop the clstrmgr daemon, it always stays in "stopping" status. Is that because of the node_up.rp event above, so that the cluster daemon can't stop?

This problem is about to drive me crazy. If you have any ideas, please help. Thanks a lot in advance.

Regards,
Frankie
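
For anyone reading along: the WARNING lines quoted above follow a fixed pattern (cluster name, recovery program path, elapsed seconds), so they can be pulled out of a saved copy of hacmp.out mechanically. The sketch below is not part of HACMP. The function name `find_too_long_warnings` and the regular expression are hypothetical, inferred only from the messages quoted in this post, and may need adjusting for other HACMP versions.

```python
import re

# Pattern inferred from the WARNING lines quoted in the post.
# This is an assumption, not an official HACMP log specification.
WARNING_RE = re.compile(
    r"WARNING: Cluster (?P<cluster>\S+) has been running recovery program "
    r"'(?P<program>[^']+)' for (?P<seconds>\d+) seconds\."
)


def find_too_long_warnings(text):
    """Return a (cluster, program, seconds) tuple for each matching warning."""
    return [
        (m.group("cluster"), m.group("program"), int(m.group("seconds")))
        for m in WARNING_RE.finditer(text)
    ]


if __name__ == "__main__":
    # Two of the warnings from the post, used as sample input.
    program = "/usr/lpp/save.config/usr/es/sbin/cluster/events/node_up.rp"
    sample = "\n".join(
        f"WARNING: Cluster HKT01CL01 has been running recovery program "
        f"'{program}' for {secs} seconds. Please check cluster status."
        for secs in (360, 390)
    )
    for cluster, prog, seconds in find_too_long_warnings(sample):
        print(f"{cluster}: {prog} has been running for {seconds}s")
```

Run against the excerpt in this post, it would report node_up.rp at 360, 390, 420 and 450 seconds, which matches NUM_SECS=360 and PERIOD=30 in the trace: the first warning after 360 seconds, then one every 30 seconds while the recovery program has not finished.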