01-16-2008, 06:09 PM
Jim Carter
 
Re: ServiceGuard: cmcld: lan0 failed

Ulrich,
one advantage of having multiple NICs within the same server is that a
single NIC failure becomes a local event, so the package won't cycle down
and back up because of it. The pointers in the TCP stack are simply
changed to use the other NIC.

Another advantage is that if you see the "failure" message on both NICs at
the same time, you can most likely eliminate the cables or the NICs as the
problem. Then, if you have both NICs connected to separate switches, you
can eliminate the switches because both probably didn't fail at the same
time. That leaves the networking configuration or software on the server,
OR a network storm that made the server think the NICs had failed. You can
find that out by using tracing and logging on the network in question.
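
To make the elimination above concrete, here is a rough sketch of it in
Python. It is only an illustration of the reasoning, not anything that
ServiceGuard or nettl provides; the function name, its parameters, and the
suspect labels are all made up for this example.

```python
def likely_causes(both_nics_failed_together, on_separate_switches):
    """Return the suspects left after the elimination described above.

    Hypothetical helper: the arguments are plain booleans you fill in
    yourself from what the logs show.
    """
    if not both_nics_failed_together:
        # Only one NIC reported the failure, so its own path is suspect.
        return ["cable", "NIC", "switch"]
    if not on_separate_switches:
        # Both NICs failed at once but share a switch: keep the switch.
        return ["switch", "server network config/software", "network storm"]
    # Both NICs failed at once on separate switches: cables, NICs and
    # switches are unlikely, which leaves the server or the network.
    return ["server network config/software", "network storm"]


print(likely_causes(both_nics_failed_together=True, on_separate_switches=True))
```

With a single NIC, as in the original report, you never get past the
first branch, which is why the second NIC helps the diagnosis.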

Good luck on this troubleshooting...


>>
>> Ulrich Windl <Ulrich.Windl@RZ.Uni-Regensburg.DE> wrote in message
>> news:<m31xuno8dn.fsf@pc5234.klinik.uni-regensburg.de>...
>> > Hello,
>> >
>> > at the moment we are running the latest ServiceGuard release with
>> > Support Plus patches from June 2003 on a two-node-cluster consisting
>> > of two L3000 running HP-UX 11.11. The system are connected via a 100
>> > Base-T LAN and a 1000 Base-T LAN.
>> >
>> > Since May we had two network faults reported by cmcld that were not
>> > reproducible. SG had stopped the packages depending on the LAN
>> > interface, so it was not very good.
>> >
>> > I've watched /var/adm/nettl*, and all I could see was a "...bad cable
>> > connection...", "...going Offline @ [0/0/0/0] [Cable disconnected]".
>> >
>> > What worries me is that nettl never reports when the LAN interface is
>> > up again. According to cmcld the outage lasted exactly for 20
>> > seconds. Last time it lasted exactly for 30 seconds.
>> >
>> > I guess that the driver is doing nonsense; maybe if the load is high.
>> > Can somebody explain what condition exactly triggers the "cable
>> > disconnected" message in the driver?
>> >
>> > Any insights?
