Thread: power events on V880 vs UltraAX-e2 (Sun Solaris Administration forums, Solaris Operating System category)
We have two Solaris systems:

SunOS mendel 5.8 Generic_108528-19 sun4u sparc SUNW,UltraAX-e2
SunOS gec 5.8 Generic_108528-20 sun4u sparc SUNW,Sun-Fire-880

Each has a Tripplite UPS attached to /dev/ttyb, and they have exactly the same software controlling the UPS, which is:

ftp://saf.bio.caltech.edu/pub/softwa...r-1_0_4.tar.gz

Each has this modification to /etc/inittab: p3:s1234 /etc/genpower/genpowerfail is the same as /etc/init.d/genpowerd.

When the power goes off on the UPS, both detect it and initiate a shutdown. Then things go awry, in different ways.

On the Ultra, if power is restored, genpowerd sees it and cancels the shutdown (kill -9 on the shutdown process's pid). Sort of. The system seems to be running normally in existing xterm sessions, but an attempt to ssh in is met with:

NO LOGINS: System going down in 2 minutes
THE POWER IS DOWN! PLEASE LOG OFF NOW! SHUTDOWN in 120 seconds

The second line was the text associated with the original shutdown command.

On the V880, power restoration events do not cancel the shutdown, and the system shuts down. I've tried to kill the shutdown process manually on that system with a kill -9, and even though "shutdown" disappears from ps -ef, the V880 STILL goes down.

Any ideas how to correct this? It's as if the grace period has a different meaning on each system (it says 120 seconds on both), and in neither case is it really a grace period before shutdown begins. Rather, shutdown begins, and then you have 120 seconds (Ultra) or zip (V880) in which you can kill the shutdown process, but the system will not be in the same state as before shutdown was initiated. This is fairly annoying, since power glitches don't bring down the hardware but trigger an unavoidable reboot on the V880 and require, eventually, a reboot on the Ultra to clear the odd state it ends up in.

I tried commenting out every shutdown in the /etc/init.d/genpowerd file on the V880, and it _still_ shut down following a power event. At least the rc0.d K* scripts were executed on the way down.

Can somebody explain why the V880 ignores the grace period? Or why killing shutdown on the Ultra leaves it in the wacky semishutdown state?

(In case anybody wonders: no, I can't use PowerAlert Plus, at least on the V880, because pap_upsd dumps core after half an hour or so when nothing much is happening, and crashes just before shutdown on a real loss of power event. The V880 is very stable otherwise.)

Thanks,

--
David Mathog
mathog@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
[[ This message was both posted and mailed: see the "To," "Cc," and "Newsgroups" headers for details. ]]

In article <20040205160910.289c2d22.mathog@caltech.edu>, David Mathog <mathog@caltech.edu> wrote:

> On the Ultra if power is restored genpowerd sees it and cancels the
> shutdown (kill -9 on the shutdown process's pid). Sort of.
> The system seems to be running normally in existing xterm sessions, but an
> attempt to ssh on is met with:
>
> NO LOGINS: System going down in 2 minutes
> THE POWER IS DOWN! PLEASE LOG OFF NOW! SHUTDOWN in 120 seconds
>
> The second line was the text associated with the original shutdown command.

I don't have any experience with the UPS software you are using, but I can tell you that the /etc/nologin file is what is keeping you from logging in to the Ultra system after the reboot is canceled. This file is created by shutdown to keep users from starting sessions that will end very soon with the reboot. Simply delete the file and you should be back up and running.

Now, why the genpowerd software doesn't get that file removed....

Are there any mailing lists for this UPS software? If you don't get other responses, that may be a good place to look.

Steve
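
For what it's worth, the cleanup described in this reply could be sketched roughly as below. This is only a minimal sketch, not anything taken from genpowerd: the function name cancel_shutdown and its two arguments are hypothetical, and it assumes that the leftover /etc/nologin file is the only thing blocking new logins on the Ultra. It does nothing about the V880 going down regardless.

```shell
#!/bin/sh
# Hypothetical helper (not part of genpowerd): cancel a pending shutdown
# and clear the login lockout it left behind.
#   $1  PID of the running shutdown process (may be empty)
#   $2  path of the nologin file (assumed default: /etc/nologin)
cancel_shutdown() {
    pid="$1"
    nologin="${2:-/etc/nologin}"

    # Kill the shutdown process only if it is still running.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi

    # Remove the file that shutdown created to refuse new logins.
    if [ -f "$nologin" ]; then
        rm -f "$nologin"
    fi
}
```

Where the power-restored handling already does its kill -9, something like `cancel_shutdown "$shutdown_pid"` (with shutdown_pid being whatever variable holds the PID there, also an assumption) would replace the bare kill so that the nologin file is removed at the same time.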