SunBlade 1500 lockup event - Please Help! (Sun Solaris Hardware forum, Solaris Operating System category)
I am running a number of SunBlade 1500 workstations loaded with Solaris 8. (Yes, I know, it's an old version, but I have no choice.)

Recently there has been a series of events on some of the workstations where they lock up. Here are the symptoms:

- The screen is black.
- The system is unresponsive to the mouse and keyboard.
- The power button is lit.
- The LED for the Ethernet port and an LED on the motherboard are lit.
- The fan cooling the RAM is running.
- I cannot ping the machine.

The only recourse I have is to hard reset the machine by holding down the power button. Then the machine comes up again.

It has happened on a handful of machines in the last month; prior to that there were two incidents in the last six months that may or may not have been the same thing. It definitely feels like a new event.

After the reboot I see nothing out of the ordinary in /var/adm.

We also collect ps, vmstat, iostat, kvmstat, etc. every ten minutes and log them. The system appeared more or less idle and there was nothing unusual happening.

As far as I can tell, the system just stops working and sits there.

I am completely baffled and have no idea what to do. Any suggestions would be very much appreciated.

Scott Allen Abfalter
AirNet Communications, Inc.
sabfalter@airnetcom.com; thx@cfl.rr.com
(321) 953-6818
| |||
Scott Abfalter wrote:
> I am running a number of SunBlade 1500 workstations loaded
> with Solaris 8. (Yes, I know, it's an old version, but I have no
> choice.)
>
> Recently there has been a series of events on some of the workstations
> where they lock up. Here are the symptoms:
>
> - The screen is black.
> - The system is unresponsive to the mouse and keyboard.
> - The power button is lit.
> - The LED for the Ethernet port and an LED on the motherboard are lit.
> - The fan cooling the RAM is running.
> - I cannot ping the machine.
>
> The only recourse I have is to hard reset the machine by holding down
> the power button. Then the machine comes up again.
>
> It has happened on a handful of machines in the last month; prior
> to that there were two incidents in the last six months that may or may
> not have been the same thing. It definitely feels like a new event.
>
> After the reboot I see nothing out of the ordinary in /var/adm.
>
> We also collect ps, vmstat, iostat, kvmstat, etc. every ten minutes and
> log them. The system appeared more or less idle and there was nothing
> unusual happening.
>
> As far as I can tell, the system just stops working and sits there.
>
> I am completely baffled and have no idea what to do. Any suggestions
> would be very much appreciated.
>
> Scott Allen Abfalter
> AirNet Communications, Inc.
> sabfalter@airnetcom.com; thx@cfl.rr.com
> (321) 953-6818

You don't get a crash dump, do you?

Did you try to hook up a serial console to one of the machines and take a look at what it is saying before it dies? Is it still accessible via the serial console? Does Stop-A still work?

If Stop-A does not work anymore, I would guess you have a hardware defect. But it sounds strange that multiple machines have the same problem.

Did you patch the systems recently? What patches do you have installed? Is your OBP up to date?

Tom
| |||
Tom,

Thanks for your thoughts.

Stop-A does not work. The machine is completely unresponsive.

It is Solaris 8 HW 05/03 with no patches (yes, yes, I know!). In any event, we have not patched recently.

I have the output and input device NVRAM settings pointed at the screen and not a tty, so I assumed there would be nothing to see on the serial port. I can try it.

The machines are using whatever OBP was on them when they shipped (all within the last year), but that gives me another thing to check.

Does anyone else have anything else I can check on as well?

Scott
| |||
Scott Abfalter <thx@cfl.rr.com> wrote:
> I am running a number of SunBlade 1500 workstations loaded
> with Solaris 8. (Yes, I know, it's an old version but I have no
> choice).
>
> Recently there has been a series of events on some of the workstations
> where they lock up. Here are the symptoms:
> ....

Did you turn on the power management/sleep mode? If so, turn that off and see if it resolves your problem. This is just a guess, but I remember using it on a few desktops and they would "sleep" but never come out of it correctly. You said the OS is unpatched, so whatever bug caused that is probably not resolved on these machines.
| ||||
In article <d8d0qp$3co$1@news.xmission.com>, "B.M. Wright" <bmwright@xmission.xmission.com> wrote:
>Scott Abfalter <thx@cfl.rr.com> wrote:
>
>> I am running a number of SunBlade 1500 workstations loaded
>> with Solaris 8. (Yes, I know, it's an old version but I have no
>> choice).
>
>> Recently there has been a series of events on some of the workstations
>> where they lock up. Here are the symptoms:
>....

This sounds a lot like a device getting into an invalid state and hanging the system's bus. I've seen a bus hung so badly that the processor couldn't even access it, because the card was holding it (this was an older UPA system like an Ultra 60, where graphics cards sat on the processor bus).

This could be something like a graphics card, or some other device. It's likely doing a DMA transfer, getting stuck, and hanging everything.

I'd switch your console to the serial port, as someone else mentioned. If things really get messed up, the system can often still print to the serial port when it can't talk to a graphics card. This is especially true if the problem IS the graphics card.

Also, you might note which slots the cards are in. Sometimes this matters. Not all of the SB1500 PCI slots are on the same PCI bus. Some handle power management slightly differently, too. (Disabling PM, mentioned below, is also a good thing.)

>
> Did you turn on the power management/sleep mode? If so, turn
>that off, see if it resolves your problem. This is just a guess, but I
>remember using it on a few desktops and they would "sleep" but never
>come out of it correctly. You said the OS is unpatched so whatever bug
>caused that is probably not resolved on these.
>
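
Since the original post mentions that ps, vmstat, and iostat output is logged every ten minutes, one cheap thing to try is using the gaps in those logs to bracket when each machine actually stopped, then comparing the times across machines (same hour of night, same idle period, and so on). Below is a minimal Python sketch of that idea. It assumes each sample can be reduced to a timestamp string in a `YYYY-MM-DD HH:MM:SS` format; that format, the function names, and the sample times are all made up for illustration and are not from the thread. Python does not ship with Solaris 8, so this would most likely be run on another box against copied logs.

```python
from datetime import datetime, timedelta


def parse_stamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse one timestamp string. The format is an assumption; adjust it
    to whatever the collection script actually writes."""
    return datetime.strptime(text.strip(), fmt)


def find_gaps(timestamps,
              interval=timedelta(minutes=10),
              slack=timedelta(minutes=2)):
    """Return (last_seen, next_seen) pairs for every place where two
    consecutive samples are further apart than interval + slack.

    last_seen is the final sample before the machine went quiet, so the
    hang happened somewhere in the `interval` after it. next_seen is the
    first sample after the hard reset.
    """
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > interval + slack:
            gaps.append((earlier, later))
    return gaps


if __name__ == "__main__":
    # Hypothetical sample times, not real data from the affected machines.
    sample = [
        "2005-01-01 02:00:00",
        "2005-01-01 02:10:00",
        "2005-01-01 02:20:00",
        "2005-01-01 08:45:00",  # first sample after the reset
        "2005-01-01 08:55:00",
    ]
    for last_seen, next_seen in find_gaps(parse_stamp(s) for s in sample):
        print("last sample before hang:", last_seen,
              "| first sample after reboot:", next_seen)
```

With a ten-minute interval this only narrows each hang to a ten-minute window. If the times line up with a fixed idle period, that would fit the power management guess; if they are scattered, the hardware or bus theory looks more likely.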