This is a discussion on ultra 60 not booting within the Sun Solaris Administration forums, part of the Solaris Operating System category.
| Hi,

I recently acquired a dual 450MHz Ultra 60 running Solaris v69, with 1 GB memory and two 18GB SCSI disks.

The Solaris 10 install went fine, and I left the machine downloading some software. The next morning I found it had crashed, and it refused to boot.

I've tried a few times, and it seems to boot intermittently a few hours apart. In OBP I have run some tests, and every time I do a probe-scsi-all it fails with an instruction MMU miss. I tried re-seating the SCSI disks but it seems to have made no difference. Is there anything else I can try, or is this a dead end?

Suggestions appreciated!

Regards,
Niall |
| |||
| Niall Dalton wrote:
> Hi,
>
> I recently acquired a dual 450MHz Ultra 60 running Solaris v69. 1 GB
> memory and 2 18GB scsi disks.
>
> The solaris 10 install went fine, and I left the machine downloading
> some software. The next morning I found it had crashed, and refused to
> boot.
>
> I've tried a few times, and it seems to boot intermittently a few hours
> apart. In OBP I have run some tests and it fails every time I do a
> probe-scsi-all with an instruction mmu miss. I tried re-seating the scsi
> disks but it seems to have made no difference. Is there anything else I
> can try, or is this a dead end?
>
> Suggestions appreciated!
>
> Regards,
> Niall

Check the fans. It's probably overheating.

--
The e-mail address in our reply-to line is reversed in an attempt to minimize spam. Our true address is of the form che...@prodigy.net. |
| |||
| Niall Dalton wrote:
> Hi,
>
> I recently acquired a dual 450MHz Ultra 60 running Solaris v69. 1 GB
> memory and 2 18GB scsi disks.
>
> The solaris 10 install went fine, and I left the machine downloading
> some software. The next morning I found it had crashed, and refused to
> boot.
>
> I've tried a few times, and it seems to boot intermittently a few hours
> apart. In OBP I have run some tests and it fails every time I do a
> probe-scsi-all with an instruction mmu miss. I tried re-seating the scsi
> disks but it seems to have made no difference. Is there anything else I
> can try, or is this a dead end?

Does it have an extra PCI SCSI card in it? If it does, then hoik it out and try again. Also, does probe-scsi (without the -all) fail with the MMU miss? Also from OBP, issue a reset-all - this has sorted stuff out for me previously. Finally, have a look at scsi-initiator-id to see whether it is the default of 7. I have had truckloads of intermittent problems with SCSI recently :-(

Prob of no help whatsoever :-( Apols. |
| |||
| Beardy wrote:
> Does it have an extra PCI SCSI card in it?

No

> and try again. Also does probe-scsi (without the -all) fail with the MMU
> miss? Also from OBP, issue a reset-all - this has sorted stuff out for
> me previously. Finally have a look at scsi-initiator-id to see whether
> it is the default of 7. I have had truckloads of intermittent problems
> with SCSI recently :-(

probe-all does fail, and I've tried a reset-all. I'm hampered by the machine often not booting at all, so I have limited scope to try things.

I'll check into the scsi-initiator-id, I haven't looked at that yet.

To the other poster, the fans are working fine and provide good airflow (and a remarkable imitation of a jet takeoff).

Thanks for the suggestions!

niall |
| |||
| Niall Dalton wrote:
> Beardy wrote:
>
>> Does it have an extra PCI SCSI card in it?
>
>
> No
>
>> and try again. Also does probe-scsi (without the -all) fail with the
>> MMU miss? Also from OBP, issue a reset-all - this has sorted stuff out
>> for me previously. Finally have a look at scsi-initiator-id to see
>> whether it is the default of 7. I have had truckloads of intermittent
>> problems with SCSI recently :-(
>
>
> probe-all does fail, and I've tried a reset-all. I'm hampered by the
> machine often not booting at all, so I have limited scope to try things.

Niall, I actually suggested probe-scsi, not probe-all. When you say "often not booting at all" do you mean no OBP, or no UNIX boot? If you don't get to OBP, do you get power into the system? If you get OBP, but no UNIX boot, then what message(s) do you get when you enter "boot"? Or do you possibly have a dodgy/loose keyboard connector? It may be that it is sending output down the serial port. I'm grasping at straws..... |
| |||
| Hi Beardy,

> Niall, I actually suggested probe-scsi, not probe-all. When you say

Sorry - I wrote in haste - probe-scsi fails with the MMU problem.

> "often not booting at all" do you mean no OBP, or no UNIX boot? If you
> don't get to OBP, do you get power into the system? If you get OBP, but
> no UNIX boot, then what message(s) do you get when you enter "boot"? Or
> do you possibly have a dodgy/loose keyboard connector? It may be that it
> is sending output down the serial port. I'm grasping at straws.....

I'm going to get a serial cable tomorrow and try to dig a bit deeper - I've not devoted that much time in the last day or two to figure this out. In the last day I have not managed to get any output to the monitor - I'll yank out the video card once I have a serial cable.

By "often not booting at all" I mean I get no signal on the monitor - no sign that even OBP is up and running. The power LED on the front panel does not even light, although the fans are up and running and there is a bit of disk activity from time to time. Definitely no UNIX boot - or at least not far enough to get the network interface up and running.

Typing boot or boot cdrom on the keyboard (with a Solaris install disk in the CD drive) seems to have no effect; I can't see anything, but I was hoping that at least I'd hear some disk activity or see the CD drive start reading. No such luck.

It's possible my keyboard connection is dodgy, but I can manage to power up from the keyboard and get some flashing LEDs. The caps lock key flashes for a while - I suspect letting me know it's in POST. After that stops... nothing.

Is it possible that a dead video card would stop the machine booting? My experience with non-working Sun boxes is very limited.

I appreciate the suggestions!

niall |
| |||
| Niall Dalton wrote:
> Hi Beardy,

Hello Niall,

>> Niall, I actually suggested probe-scsi, not probe-all. When you say

Cool.

> Sorry - I wrote in haste - probe-scsi fails with the mmu problem.

:-(

> I'm going to get a serial cable tomorrow and try to dig a bit deeper -
> I've not devoted that much time in the last day or two to figure this
> out.

You mean that your entire world does not revolve around your U60?!? Shame on you ;-)

> In the last day I have not managed to get any output to the monitor
> - I'll yank out the video card once I have a serial cable.

You will need a null-modem cable. If you don't have one, search c.u.s in groups.google.com - the topic has been discussed at length. You must obv pull the Sun keyboard also, otherwise the serial comms will be ignored.

> By "often not booting at all" I mean I get no signal on the monitor - no
> sign that even OBP is up and running. The power LED on the front panel
> does not even light, although the fans are up and running and there is a
> bit of disk activity from time to time. Definitely no unix boot - or at
> least not far enough to get the network interface up and running.

No power LED sounds bad. Like system-board bad. The apparent disk activity could be self-generated.

> Typing boot or boot cdrom on the keyboard (with a solaris install disk
> in the CD drive) seems to have no effect; I can't see anything, but I
> was hoping that at least I'd hear some disk activity or see the CD drive
> start reading. No such luck.
>
> Its possible my keyboard connection is dodgy, but I can manage to power
> up from the keyboard and get some flashing LEDs. The caps lock keys
> flashes for a while - I suspect letting me know its in POST. After that
> stops... nothing.

Keyboard power-on is good, but otherwise :-(

> Is it possible that a dead video card would stop the machine booting? My
> experience with non-working Sun boxes is very limited.

Mine too, unfortunately. In my experience, Sun kit is pretty robust. In PeeCee land I have seen dead video cards prevent any kind of POST.

Err... does your U60 have a second frame buffer? i.e. do you have both 13W3 and VGA-style video outputs? If yes, then which are you using? If it's the VGA-style one, then toss it and buy a 13W3 to VGA convertor. eBay is your friend, but make sure you get the correct gender-ed convertor.

When you next get to OBP, do:

setenv diag-switch? true
setenv diag-level max
power-off

Then power on again, and hopefully POST will run. If you can, snap the serial output to a file and post any POST errors here if they occur (hopefully not).

> I appreciate the suggestions!
>
> niall

No problemo. A U60 with 2X450MHz is worth fighting for IMHO. Although many would argue that (sod'em).

Beardy. |
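Beardy's request to snap the serial output to a file could be handled with a few lines of Python on whatever machine sits at the other end of the null-modem cable. This is only a sketch: the function name `log_console` is made up for illustration, and it assumes the serial port has already been configured by some other tool to match the Sun's ttya settings (normally 9600 baud, 8 data bits, no parity, 1 stop bit).

```python
import time


def log_console(stream, out, clock=time.strftime):
    """Copy console output from a binary stream to a text log, one stamped line at a time.

    `stream` is any binary file-like object (hypothetically an already
    configured serial device opened in "rb" mode); `out` is a text file.
    """
    # readline() returns b"" at end of input, which ends the loop.
    for raw in iter(stream.readline, b""):
        # POST output can contain stray non-ASCII bytes, so never fail on decode.
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        out.write(f"{clock('%H:%M:%S')} {line}\n")
        out.flush()  # keep the log current if the machine hangs mid-POST
```

On a Linux box this might be called as `log_console(open("/dev/ttyS0", "rb", buffering=0), open("post.log", "w"))`; the device path and log file name are placeholders, not values from the thread.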
| |||
| > You mean that your entire world does not revolve around your U60?!?
> Shame on you ;-)

Indeed - in my defense I am but a humble programmer rather than an administrator. Ok, a systems programmer, so perhaps not that humble ;-)

> Err... does your U60 have a second frame buffer? ie. Do you have both
> 13W3 and VGA-style video outputs? If yes, then which are you using? If
> its the VGA-style one, then toss it and buy a 13W3 to VGA convertor.
> eBay is your friend, but make sure you get the correct gender-ed convertor.

It has just the one, and I was using a 13W3 to VGA convertor.

> Then power on again, and hopefully POST will run. If you can snap the
> serial output to a file and post any POST errors here if they occur
> (hopefully not).

Picked up the cable earlier, and managed to get POST to run. Doesn't look pretty. RAM errors and SCSI devices not showing up... Thanks for the help on this!

Hardware Power ONk #2 0 0 0
Master CPU onlinesTTnnPPCC==000000
Master Version: 0000.0000.1700.11a0
Probing Memory Bank #3 0 0
Slave Version: 0000.0000.1700.11a099ee88 TTSSTTAATTEE==00000000..000
CPU E$ (M) 0000.0000.0040.0000 (S) 0000.0000.0040.0000ÿ
0> <00> SC Reg Index Test TTLL==00
Button Power ON..00000000..000
Master CPU online000..00000000..00
Master Version: 0000.0000.1700.11a000000000..00000055 TTTT==00000000.
Slave Version: 0000.0000.1700.11a0 TTPPCC==00000000..00000000.
CPU E$ (M) 0000.0000.0040.0000 (S) 0000.0000.0040.0000000000..0000000000..00000000..e effffff..ff99ee44 TTnn
Probing keyboard Doneeeffffff..ee333388 T
%o0 = 0000.0000.0000.200104444..55660044..11440000
Executing Power On SelfTest8 TTSSTTAATTEE==00000000..
0>44
0>@(#) Sun Ultra 60(UltraSPARC-II 2-way) UPA/PCI POST 2.0.2 10/19/1998 10:46 AM000000..0000000000..00000000..00000000000000..00 000000..00000044 TTTT==0000000
0>INFO: Processor 0 is master.USPECT=
0>
0> <00> Probe Ecache00..00000000..000000
0>INFO: CPU 450 MHz: 4096KB Ecache00000000..00000000..eeffffff..ee77
0> <00> Ecache RAM Addr Test000PP TTPPCC==0000000
0> <00> Ecache Tag Addr Test44 TTnnPPCC==00000000..0000
0> <00> Ecache Tag TestTSSTTAATTEE==00000000..
0> <00> Invalidate Ecache Tags.556 00..eeffffff..ff99ee88 T
0>INFO: Processor 2 - UltraSPARC-II.0044..11440000EEDD SSttaattee EExx
0> <00> Init SC RegsT==00000000..0000000
0> <00> SC Address Reg Test0..00000000..00000000..0000
0> <00> SC Reg Index TestInt to mid ...00006644
0> <00> SC Regs Te
0> <00> Probe Memory0004444..55660044..1
0>INFO: 512MB Bank 0
0>INFO: 512MB Bank 1 RREEDD SSttaat
0>INFO: 0MB Bank 2000000..00000000..00
0>INFO: 0MB Bank 30..00000000..0000000
0> <00> Malloc Post Memory TTTT==00000000..00000000
0> <00> Init Post Memory00 ...00006644CC==000000
0> <00> Post Memory Addr Test0000..eeffffff..ff99ee44 TTn
0> <00> Map PROM/STACK/NVRAM in DMMU..eeffffff..ff99ee44 TTnnPPCC==0000
0> <00> Memory Stack Test7668 00..eeffffff..ff99ee
2> <00> DMMU TLB Tag Access Test444..55660044..11440000ff99ee88
2> <00> DMMU TLB RAM Access Test..55660044..11440000EEDD SSttaa
2> <00> IMMU TLB Tag Access Test00000..00000022 TTTT==00000000.
2> <00> IMMU TLB RAM Access Test00..00000044 TTTT==00000000..00
2> <00> Probe Ecache44.00 ...0000
0> <00> DMMU Little Endian Test00PPCC==00000000..0000000000..0
0> <00> IU ASI Access Test TTnnPPCC==00000000..00 0
0> <00> FPU ASI Access TestTAATTEE==00000000..00004444
2> <00> DMMU Hit/Miss Test60044..11440000ff99ee88 T
2> <00> IMMU Hit/Miss Test4444..55660044.. RREE
2> <00> DMMU Little Endian Testnn11 TTTT==00000000..00000000.
2> <00> IU ASI Access Test..00000000..00000000..0000
2> <00> FPU ASI Access Test ...00006644= 0000.0
2> <00> Dcache R
0> <1f> PIO Read Error, Target Abort Test00000000000..00000000..00000055 TTTT==00
0> <1f> PIO Write Error, Master Abort Test ...00006644CC==000000 TTPPCC==0000
0> <1f> PIO Write Error, Target Abort TestC==00000000..0000000000..00000000..eefffff
0> <1f> Timer Increment Test00..000000.eeffffff..ee77668
0> <00> Copy Post to MemorySTTAATTEE==00000000..000044
0> <00> Ecache Thrash Teste88 TTSSTTAATTEE==0000000
0> <00> Init Memory4..11440000EEDD SS
0> <00> Memory Addr w/ Ecache Test0000000..00000033 TTTT==00000000.
0>INFO: 512MB Bank 00000..00000000..0000
0>INFO: 512MB Bank 1.00 ...00
0>INFO: 0MB Bank 20006644
0>INFO: 0MB Bank 200000000..00000055
0>INFO: 0MB Bank 3000..0000 ...00006644
0> <00> Memory Status Test0000000..00000000..eefffff
0>INFO: 512MB Bank 0==00000000..000000PP
0>INFO: 512MB Bank 1000..00000000..eefff
0>INFO: 0MB Bank 2CC==00000000..00 00.
0>INFO: 0MB Bank 3TTSSTTAATTEE==000000
0> <00> V9 Instruction Test000044..55660044..11440000f
0> <00> CPU Tick and Tick Compare Reg Test55660044.. TTLL==00000000..00000000..0000
0> <00> CPU Soft Trap Test000..00000000..0000000000.
0> <00> CPU Softint Reg and Int Test00000044 TTTT==00000000..00
2> <00> V9 Instruction Test RREEDD SSttaattee E
0> <1f> PIO Decoder and BCT Test00..00000000..00000000 TTLL==
0> <1f> PCI Byte Enable Test.00000055 TTTT==00000000..0
0> <1f> Counter/Timer Limit Regs Test TTPPCC==00000000..00000000..eefff
0> <1f> Timer Reload Test0006644..000000PP
0> <1f> Timer Periodic Testeeffffff..ff99ee44 TTnnPPC
0> <1f> Mondo Int Map (short) Reg Test TTSSTTAATTEE==00000000..00004444..556
0> <1f> Mondo Int Set/Clr Reg Test..ff99ee88 TTSSTTAATTEE==00000000
0> <1f> Psycho IOMMU Regs TestEDD SSttaattee EExxcceepptti
0> <1f> Psycho IOMMU RAM Address Test00000 TTLL==00000000..00000000..00
0> <1f> Psycho IOMMU CAM Address Test ...00006644
0> <00> Test 0: prefetch_mr00000..0000000000..00000000
0> <00> Test 1: prefetch to non-cacheable page00000..00000000..0000 ...00006644
0> <00> Test 2: prefetch to page with dmmu misss4 TTnnPPCC==00000000..000000PPCC==00000000..000
0> <00> Test 3: prefetch miss does not check alignment00..00 00..eeffffff..ee776688 TTSSTTAATTEE==00000000.
0> <00> Test 4: prefetcha with asi 0x4c is noped00ff99ee88 TTSSTTAATTEE==00000000..00004444..55
0> <00> Test 5: prefetcha with asi 0x54 is nopedonn11 TTTT==00000000..00000000..00000000 TTL
0> <00> Test 6: prefetcha with asi 0x6e is noped00000000..00 ...00006644
0> <00> Test 20: prefetcha10_6: illegal instruction trapnnPPCC==00000000..000000PPCC==00000000..000000
0> <00> Test 21: prefetcha11_1w 00..eeffffff..
0> <00> Test 22: prefetcha81_31..00004444..55660044..114400004
0> <00> Test 23: prefetcha11_15: illegal instruction trap TTLL==00000000..00000000..00000000..00
2> <00> UltraSPARC-2 Prefetch Instructions Test000000..00000000000000..00000000..00000055 TTT
2> <00> Test 0: prefetch_mr..00006644
2> <00> Test 1: prefetch to non-cacheable page TTnnPPCC==00000000..000000PPCC==00000000..00
2> <00> Test 2: prefetch to page with dmmu misss=00000000..00 00..eeffffff..ee333388 TTSSTTAATT
2> <00> Test 3: prefetch miss does not check alignment.11440000ff99ee88 TTSSTTAATTEE==00000000..00004444..5
2> <00> Test 4: prefetcha with asi 0x4c is noped000011 TTTT==00000000..00000000..0000000000..00
2> <00> Test 5: prefetcha with asi 0x54 is noped=00000000..00 ...00006644 RREEDD SSttaatt
2> <00> Test 10: prefetch with fcn 1200..00000000..00000000 TTLL==00000
2> <00> Test 11: prefetch with fcn 16 is noped00..00000000..000000000006644
2> <00> Test 12: prefetch with fcn 29 is noped TTnnPPC ...00006644..000000PP TTPPCC==
Master CPU online
Master Version: 0000.0000.1700.11a0
Slave Version: 0000.0000.1700.11a0
CPU E$ (M) 0000.0000.0040.0000 (S) 0000.0000.0040.0000
@(#) UPA/PCI 3.23 Version 1 created 1999/07/16 12:08
Clearing DTAGS Done
Probing Memory Done
MEM BASE = 0000.0000.2000.0000
MEM SIZE = 0000.0000.2000.0000
MMUs ON
Copy Done
PC = 0000.01ff.f000.2800
PC = 0000.0000.0000.2844
Decompressing into Memory Done
Size = 0000.0000.0006.eb80
ttya initialized
SC Control: EWP:0 IAP:0 FATAL:0 WAKEUP:0 BXIR:0 BPOR:0 SXIR:0 SPOR:1 POR:0
Probing Memory Bank #0 128 128 128 128 : 512 Megabytes
Probing Memory Bank #1 128 128 128 128 : 512 Megabytes
Probing Memory Bank #2 0 0 0 0 : 0 Megabytes
Probing Memory Bank #3 0 0 0 0 : 0 Megabytes
Data Access Error
ok probe-scsi
ok probe-scsi-all
ok boot
Boot device: net File and args:
Can't open boot device
ok |
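Much of the noise in the capture above looks like text in which every character was printed twice (for example `TTSSTTAATTEE==` and `RREEDD SSttaattee`). Assuming that is simply two writers sending the same bytes to ttya at once, which is a guess and not something the log confirms, a short Python sketch can collapse those runs so the fragments are easier to read. The function name `dedouble` and the three-pair threshold are arbitrary choices for illustration.

```python
import re

# Three or more consecutive "xx" pairs of non-space characters, e.g. "TTSSTTAATTEE".
DOUBLED_RUN = re.compile(r"(?:(\S)\1){3,}")


def _collapse(match):
    collapsed = match.group(0)[::2]  # keep the first character of each pair
    # Leave purely numeric runs such as "00000000" alone: they may be real hex.
    return collapsed if any(c.isalpha() for c in collapsed) else match.group(0)


def dedouble(text):
    """Collapse runs of doubled characters that contain at least one letter."""
    return DOUBLED_RUN.sub(_collapse, text)
```

Applied to fragments from the log, `dedouble("TTSSTTAATTEE==00000000..000")` gives `TSTATE=0000.00`, and pieces such as `RREEDD SSttaattee` come out as `RED State`. Clean lines like `Master Version: 0000.0000.1700.11a0` pass through unchanged, but the heuristic will still mangle any genuine text that happens to contain three doubled pairs in a row, so treat its output as a reading aid only.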
| |||
| Just to draw a line under this one - we've managed to trace the error to a CPU. Remove that CPU and everything works just fine, first time. Remove the other devices, and the CPU still fails POST all on its own. Thanks for the help on this one. niall |
| ||||
| Niall Dalton wrote:
> Just to draw a line under this one - we've managed to trace the error to
> a CPU. Remove that CPU and everything works just fine, first time.
> Remove the other devices, and the CPU still fails POST all on its own.
>
> Thanks for the help on this one.
>
> niall

Well done Niall, this type of problem is pesky at best. |