Hi,

My SB1K (dual 750MHz/4GB, running S9 08/03, patched late last year) just crashed for no apparent reason. I have appended the relevant /var/adm/messages log at the end of the post. I tried using adb, which came up with the following:

# adb -k unix.0 vmcore.0
physmem 7bcfa

I also tried running iscda (which a number of NG threads mentioned); however, the script went into a loop spewing out errors. I guess I could have tried mdb, but I must confess that I have neither the time nor the inclination =)

Does anyone have any idea what may have caused the problem?

Thanks and regards,

Andrew

Jan 26 00:07:48 horus SUNW,UltraSPARC-III: [ID 326222 kern.warning] WARNING: [AFT1] Timeout (TO) Event detected by CPU0 Privileged Data Access at TL=0, errID 0x00000c74.79225e80
Jan 26 00:07:48 horus AFSR 0x00001000<TO>.00000000 AFAR 0x000007f8.00610900
Jan 26 00:07:48 horus Fault_PC 0x134061c
Jan 26 00:07:48 horus unix: [ID 836849 kern.notice]
Jan 26 00:07:48 horus ^Mpanic[cpu0]/thread=2a10007dd40:
Jan 26 00:07:48 horus unix: [ID 855895 kern.notice] [AFT1] errID 0x00000c74.79225e80 TO Error(s)
Jan 26 00:07:48 horus See previous message(s) for details
Jan 26 00:07:48 horus unix: [ID 100000 kern.notice]
Jan 26 00:07:48 horus genunix: [ID 723222 kern.notice] 000002a10007d360 SUNW,UltraSPARC-III:cpu_aflt_log+5bc (2a10007d46b, 1, 2a10007d678, 10, 117be60, 117be88)
Jan 26 00:07:49 horus genunix: [ID 179002 kern.notice] %l0-3: 0000000001491ea0 0000000000000010 0000000000000003 000002a10007d678
Jan 26 00:07:49 horus %l4-7: 0000100000000000 0000000000000000 000002a10007d5a8 000002a10007d41e
Jan 26 00:07:49 horus genunix: [ID 723222 kern.notice] 000002a10007d5b0 SUNW,UltraSPARC-III:cpu_deferred_error+4d0 (0, 1, 4000100003200000, 40001000, 7f8, 1)
Jan 26 00:07:49 horus genunix: [ID 179002 kern.notice] %l0-3: 000002a10007d678 0000000400000000 4000100003200000 0000030001531928
Jan 26 00:07:49 horus %l4-7: 0000000000000001 000002a10007d9f0 0000030001855e90 0000000080000000
Jan 26 00:07:49 horus genunix: [ID 723222 kern.notice] 000002a10007d940 unix:ktl0+48 (1, 1c000, 10220, 10278, 10290, 10270)
Jan 26 00:07:49 horus genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000001400 0000004400001607 0000000001172880
Jan 26 00:07:49 horus %l4-7: 00000300018b62b8 0000030001855ea0 0000000000000005 000002a10007d9f0
Jan 26 00:07:49 horus genunix: [ID 723222 kern.notice] 000002a10007da90 pcisch:pci_intr_wrapper+7c (300003d5818, 22a, 1400000, 2a10007dd40, 4540, 13405f0)
Jan 26 00:07:49 horus genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 000003000154e000 00000300016b8000 0000000000000001
Jan 26 00:07:49 horus %l4-7: 000003000154e028 000003000000e6d8 00000000014a8800 00000000014a8800
Jan 26 00:07:50 horus unix: [ID 100000 kern.notice]
Jan 26 00:07:50 horus genunix: [ID 672855 kern.notice] syncing file systems...
Jan 26 00:08:20 horus unix: [ID 836849 kern.notice]
Jan 26 00:08:20 horus ^Mpanic[cpu0]/thread=2a10007dd40:
Jan 26 00:08:20 horus unix: [ID 715357 kern.notice] panic sync timeout
Jan 26 00:08:20 horus unix: [ID 100000 kern.notice]
Jan 26 00:08:20 horus genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1t1d0s1, offset 859373568, content: kernel
Jan 26 00:08:21 horus genunix: [ID 409368 kern.notice] ^M100% done: 35873 pages dumped, compression ratio 3.67,
Jan 26 00:08:21 horus genunix: [ID 851671 kern.notice] dump succeeded
| ||||
Andrew Tyson wrote:
> Hi,
>
> My SB1K ( dual 750MHz/4GB running S9 08/03 patched late last year) just
> crashed for no apparent reason. I have appended the relevant
> /var/adm/messages log at the end of the post. I tried using adb which
> came up with the following;
>
> Jan 26 00:07:48 horus SUNW,UltraSPARC-III: [ID 326222 kern.warning] WARNING: [AFT1] Timeout (TO) Event detected by CPU0 Privileged Data Access at TL=0, errID 0x00000c74.79225e80
> Jan 26 00:07:48 horus AFSR 0x00001000<TO>.00000000 AFAR 0x000007f8.00610900
> Jan 26 00:07:48 horus Fault_PC 0x134061c
> Jan 26 00:07:48 horus unix: [ID 836849 kern.notice]
> Jan 26 00:07:48 horus ^Mpanic[cpu0]/thread=2a10007dd40:
> Jan 26 00:07:48 horus unix: [ID 855895 kern.notice] [AFT1] errID 0x00000c74.79225e80 TO Error(s)

This is a TO (timeout) error from the Safari data bus: an access to an address that claimed to be mapped (i.e., some device claims to have implemented that physical address range) did not return data within the expected time.

The fault address, 0x000007f8.00610900, looks to me like a device address rather than real cacheable memory, but I don't know which device it would map to. My guess is that there's a bad or marginal PCI card.

Incidentally, Solaris 10 makes all this hugely more elegant.

Gavin
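
For anyone wanting to redo the arithmetic behind that reading, here is a minimal Python sketch. It assumes only what the log itself shows: the AFSR and AFAR are printed as two 32-bit hex halves separated by a dot, with the flag name in angle brackets after the first half. The helper name parse_split_hex and the comparison against the 4 GB mentioned in the original post are illustrative choices, not part of any Solaris tool.

```python
import re

# The AFSR/AFAR line exactly as it appears in the posted panic log.
LOG_LINE = ("Jan 26 00:07:48 horus AFSR 0x00001000<TO>.00000000 "
            "AFAR 0x000007f8.00610900")


def parse_split_hex(text, name):
    """Return the 64-bit value printed after `name` as 0xHHHHHHHH[<flags>].LLLLLLLL.

    Hypothetical helper: it only understands the split-hex layout seen
    in this particular log line.
    """
    pattern = name + r"\s+0x([0-9a-fA-F]{8})(?:<[^>]*>)?\.([0-9a-fA-F]{8})"
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"{name} not found in log line")
    high = int(match.group(1), 16)
    low = int(match.group(2), 16)
    return (high << 32) | low


afsr = parse_split_hex(LOG_LINE, "AFSR")
afar = parse_split_hex(LOG_LINE, "AFAR")

# 4 GB of installed RAM, as stated in the original post.
INSTALLED_RAM = 4 * 2**30

print(f"AFSR = {afsr:#018x}")
print(f"AFAR = {afar:#018x}")
print(f"AFAR / installed RAM = {afar / INSTALLED_RAM:.1f}")
```

The fault address comes out far above the 4 GB of RAM in the machine. That alone does not prove it is device space, since physical memory need not be mapped contiguously from address zero, but it is consistent with Gavin's reading that the access went to a device rather than to ordinary memory.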