We have DiskSuite set up on this server. Last week, parts of our Oracle database were in a hung state, and the server had to be rebooted to get it working again. After the reboot it took almost an hour before it came back up. It turned out that 4 disk drives were kaput and had to be replaced.

Although it was natural to suspect that these disk errors caused the DB to hang, no errors were recorded _prior_ to the reboot. (I found disk errors in some historical logs, but these were between Apr 21st and May 26th; in the 48 days before the DB hang, no disk errors were reported.)

So far, what the investigation shows is that for some reason the DB didn't write a generated file to disk. Whether this was the cause, or whether something happened in the DB prior to this, is still unknown.

I've checked some more, and the metadevice it tried to write to was d5, consisting of submirrors d15 and d25. Neither d15 nor d25 was on one of the failed disks, but when running iostat -e I notice (output truncated):

            ---- errors ---
    device  s/w  h/w  trn  tot
    sd1       0    0    0    0
    sd61      0   18   86  104

sd61 is submirror d25 and is obviously experiencing errors. This was not discovered when the failed disks were replaced, so my questions are:

1. Are these errors severe?
2. Should I look into getting that disk replaced immediately?
3. *Could* those errors have played a part in the DB failing to write the file to disk, even though nothing is recorded in syslog? (I'm just speculating: a sync operation that hung, or something?)

Since DiskSuite is pretty much uncharted territory for me, all suggestions and input are welcome.

--
Stig Bull
remove .no.spam from my email address to reply by mail

No animals were hurt or killed in the process of creating this electronic message. To reduce download time, this message is made of 100% recycled bytes.
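Since the sd61 errors went unnoticed when the other disks were replaced, a check like the iostat -e one above could be scripted so that nonzero counters get flagged. The following is only a minimal sketch in Python, assuming the four-column layout quoted in the post (s/w, h/w, trn, tot). The function names and the sample text are hypothetical, and real iostat output may differ between Solaris releases.

```python
def parse_iostat_errors(text):
    """Parse `iostat -e`-style text into {device: {"s/w", "h/w", "trn", "tot"}}.

    Assumes each data row is a device name followed by four integer counters,
    as in the output quoted in the post; header rows are skipped.
    """
    fields = ("s/w", "h/w", "trn", "tot")
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only rows of the form: <device> <int> <int> <int> <int>
        if len(parts) != 5 or not all(p.isdigit() for p in parts[1:]):
            continue
        result[parts[0]] = dict(zip(fields, map(int, parts[1:])))
    return result


def devices_with_errors(text):
    """Return the sorted names of devices whose total error count is nonzero."""
    counters = parse_iostat_errors(text)
    return sorted(dev for dev, c in counters.items() if c["tot"] > 0)


# Sample text copied from the (truncated) output in the post.
SAMPLE = """\
        ---- errors ---
device  s/w  h/w  trn  tot
sd1       0    0    0    0
sd61      0   18   86  104
"""

if __name__ == "__main__":
    for dev in devices_with_errors(SAMPLE):
        print(dev, parse_iostat_errors(SAMPLE)[dev])
```

In practice the text would come from running iostat -e on the server (for example via subprocess) rather than from a hard-coded sample. Mapping a flagged sd device back to its DiskSuite submirror still has to be done by hand.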