Disk error with initial RAID 1 Sync
| ||||
Hi,

I have a 1U server with dual Western Digital 80 GB HDs, both on IDE Channel 1. I would like to set both of these drives up in a RAID 1 configuration. I am using Gentoo Linux and have been going through the configuration process. Previously, the server was running RH8 with RAID 1, and I noticed that one drive was down. Before doing the full install of Gentoo, I ran a battery of tests on the drives with Western Digital's tools. Both hard drives passed without a single problem.

Now I am configuring Gentoo and am finding that during the initial RAID 1 sync I get an error on one of the drives. I am stumped at this point because, as far as I can tell, the drives are good and should not be giving me this problem. If it helps, the MoBo is an MSI-6378, and the machine has 1 GB of RAM and an Athlon XP 2000+ CPU.

Here are the exact details of everything I did and the error messages I received during the sync. I re-ran this this afternoon, so it is fresh off the machine.

I fdisked both drives with exactly matching partitions. Here is what the config looks like:

    Disk /dev/hda: 80.0 GB, 80026361856 bytes
    255 heads, 63 sectors/track, 9729 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot    Start       End    Blocks   Id  System
    /dev/hda1   *         1         5     40131   fd  Linux raid autodetect
    /dev/hda2             6        68    506047+  82  Linux swap
    /dev/hda3            69      9729  77601982+  fd  Linux raid autodetect

    Command (m for help):

(It is the same for both drives, so all you need to do is replace hda with hdb.)

I loaded the RAID kernel module and then created the following raidtab:

    # /boot (RAID 1)
    raiddev /dev/md0
        raid-level              1
        nr-raid-disks           2
        chunk-size              32
        persistent-superblock   1
        device                  /dev/hda1
        raid-disk               0
        device                  /dev/hdb1
        raid-disk               1

    # / (RAID 1)
    raiddev /dev/md2
        raid-level              1
        nr-raid-disks           2
        chunk-size              32
        persistent-superblock   1
        device                  /dev/hda3
        raid-disk               0
        device                  /dev/hdb3
        raid-disk               1

I then began the sync process using the command mkraid /dev/md*, where * is either 0 or 2. The initial sync of md0 goes without a hitch, though I have to use mkraid -R since there are remnants from the last sync on the disks. I then begin to sync md2. Here is the output of /proc/mdstat during the md2 sync:

    Personalities : [raid1]
    read_ahead 1024 sectors
    md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1] ide/host0/bus0/target0/lun0/part3[0]
          77601856 blocks [2/2] [UU]
          [=======>.............]  resync = 37.5% (29110720/77601856) finish=36.7min speed=21993K/sec
    md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
          40064 blocks [2/2] [UU]

    unused devices: <none>

During the sync of md2, the following errors appear. (I do not know exactly when this occurred, but I know that it is after syncing at least 80% of the drive.)

    hdb: dma_timer_expiry: dma status == 0x61
    hdb: timeout waiting for DMA
    (repeats again)
    hdb: (__ide_dma_test_irq) called while not waiting
    hda: status timeout: status=0xd0 { Busy }
    hda: drive not ready for command
    ide0: reset: success
    hdb: irq timeout: status=0xd0 { Busy }
    (2 more times)
    end_request: I/O error, dev 03:43 (hdb), sector 138982783
    raid1: Disk failure on ide/host0/bus0/target1/lun0/part3, disabling device
            Operation continuing on 1 devices
    hdb: status timeout: status=0xd0 { Busy }
    hdb: drive not ready for command
    ide0: reset: success
    md2: no spare disk to reconstruct array! -- continuing in degraded mode
    hdb: irq timeout: status=0xd0 { Busy }
    ide0: reset: success
    hdb: irq timeout: status=0xd0 { Busy }
    (these two lines are repeated once more)
    end_request: I/O error, dev 03:43 (hdb), sector 138982911

And now cat /proc/mdstat looks like this:

    Personalities : [raid1]
    read_ahead 1024 sectors
    md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
          77601856 blocks [2/1] [U_]
    md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
          40064 blocks [2/2] [UU]

    unused devices: <none>

The question I have now is what to troubleshoot. This feels like some sort of hardware problem, but I am not sure where to even start, since the disks passed all tests. Does anyone have any thoughts about this?

TIA for any advice,
JL
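For anyone who wants to spot this degraded state from a script rather than by eye, here is a minimal Python sketch. It assumes the older /proc/mdstat layout shown in the post (member list, then `blocks [n/m] [U_]`); the names `parse_mdstat`, `ARRAY_RE` and `MEMBER_RE` are made up for illustration, and newer kernels print extra fields that this pattern does not handle.

```python
import re

# Sample text copied from the degraded /proc/mdstat output in the post above.
MDSTAT = """\
Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/1] [U_]
md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
"""

# One array entry: name, RAID level, member list, size, [expected/active], [U_] flags.
ARRAY_RE = re.compile(
    r"^(md\d+) : active (\S+) (.+?)\s+(\d+) blocks \[(\d+)/(\d+)\] \[([U_]+)\]",
    re.MULTILINE | re.DOTALL,
)
# One member: device name, slot number, optional (F) marker for a failed device.
MEMBER_RE = re.compile(r"(\S+)\[(\d+)\](\(F\))?")


def parse_mdstat(text):
    """Return a dict of array name -> summary, flagging degraded arrays."""
    arrays = {}
    for match in ARRAY_RE.finditer(text):
        name, level, members, blocks, expected, active, flags = match.groups()
        arrays[name] = {
            "level": level,
            "blocks": int(blocks),
            "expected": int(expected),
            "active": int(active),
            "failed": [dev for dev, _slot, bad in MEMBER_RE.findall(members) if bad],
            "degraded": "_" in flags,
        }
    return arrays


if __name__ == "__main__":
    for name, info in parse_mdstat(MDSTAT).items():
        state = "DEGRADED" if info["degraded"] else "ok"
        print(name, state, info["failed"])
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the sample string.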
| |||
jl_678@hotmail.com (Jay L) writes:

>Hi,
>I have a 1U server with dual Western Digital 80 GB HDs both on IDE
>Channel 1. I would like to set both of these drives up in a RAID 1
>configuration.

You can do that, but I would strongly recommend putting the drives on separate IDE channels. Not only will performance suffer drastically if you keep the drives in a MASTER/SLAVE setup on a single IDE channel, it will eventually cause problems if one of the drives starts to fail, as that might result in both drives becoming unavailable at the same time.

[...]

>hdb: dma_timer_expiry: dma status == 0x61
>hdb: timeout waiting for DMA ( repeats again )
>hdb: (__ide_dma_test_irq) called while not waiting
>hda: status timeout: status=0xd0 { Busy }

[...]

It seems that /dev/hdb has DMA problems. This may or may not be the result of the drives being connected to the same IDE channel; DMA problems may also indicate faulty hardware, low-quality cables or driver problems. At the very least, try this setup with /dev/hda and /dev/hdc (that is, putting the SLAVE drive as MASTER on the secondary IDE channel).

Michael
-- 
Michael Buchenrieder * mibu@scrum.greenie.muc.de * http://www.muc.de/~mibu
Lumber Cartel Unit #456 (TINLC) & Official Netscum
Note: If you want me to send you email, don't munge your address.
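The log excerpt from the original report can be tallied per drive to see that both devices on the shared channel report timeouts, and the `dev 03:43` field can be decoded to confirm which partition the I/O error hit. What follows is only a rough Python sketch: it assumes the classic Linux IDE numbering (major 3 for the first channel, 64 minors per drive) and the hexadecimal `major:minor` form seen in these messages, and the function names are hypothetical.

```python
import re
from collections import Counter

# Kernel messages as quoted in the original post (repeat notes left out).
LOG = """\
hdb: dma_timer_expiry: dma status == 0x61
hdb: timeout waiting for DMA
hdb: (__ide_dma_test_irq) called while not waiting
hda: status timeout: status=0xd0 { Busy }
hda: drive not ready for command
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy }
end_request: I/O error, dev 03:43 (hdb), sector 138982783
"""


def messages_per_drive(log):
    """Count log lines that start with an IDE drive name such as 'hda:'."""
    return Counter(re.findall(r"^(hd[a-z]):", log, re.MULTILINE))


def decode_ide0_dev(dev):
    """Map a hex 'major:minor' pair on IDE major 3 to a device name.

    Assumption: minors 0-63 belong to hda and 64-127 to hdb, and the offset
    within each range is the partition number (0 means the whole disk).
    """
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != 3 or not 0 <= minor < 128:
        raise ValueError("this sketch only handles hda/hdb on IDE major 3")
    drive = "hda" if minor < 64 else "hdb"
    partition = minor % 64
    return drive if partition == 0 else f"{drive}{partition}"


if __name__ == "__main__":
    print(messages_per_drive(LOG))
    print(decode_ide0_dev("03:43"))
```

With the excerpt above, the tally shows messages from both hda and hdb, and `03:43` decodes to hdb3, the slave-side member of md2.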
| |||
Believe it or not, I think that I found a solution for the problem. I pulled the drives, reviewed the jumpers, and found that they were in "Cable Select" mode. According to the people who sold me the machine, this is the normal configuration and allows the system to automatically select master and slave depending on each drive's position in the IDE chain. I opted to change this and manually set master and slave (master for the last drive in the chain and slave for the middle drive). After making this change, my problems have disappeared! I will keep my eyes on it in case it is not a permanent solution, but so far so good.

I wanted to post this here in case anyone else experiences this issue.

JL
| ||||
On 8 Jan 2004 11:01:55 -0800, jl_678@hotmail.com (Jay L) wrote:

>Believe it or not, I think that I found a solution for the problem. I
>pulled the drives and reviewed the jumpers and found that they were in
>"Cable Select" mode. According to the people who sold me the machine,
>this is the normal configuration and allows the system to
>automatically select a master/slave depending on the position in the
>IDE chain. I opted to change this and manually chose master and slave.
>(Master for the last drive in the chain and slave for the middle
>drive.) After making this change, my problems have disappeared! I will
>keep my eyes on it in case it is not a permanent solution, but so far
>so good.

Your dealer is obviously a 15 year old kid. CS is a fairly recent development, and it works every bit as well and consistently as plug & pray.

Repeat after me: I will NEVER allow the computer to make an important decision. I will do it myself, so I know that it is done properly.

Cable Select, Plug & Play... standards would be wonderful, if only everyone implemented the same ones the same way!

Mike-

Mornings: Evolution in action. Only the grumpy will survive.