Unix Technical Forum

Disk error with initial RAID 1 Sync

#1  01-17-2008, 05:19 PM  Jay L
Disk error with initial RAID 1 Sync

Hi,

I have a 1U server with dual Western Digital 80 GB HDs both on IDE
Channel 1. I would like to set both of these drives up in a RAID 1
configuration. I am using Gentoo Linux and have been going through
the configuration process. Previously, the server was running RH8
with RAID1 and I noticed that one drive was down. Before doing the
full install of Gentoo, I ran a battery of tests on the drives with
Western Digital's tools. Both hard drives passed without a single
problem.

Now I am configuring Gentoo and am finding that during the initial
RAID1 sync, I get an error on one of the drives. I am stumped at
this point because as far as I can tell the drives are good and should
not be giving me this problem. If it helps, the MoBo is an MSI-6378
and the machine has 1 GB of RAM and is using Athlon-XP 2000+ CPU.

Here are the exact details of everything I did and the resulting error
messages that I received during the sync. I re-ran this this afternoon,
so this is fresh off the machine.

I fdisked both drives with exactly matching partitions. Here is what
the config looks like:

Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End      Blocks   Id  System
/dev/hda1   *         1         5       40131   fd  Linux raid autodetect
/dev/hda2             6        68      506047+  82  Linux swap
/dev/hda3            69      9729    77601982+  fd  Linux raid autodetect

Command (m for help):

(It is the same for both so all you need to do is replace hda with
hdb)
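
As a sanity check on the partition table, the block counts can be reproduced from the cylinder ranges and the geometry in the fdisk header. This is only a minimal Python sketch: the function names are made up for illustration, and it assumes the usual DOS-label convention that the first partition starts one track (63 sectors) into the disk.

```python
# Geometry taken from the fdisk header: 255 heads, 63 sectors/track.
SECTORS_PER_CYLINDER = 255 * 63  # 16065 sectors of 512 bytes


def partition_sectors(start_cyl, end_cyl, skipped_sectors=0):
    """Number of 512-byte sectors in an inclusive cylinder range.

    skipped_sectors models the assumption that the first partition
    begins after the first track instead of at sector 0.
    """
    return (end_cyl - start_cyl + 1) * SECTORS_PER_CYLINDER - skipped_sectors


def fdisk_blocks(start_cyl, end_cyl, skipped_sectors=0):
    """Format the size the way fdisk does: 1 KiB blocks, with a '+'
    when an odd sector is left over."""
    sectors = partition_sectors(start_cyl, end_cyl, skipped_sectors)
    return f"{sectors // 2}{'+' if sectors % 2 else ''}"


if __name__ == "__main__":
    print("hda1", fdisk_blocks(1, 5, skipped_sectors=63))
    print("hda2", fdisk_blocks(6, 68))
    print("hda3", fdisk_blocks(69, 9729))
```

Under those assumptions the three results match the Blocks column in the table, so the layout itself looks consistent.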

I loaded the RAID Kernel module and then created the following
raidtab:

# /boot (RAID 1)
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/hda1
raid-disk 0
device /dev/hdb1
raid-disk 1

# / (RAID 1)
raiddev /dev/md2
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/hda3
raid-disk 0
device /dev/hdb3
raid-disk 1

I then went ahead and began the synch process using the command mkraid
/dev/md*, where * is either 0 or 2. The initial synch of md0 goes
without a hitch, though I have to use mkraid -R since there are
remnants from the last synch on the disks. I then begin to synch md2.
Here is the output of /proc/mdstat during the md2 synch:


Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1] ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/2] [UU]
      [=======>.............] resync = 37.5% (29110720/77601856) finish=36.7min speed=21993K/sec
md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
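
The resync figures in that output are internally consistent, which can be checked with a little arithmetic. The following Python sketch is illustrative only: `resync_estimate` is a made-up helper, and it assumes the block counts are in 1 KiB units and the speed in KiB per second, as the output suggests.

```python
import re

PROGRESS = re.compile(r"resync\s*=\s*[\d.]+%\s*\((\d+)/(\d+)\)")
SPEED = re.compile(r"speed=(\d+)K/sec")


def resync_estimate(mdstat_text):
    """Return (percent_done, minutes_remaining), or None if no resync runs."""
    # Collapse whitespace so fields wrapped onto separate lines still match.
    flat = " ".join(mdstat_text.split())
    progress = PROGRESS.search(flat)
    speed = SPEED.search(flat)
    if not progress or not speed:
        return None
    done, total = int(progress.group(1)), int(progress.group(2))
    kb_per_sec = int(speed.group(1))
    percent = 100.0 * done / total
    minutes = (total - done) / kb_per_sec / 60.0
    return percent, minutes


SAMPLE = """
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1]
ide/host0/bus0/target0/lun0/part3[0]
77601856 blocks [2/2] [UU]
[=======>.............] resync = 37.5% (29110720/77601856)
finish=36.7min

speed=21993K/sec
"""

if __name__ == "__main__":
    percent, minutes = resync_estimate(SAMPLE)
    print(f"{percent:.1f}% done, about {minutes:.1f} min left")
```

Recomputing from the raw counters gives roughly the same 37.5% and 36.7 minutes that the kernel reported, so the array was syncing at a normal rate until the errors started.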


During the synch process on md2, the following errors appear (I do
not know exactly when this occurred, but I know that it is after
synching at least 80% of the drive):

hdb: dma_timer_expiry: dma status == 0x61
hdb: timeout waiting for DMA ( repeats again )
hdb: (__ide_dma_test_irq) called while not waiting
hda: status timeout: status=0xd0 { Busy }

hda: drive not ready for command
ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy } (2 more times)

end_request: I/O error, dev 03:43 (hdb), sector 138982783
raid1: Disk failure on ide/host0/bus0/target1/lun0/part3, disabling device
Operation continuing on 1 devices
hdb: status timeout: status=0xd0 { Busy }

hdb: drive not ready for command
ide0: reset: success
md2: no spare disk to reconstruct array! -- continuing in degraded mode
hdb: irq timeout: status=0xd0 { Busy }

ide0: reset: success
hdb: irq timeout: status=0xd0 { Busy } (these two lines are repeated once more)

end_request: I/O error, dev 03:43 (hdb), sector 138982911
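
Two of the fields in those messages can be decoded by hand. The Python sketch below is a rough aid, not kernel code: the helper names are invented, and it assumes the standard ATA status register bit layout and the classic Linux IDE device numbering (major 3 for hda/hdb, major 22 for hdc/hdd, 64 minors per drive, with the pair printed in hex).

```python
# ATA status register bits, most significant first.
STATUS_BITS = [
    (0x80, "BSY"),   # drive is busy
    (0x40, "DRDY"),  # drive ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]

# Assumed IDE major numbers and the two drives each one covers.
IDE_MAJORS = {3: ("hda", "hdb"), 22: ("hdc", "hdd")}


def decode_status(status):
    """List the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def decode_ide_dev(dev):
    """Turn a 'major:minor' hex pair such as '03:43' into (drive, partition)."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    drives = IDE_MAJORS[major]
    return drives[minor // 64], minor % 64


if __name__ == "__main__":
    print(decode_status(0xD0))      # ['BSY', 'DRDY', 'DSC']
    print(decode_ide_dev("03:43"))  # ('hdb', 3)
```

So status=0xd0 has the busy bit set on both drives (the other bits are not meaningful while a drive is busy), and dev 03:43 is the third partition on hdb, which is the md2 member that gets kicked out of the array.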

And now cat /proc/mdstat looks like this:

Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F) ide/host0/bus0/target0/lun0/part3[0]
      77601856 blocks [2/1] [U_]

md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1] ide/host0/bus0/target0/lun0/part1[0]
      40064 blocks [2/2] [UU]

unused devices: <none>
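
For anyone who wants to spot this state automatically, a degraded array shows up as fewer active members than configured (the [2/1] counter) and a member flagged (F). The Python sketch below is only one possible way to read it: `degraded_arrays` is a hypothetical helper written against the old-style /proc/mdstat text shown in this post, not against any md tool's API.

```python
import re

# One chunk per array: from a line starting "mdN :" up to the next such line.
ARRAY = re.compile(r"(?ms)^(md\d+)\s*:(.*?)(?=^md\d+\s*:|\Z)")
STATE = re.compile(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]")
FAILED = re.compile(r"(\S+)\[\d+\]\(F\)")


def degraded_arrays(mdstat_text):
    """Map each degraded md device to the members marked (F)."""
    report = {}
    for match in ARRAY.finditer(mdstat_text):
        name, body = match.group(1), match.group(2)
        state = STATE.search(body)
        if not state:
            continue
        configured, active = int(state.group(1)), int(state.group(2))
        if active < configured:
            report[name] = FAILED.findall(body)
    return report


SAMPLE = """Personalities : [raid1]
read_ahead 1024 sectors
md2 : active raid1 ide/host0/bus0/target1/lun0/part3[1](F)
ide/host0/bus0/target0/lun0/part3[0]
77601856 blocks [2/1] [U_]

md0 : active raid1 ide/host0/bus0/target1/lun0/part1[1]
ide/host0/bus0/target0/lun0/part1[0]
40064 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On the output above this reports only md2, with the target1 (hdb) partition as the failed member; md0 is still fully mirrored.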

The question I have now is what to troubleshoot. This feels like some
sort of hardware problem, but I am not sure where to even start since
the disks passed all tests. Anyone have any thoughts about this?

TIA for any advice

JL
#2  01-17-2008, 05:20 PM  Michael Buchenrieder
Re: Disk error with initial RAID 1 Sync

jl_678@hotmail.com (Jay L) writes:

>Hi,


>I have a 1U server with dual Western Digital 80 GB HDs both on IDE
>Channel 1. I would like to set both of these drives up in a RAID 1
>configuration.


You can do that, but I would strongly recommend putting the drives
on separate IDE channels. Not only will the performance suffer
drastically if you keep the drives in a MASTER/SLAVE setup on
a single IDE channel, it will eventually cause problems if one of
the drives starts to fail - as that might result in both drives
becoming unavailable at the same time.

[...]

>hdb: dma_timer_expiry: dma status == 0x61
>hdb: timeout waiting for DMA ( repeats again )
>hdb: (__ide_dma_test_irq) called while not waiting
>hda: status timeout: status=0xd0 { Busy }


[...]

Seems that /dev/hdb has DMA problems. This may or may not be the
result of the drives being connected to the same IDE channel;
DMA problems may also indicate faulty hardware, low-quality
cables or driver problems. At the very least, try this setup
with /dev/hda and /dev/hdc (that is, putting the SLAVE drive
as MASTER on the secondary IDE channel).

Michael
--
Michael Buchenrieder * mibu@scrum.greenie.muc.de * http://www.muc.de/~mibu
Lumber Cartel Unit #456 (TINLC) & Official Netscum
Note: If you want me to send you email, don't munge your address.
#3  01-17-2008, 05:27 PM  Jay L
Re: Disk error with initial RAID 1 Sync

Believe it or not, I think that I found a solution for the problem. I
pulled the drives and reviewed the jumpers and found that they were in
"Cable Select" mode. According to the people who sold me the machine,
this is the normal configuration and allows the system to
automatically select a master/slave depending on the position in the
IDE chain. I opted to change this and manually chose master and slave.
(Master for the last drive in the chain and slave for the middle
drive.) After making this change, my problems have disappeared! I will
keep my eyes on it in case it is not a permanent solution, but so far
so good.

I wanted to post this here in case anyone else experiences this issue.

JL

#4  01-17-2008, 05:28 PM  Michael W. Cocke
Re: Disk error with initial RAID 1 Sync

On 8 Jan 2004 11:01:55 -0800, jl_678@hotmail.com (Jay L) wrote:

>Believe it or not, I think that I found a solution for the problem. I
>pulled the drives and reviewed the jumpers and found that they were in
>"Cable Select" mode. According to the people who sold me the machine,
>this is the normal configuration and allows the system to
>automatically select a master/slave depending on the position in the
>IDE chain. I opted to change this and manually chose master and slave.
>(Master for the last drive in the chain and slave for the middle
>drive.) After making this change, my problems have disappeared! I will
>keep my eyes on it in case it is not a permanent solution, but so far
>so good.


Your dealer is obviously a 15 year old kid. CS (Cable Select) is a
fairly recent development and it works every bit as well and
consistently as plug & pray.

Repeat after me: I will NEVER allow the computer to make an important
decision, I will do it myself, so I know that it is done properly.

Cable Select, Plug & Play... standards would be wonderful, if only
everyone implemented the same ones the same way!

Mike-
Mornings: Evolution in action. Only the grumpy will survive.