01-16-2008, 08:19 AM
Georges Tomazi
 
Re: LVM / DiskSuite question


Peter -

On Sun, 30 Jan 2005 17:54:11 +0100, Peter C. Tribble wrote
(in article <ctj3fj$f0a$1@helium.hgmp.mrc.ac.uk>):

[...]

> So why did disksuite fail it? It must have had some reason for doing so that
> is likely to be logged somewhere. Disksuite is pretty aggressive at failing
> devices (it doesn't allow many errors before chucking it out completely) but
> I've always seen at least one error that explains it.


I found that error message in the logs:

Jan 29 22:45:40 gbr2-p40 scsi: [ID 107833 kern.warning] WARNING: /pci@1f,0/pci@1,1/scsi@2/sd@2,0 (sd2):
Jan 29 22:45:40 gbr2-p40 SCSI transport failed: reason 'reset': retrying command
Jan 29 22:45:48 gbr2-p40 md_stripe: [ID 641072 kern.warning] WARNING: md: d33: read error on /dev/dsk/c0t2d0s3
Jan 29 22:45:48 gbr2-p40 md_mirror: [ID 842313 kern.info] NOTICE: md: d33: B_FAILFAST I/O retry
Jan 29 22:46:00 gbr2-p40 md_stripe: [ID 641072 kern.warning] WARNING: md: d33: read error on /dev/dsk/c0t2d0s3
Jan 29 22:46:00 gbr2-p40 md_mirror: [ID 104909 kern.warning] WARNING: md: d33: /dev/dsk/c0t2d0s3 needs maintenance
Jan 29 22:46:00 gbr2-p40 md_stripe: [ID 241980 kern.notice] NOTICE: md: d33: hotspared device /dev/dsk/c0t2d0s3 with /dev/dsk/c0t4d0s7
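
To make the sequence easier to follow, here is a minimal Python sketch that pulls the md events out of the excerpt above and counts read errors per slice. It is only an illustration: the names `LOG`, `MD_LINE`, `md_events` and `read_errors_per_slice` are made up for this example, the wrapped log lines are joined back into single lines, and the pattern is assumed to fit only the message layout shown here, not every Solaris syslog format.

```python
import re
from collections import Counter

# The log excerpt from this post, with wrapped lines joined (assumption:
# each entry was originally a single syslog line).
LOG = """\
Jan 29 22:45:40 gbr2-p40 scsi: [ID 107833 kern.warning] WARNING: /pci@1f,0/pci@1,1/scsi@2/sd@2,0 (sd2):
Jan 29 22:45:40 gbr2-p40 SCSI transport failed: reason 'reset': retrying command
Jan 29 22:45:48 gbr2-p40 md_stripe: [ID 641072 kern.warning] WARNING: md: d33: read error on /dev/dsk/c0t2d0s3
Jan 29 22:45:48 gbr2-p40 md_mirror: [ID 842313 kern.info] NOTICE: md: d33: B_FAILFAST I/O retry
Jan 29 22:46:00 gbr2-p40 md_stripe: [ID 641072 kern.warning] WARNING: md: d33: read error on /dev/dsk/c0t2d0s3
Jan 29 22:46:00 gbr2-p40 md_mirror: [ID 104909 kern.warning] WARNING: md: d33: /dev/dsk/c0t2d0s3 needs maintenance
Jan 29 22:46:00 gbr2-p40 md_stripe: [ID 241980 kern.notice] NOTICE: md: d33: hotspared device /dev/dsk/c0t2d0s3 with /dev/dsk/c0t4d0s7
"""

# Hypothetical pattern matching only the md_* lines in the layout shown above.
MD_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+"
    r"(?P<module>md_\w+):\s+\[ID \d+ kern\.\w+\]\s+"
    r"(?P<level>WARNING|NOTICE):\s+md:\s+(?P<meta>d\d+):\s+(?P<msg>.*)$"
)


def md_events(text):
    """Return one dict per md_* log line, in the order they appear."""
    events = []
    for line in text.splitlines():
        match = MD_LINE.match(line)
        if match:
            events.append(match.groupdict())
    return events


def read_errors_per_slice(events):
    """Count 'read error on <slice>' messages per slice path."""
    counts = Counter()
    for event in events:
        match = re.match(r"read error on (\S+)", event["msg"])
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    events = md_events(LOG)
    for event in events:
        print(event["stamp"], event["module"], event["meta"], event["msg"])
    print(dict(read_errors_per_slice(events)))
```

Read this way, the excerpt shows a SCSI transport reset, then two read errors on the same slice twelve seconds apart, after which the slice is flagged for maintenance and the hot spare takes over.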

> I usually (in the cases where the disk is responsive at all) do a
> format/analyze/read/repair and try replacing it just to see if it was a
> single bad block. Sometimes works.


I checked the grown defects list and it's still empty. The disk is a 73 GB
Maxtor Atlas 10K IV bought in May 2004.

defect> prim
Extracting primary defect list...Extraction complete.
Defect List has a total of 684 defects.

defect> g
Extracting grown defects list...Extraction complete.
Defect List has a total of 0 defects.
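
If you want to track these counts over time, a small Python sketch like the one below can read them out of a saved transcript. The names `TRANSCRIPT` and `defect_totals` are hypothetical, and the patterns assume the exact wording printed above. Keep in mind that an empty grown list only says no blocks were recorded as remapped; it does not by itself rule out a transport problem.

```python
import re

# The format(1M) defect-menu transcript from this post.
TRANSCRIPT = """\
defect> prim
Extracting primary defect list...Extraction complete.
Defect List has a total of 684 defects.

defect> g
Extracting grown defects list...Extraction complete.
Defect List has a total of 0 defects.
"""


def defect_totals(text):
    """Map each extracted list name ('primary', 'grown') to its defect count."""
    totals = {}
    current = None
    for line in text.splitlines():
        # Remember which list the next total belongs to.
        match = re.match(r"Extracting (\w+) defects? list", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"Defect List has a total of (\d+) defects?\.", line)
        if match and current is not None:
            totals[current] = int(match.group(1))
            current = None
    return totals


if __name__ == "__main__":
    totals = defect_totals(TRANSCRIPT)
    print(totals)
    if totals.get("grown", 0) == 0:
        print("No grown defects recorded.")
```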

[...]

> Seems complicated. How about just
>
> metareplace -e d3 c0t2d0s3


I tried it and it worked. Thanks a lot! That is much simpler and easier than
what I was going to do.

[...]

> That's a concatenation. Don't think you want that...


Definitely not ;-)

[...]

> If the metareplace succeeds. And if the metareplace fails, the hot spare
> should still be in place.


You're right. When the metareplace started to resync the failed slice, the
hot spare switched back to "available".

So what do you think now? Do you believe an LVM software failure is
possible, or is the drive definitely dying? Is it worth breaking the
mirror to reformat the disk and then recreate the mirror?

Thx again,

Georges

--
Georges Tomazi - gt@diapason.com
