This is a discussion on Are Suns fussy about fibre channel disks?? within the Sun Solaris Administration forums, part of the Solaris Operating System category.

In comp.sys.sun.hardware Daniel Rock <v200739@deadcafe.de> wrote:
> In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
>> I'd say a disk that doesn't always spin up correctly is not a good place
>> to store data.
>
> So don't let it spin down.
>
> The disk is in a workstation. If one fails the mirror is still Ok. If both
> fail, I can restore the data from the backup.

Do you drive around with a flat tire? Three out of four isn't too bad.

In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
> Do you drive around with a flat tire? Three out of four isn't too bad.

Bad analogy.

Do you change your LCD screen if it shows a bad pixel?

--
Daniel

In comp.sys.sun.hardware Daniel Rock <v200740@deadcafe.de> wrote:
> In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
>> Do you drive around with a flat tire? Three out of four isn't too bad.
>
> Bad analogy.
>
> Do you change your LCD screen if it shows a bad pixel?

If my display has two pixels, and I know one is broken to start with, yes,
I replace it.

In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
> If my display has two pixels, and I know one is broken to start with, yes,
> I replace it.

Do you replace your car if you jump-started it once?

--
Daniel

In comp.sys.sun.admin Douglas O'Neal <oneal@dbi.udel.edu> wrote:
> Since you know you will power cycle at some point, you have a 9% chance
> of losing your data at that point. Not a risk I'd take.

You are assuming that the drive will never again spin up after a power
cycle.

This assumption is flawed.

--
Daniel

In comp.sys.sun.hardware Daniel Rock <v200740@deadcafe.de> wrote:
> In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
>> In comp.sys.sun.hardware Daniel Rock <v200740@deadcafe.de> wrote:
>>> In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
>>>> I prefer preventative maintenance, not cleaning up larger messes later.
>>>
>>> A few "metattach" or "metareplace" are a larger mess?
>>
>> Some people have slightly different standards, and know that drives don't
>> fix themselves, and always just get worse and should be replaced at the
>> first signs of trouble, at your convenience, not when they finally do
>> catastrophically fail.
>
> Do you just replace a flat tire or the entire car?

The car/tires analogy is best summed up like this: the person who uses broken
drives puts leaky tires on their car and hopes they don't go flat, and if one
does, they're fine with three good tires, and they saved a few dollars
because they're witty.

> Let's calculate the probability of a total failure...
>
> Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
> is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
> to replace a broken drive.
>
> What is the probability that two drives fail within 48 hours?

More than you'd expect. I've seen plenty of double disk failures.

> The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))

I can simplify that equation into: it's stupid to put broken disks back into
a machine, no matter what nonsense math you try to justify it with.

>
> BTW this is the SMART output of one of the drives:
>
> Device: SEAGATE SX3146807FC Version: D010
> Device type: disk
> Transport protocol: Fibre channel (FCP-2)
> Device supports SMART and is Enabled
> Temperature Warning Disabled or Not Supported
> SMART Health Status: OK
>
> Elements in grown defect list: 8
> Vendor (Seagate) cache information
> Blocks sent to initiator = 323870666916662
> Vendor (Seagate/Hitachi) factory information
> number of hours powered up = 27478.45
> number of minutes until next internal SMART test = 10

You're blinding yourself. You know the drive doesn't always spin up. No
amount of SMART data cancels that out.

Just throw the drive out or RMA it.

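For anyone following the arithmetic quoted above, here is a minimal Python sketch that reproduces the two figures being traded in the thread. It assumes the two mirrored drives fail independently, and it takes the 30% AFR and the 48-hour replacement window straight from the quoted post. Reading the 9% figure as 0.3 * 0.3 is a guess about where that number comes from, not something the thread states. The function and variable names are illustrative only.

```python
# Sketch only: reproduces the back-of-envelope numbers quoted in the thread.
# Assumption: the two mirrored drives fail independently of each other.

AFR = 0.30                 # assumed annual failure rate per drive (10x the ~3% quoted)
REPLACE_WINDOW_DAYS = 2    # 48 hours to swap out a failed drive


def double_failure_per_year(afr: float, window_days: float) -> float:
    """The quoted estimate: afr * afr * (window / 365)."""
    return afr * afr * (window_days / 365)


def both_fail_in_same_year(afr: float) -> float:
    """Chance that both drives fail at some point in one year, ignoring the window."""
    return afr * afr


if __name__ == "__main__":
    quoted = double_failure_per_year(AFR, REPLACE_WINDOW_DAYS)
    print(f"quoted estimate: {quoted:.4%}")
    print(f"ignoring window: {both_fail_in_same_year(AFR):.0%}")
```

The first figure comes out close to the ~0.05% in the quoted post. Neither number models a drive that intermittently refuses to spin up, which is what the disagreement is really about.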
Daniel Rock wrote:
> In comp.sys.sun.admin Douglas O'Neal <oneal@dbi.udel.edu> wrote:
>> Since you know you will power cycle at some point, you have a 9% chance
>> of losing your data at that point. Not a risk I'd take.
>
> You are assuming that the drive will never again spin up after a power
> cycle.
>
> This assumption is flawed.

Agreed, my probability is too high. But the point is that the 0.05%
catastrophic probability you calculated is too low. And if we take a number
somewhere in the middle, say 0.5% chance of catastrophic failure per power
cycle, that would be way too high for me to trust with critical data.

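A per-power-cycle figure like the 0.5% above compounds over the life of a machine. The rough Python sketch below assumes every power cycle is an independent event with the same chance of catastrophic failure. The 0.5% is the figure suggested in the post; the independence assumption and the cycle counts are made up for illustration.

```python
# Sketch only: cumulative risk from a fixed per-power-cycle failure chance.
# Assumption: every power cycle is an independent trial with the same odds.


def cumulative_risk(per_cycle: float, cycles: int) -> float:
    """Probability of at least one catastrophic failure in `cycles` power cycles."""
    return 1.0 - (1.0 - per_cycle) ** cycles


if __name__ == "__main__":
    for cycles in (1, 10, 50):  # hypothetical cycle counts
        print(f"{cycles:3d} cycles: {cumulative_risk(0.005, cycles):.2%}")
```

Under these assumptions, ten power cycles add up to a little under 5% in total.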
In comp.sys.sun.admin Huge <Huge@nowhere.much.invalid> wrote:
> On 2007-10-03, Dave <someplace@nowhere-nice.com> wrote:
>
>> If it's an important server in your company, it might be
>> wise to replace them every 5 years.
>
> You jest. We have over 3000 Unix servers. Wild guesstimate, 12,000 disks.
> Replace 2400 disks a year?

Nonsense. Just let the machines age and you will be replacing that many at
some point.

Daniel Rock wrote:
> In comp.sys.sun.admin Cydrome Leader <presence@mungepanix.com> wrote:
>> just throw the drive out or RMA it.
>
> Why should I pay for it?
>

Rock on Daniel, I think you know more about disks than the others know about
cars :-)

/Jorgen