disk bad block checking post-install / ext2
(Linux Operating System forum, Unix Operating Systems category)

Hmm,

does a utility exist to check a (SCSI) disk for bad blocks after installation?

It seems that badblocks is designed to run at filesystem creation time only (and really designed to be called by the relevant mkfs utility).

One of the 18GB SCSI drives in my home fileserver (ext2 filesystem) is starting to throw up the occasional disk error on the OS drive, and although I have a spare I can use I'd rather check other options first...

So,

a) does anything exist to run a disk check whilst maintaining filesystem integrity,

b) am I sort-of worrying about nothing because the SCSI firmware in the drive itself is remapping bad blocks as soon as they're detected anyway?

Of course seeing any disk errors seems cause for concern so I should probably replace the drive soon - just a question of whether to do it ASAP or when I actually have a free moment!

(Yes I do have regular backups before anyone asks)

cheers

Jules

Jules wrote:
> Hmm,
>
> does a utility exist to check a (SCSI) disk for bad blocks after
> installation?
>
> It seems that badblocks is designed to run at filesystem creation time
> only (and really designed to be called by the relevant mkfs utility).

I do not believe this is the case. You _must_ read the man page, though, as badblocks can otherwise overwrite the drive.

> One of the 18GB SCSI drives in my home fileserver (ext2 filesystem) is
> starting to throw up the occasional disk error on the OS drive, and
> although I have a spare I can use I'd rather check other options
> first...
>
> So,
>
> a) does anything exist to run a disk check whilst maintaining
> filesystem integrity,

Read the badblocks manual page.

> b) am I sort-of worrying about nothing because the SCSI firmware in the
> drive itself is remapping bad blocks as soon as they're detected
> anyway?

Yes, but if the drive is starting to fail, you will soon use up all the spare blocks.

> Of course seeing any disk errors seems cause for concern so I should
> probably replace the drive soon - just a question of whether to do it
> ASAP or when I actually have a free moment!
>
> (Yes I do have regular backups before anyone asks)

One problem I have with badblocks is that it detects bad blocks on my brand new hard drives, so it is not much use to me anymore. These are Maxtor 6Y080P0 EIDE drives and Maxtor KU018L2 SCSI drives. I cannot believe Maxtor would ship out 4 of each in a row with bad blocks.

Since these drives have been working perfectly well on several machines for a year (and some of them almost two years), one possibility is that badblocks turns off the bad block mapping in the drive itself for testing purposes. I am not sure this is even possible, though. I do not really know what is going on there.

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 09:40:00 up 10 days, 22:59, 3 users, load average: 4.16, 4.06, 4.03

In message <pan.2005.01.11.13.55.47.38058@remove.this.yahoo.co.uk>, Jules <julesrichardsonuk@remove.this.yahoo.co.uk> writes
>Hmm,
>
>does a utility exist to check a (SCSI) disk for bad blocks after
>installation?
>
>It seems that badblocks is designed to run at filesystem creation time
>only (and really designed to be called by the relevant mkfs utility).
>
>One of the 18GB SCSI drives in my home fileserver (ext2 filesystem) is
>starting to throw up the occasional disk error on the OS drive, and
>although I have a spare I can use I'd rather check other options first...
>
>So,
>
>a) does anything exist to run a disk check whilst maintaining filesystem
>integrity,

All good things come to an end. This includes disk drives. If it's started to error, it's certainly time to take a concerned interest!

>b) am I sort-of worrying about nothing because the SCSI firmware in the
>drive itself is remapping bad blocks as soon as they're detected anyway?

There are only so many blocks it can remap. Be afraid, be very afraid.

>Of course seeing any disk errors seems cause for concern so I should
>probably replace the drive soon - just a question of whether to do it ASAP
>or when I actually have a free moment!

It's a probability thing. Are you feeling lucky?

>(Yes I do have regular backups before anyone asks)

As you have backups, perhaps you do have a free moment, but make sure they don't get progressively corrupted as the disk goes bad.

What does downtime cost you? If the system is in any kind of professional use I'd replace the drive as soon as I drove past somewhere that sold decent drives!

Cheers, J/.
--
John Beardmore

Jean-David Beyer wrote:
> One problem I have with badblocks is that it detects bad blocks on my
> brand new hard drives, so it is not much use to me anymore. These are
> Maxtor 6Y080P0 EIDE drives and Maxtor KU018L2 SCSI drives. I cannot
> believe Maxtor would ship out 4 of each in a row with bad blocks.

..... hard drives are NOTORIOUSLY riddled with badblocks.
cripes, just check the table on a new drive! start here:
http://grc.com/sroverview.htm and rummage through the
site to increase your hard drive knowledge

[...]
Thus, from the perspective of the manufacturer, putting more
reliability into their drives is wasted money, since no one will
buy their drives for that reason. If one drive costs 20% more
than another, say $239 instead of $199, and the drives are the
same size and seem identical, wouldn't everyone save the $40 and
happily take home a new drive for $199? Of course.

That's why, when you're in the business of making hard drives
the first thing you learn is that ...
Reliability Isn't Profitable!
[...]

--
<< http://michaeljtobler.homelinux.com/ >>
I used to be an agnostic, but now I'm not so sure.

Jules wrote:
> Hmm,
>
> does a utility exist to check a (SCSI) disk for bad blocks after
> installation?
>
> It seems that badblocks is designed to run at filesystem creation time
> only (and really designed to be called by the relevant mkfs utility).
>
> One of the 18GB SCSI drives in my home fileserver (ext2 filesystem) is
> starting to throw up the occasional disk error on the OS drive, and
> although I have a spare I can use I'd rather check other options first...
>
> So,
>
> a) does anything exist to run a disk check whilst maintaining filesystem
> integrity,

As others noted,

$ man badblocks

-- _very_ carefully. Also, check if you have smartmontools on your distro.

$ man smartd

Check if you can get a disk diagnostic/repair utility from the maker:
http://www.duxcw.com/faq/hd/diag.htm

> b) am I sort-of worrying about nothing because the SCSI firmware in the
> drive itself is remapping bad blocks as soon as they're detected anyway?

As noted, you only have so much spare disk space for remapping. And the re-mapping is only available on first write -- if the hd can't _read_ the block with data it will _not_ re-map.

> Of course seeing any disk errors seems cause for concern so I should
> probably replace the drive soon - just a question of whether to do it ASAP
> or when I actually have a free moment!
>
> (Yes I do have regular backups before anyone asks)

(Keep them very handy!)

Badblocks is best run at fs creation time, but is handy afterwards also. But to use it you must _know_ what you're doing _and_ generate a table that the fs can read/use to avoid the blocks marked bad.

The hd badblock re-mapping _should_ be transparent, but if the hd can't _read_ a block, it won't re-map it unless you force it (and likely lose data).

For smartmontools visit:
http://smartmontools.sourceforge.net/

The more you use the flaky disks, the more likely you will lose data.

good luck,
prg
email above disabled
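
The advice above (read the badblocks man page carefully, and give the filesystem a table of bad blocks it can use) might be scripted along these lines. This is only a sketch under stated assumptions: it assumes the e2fsprogs tools `badblocks` and `e2fsck` are installed, that the filesystem is unmounted before anything is actually run, and that `/dev/sdX1` stands in for the real partition. The helper name `scan_command` is hypothetical; it only builds the command line and never executes it. The flags are the ones documented in the man pages: `badblocks` reads only by default, `-n` is the non-destructive read-write test, `-w` is the destructive write test, and `e2fsck -c` runs a read-only badblocks scan and records the result in the filesystem's bad block inode (given twice, it uses the non-destructive read-write test instead).

```python
import shlex

# badblocks test modes and the flag each one needs (per the badblocks man page).
BADBLOCKS_MODES = {
    "read-only": [],            # default: only reads every block
    "non-destructive": ["-n"],  # read-write test that puts the data back
    "destructive": ["-w"],      # write test: ERASES everything on the device
}


def scan_command(device, mode="read-only", record_in_fs=False):
    """Build (but do not run) a bad-block scan command for `device`.

    With record_in_fs=True the scan goes through e2fsck, so that any bad
    blocks found are added to the ext2 bad block inode.
    """
    if mode not in BADBLOCKS_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if record_in_fs:
        if mode == "destructive":
            # e2fsck has no destructive scan; that only makes sense via mke2fs.
            raise ValueError("a destructive scan cannot keep the filesystem")
        flags = ["-c"] if mode == "read-only" else ["-c", "-c"]
        return ["e2fsck", "-f", *flags, device]
    # -s shows progress, -v is verbose.
    return ["badblocks", "-s", "-v", *BADBLOCKS_MODES[mode], device]


if __name__ == "__main__":
    # /dev/sdX1 is a placeholder, not a real device name.
    print(shlex.join(scan_command("/dev/sdX1", record_in_fs=True)))
    print(shlex.join(scan_command("/dev/sdX1", mode="non-destructive")))
```

Building the command as a list and printing it first makes it easy to double-check which mode is about to run before pasting it into a root shell.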

mjt wrote:
> Jean-David Beyer wrote:
>
>
>>One problem I have with badblocks is that it detects bad blocks on my
>>brand new hard drives, so it is not much use to me anymore. These are
>>Maxtor 6Y080P0 EIDE drives and Maxtor KU018L2 SCSI drives. I cannot
>>believe Maxtor would ship out 4 of each in a row with bad blocks.
>
>
>
> .... hard drive[s] are NOTORIOUSLY riddled with badblocks.
> cripes, just check the table on a new drive! start here:
> http://grc.com/sroverview.htm and rummage through the
> site to increase your hard drive knowledge

I do not doubt that. What I am wondering about is the badblocks program. If it reads and writes blocks (four different ways), does it do one block (or a bunch of blocks) at a time? And if so, does it get the blocks in the order presented by the disk controller, which should be _after_ the mapping of bad block IDs to good block IDs has already been done? I.e., should not a hard drive test good with badblocks until there are no spare blocks left on the drive for the drive to use?

My hard drives were all new last March (2004), and they failed badblocks at the time. They do not seem to be failing me even though I run this machine 24/7 and reboot only when I get a new kernel, or make an addition to a partition table, or, most recently, when disaster struck.

> [...]
> Thus, from the perspective of the manufacturer, putting more
> reliability into their drives is wasted money, since no one will
> buy their drives for that reason.

I suppose that is true for retail customers. But people making servers generally would, I imagine, be willing to pay more for a better MTBF and MTTR, don't you think? $50 more for a drive guaranteed to work for 5 years instead of one? People put in SCSI drives when Ultra/133 EIDE or SATA drives would do as well just to get the greater reliability that SCSI drives are thought to provide (I am not saying that they do provide it, just commenting on the perception).

> If one drive costs 20% more
> than another, say $239 instead of $199, and the drives are the
> same size and seem identical, wouldn't everyone save the $40 and
> happily take home a new drive for $199? Of course.

Not "of course." Just the majority. When I talked with Maxtor about this, they said to order certain models of their drives instead of others because they had longer warranties, and the longer warranties were given because they perceived the drives were more reliable and the longer warranty would not cost them much.

> That's why, when you're in the business of making hard drives
> the first thing you learn is that ...
> Reliability Isn't Profitable!
> [...]

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 15:00:00 up 11 days, 4:19, 3 users, load average: 4.15, 4.09, 4.07
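
The question of whether a drive should "test good" until its spares run out, together with prg's earlier remark that remapping happens on write and not on a failed read, can be illustrated with a toy model. This is emphatically not how any particular firmware works: `ToyDrive`, its methods, and the block and spare counts are all invented for illustration, and the model ignores the case where a drive recovers a marginal block with ECC and relocates it by itself.

```python
class ToyDrive:
    """Toy model of bad-block remapping; NOT real drive firmware."""

    def __init__(self, n_blocks, n_spares):
        self.n_blocks = n_blocks
        self.spares_left = n_spares
        # Logical blocks whose current physical sector is unreadable.
        self.defective = set()

    def damage(self, lba):
        """Simulate the medium going bad under logical block `lba`."""
        self.defective.add(lba)

    def read(self, lba):
        """A failed read is reported to the host and remaps nothing,
        because the drive has no good copy of the data to relocate."""
        return lba not in self.defective

    def write(self, lba):
        """A write to a defective block is redirected to a spare sector,
        as long as a spare is still available."""
        if lba in self.defective:
            if self.spares_left == 0:
                return False
            self.spares_left -= 1
            self.defective.discard(lba)
        return True


def read_only_scan(drive):
    """What a read-only scan of the whole drive would report as bad."""
    return [lba for lba in range(drive.n_blocks) if not drive.read(lba)]


if __name__ == "__main__":
    drive = ToyDrive(n_blocks=8, n_spares=2)
    for lba in (1, 3, 5):
        drive.damage(lba)
    print("before any writes:", read_only_scan(drive))
    for lba in (1, 3, 5):
        drive.write(lba)  # the third write fails: no spares left
    print("after rewriting:  ", read_only_scan(drive))
```

In this model a read-only scan does see defective blocks that have not yet been rewritten, and a block only stays bad after a write once the spare pool is empty. Whether that explains bad blocks reported on brand new drives is a separate question the model cannot answer.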

Jean-David Beyer wrote:
> I suppose that is true for retail customers. But people making servers
> generally would, I imagine, be willing to pay more for a better MTBF and
> MTTR, don't you think. $50 more for a drive guaranteed to work for 5 years
> instead of one?

.... i dont think so. and the diff wouldnt be $50, it would
be a much higher number. and i would say it's difficult to
truly guarantee a MTBF, since you never know how a drive is
being maintained. a drive will fail sooner if maintenance
is not performed regularly.

--
<< http://michaeljtobler.homelinux.com/ >>
If you sit down at a poker game and don't see a sucker, get up. You're the sucker.

mjt wrote:
> Jean-David Beyer wrote:
>
>
>>I suppose that is true for retail customers. But people making servers
>>generally would, I imagine, be willing to pay more for a better MTBF and
>>MTTR, don't you think. $50 more for a drive guaranteed to work for 5 years
>>instead of one?
>
>
>
> ... i dont think so. and the diff wouldnt be $50, it would
> be a much higher number. and i would say it's difficult to
> truly guarantee a MTBF, since you never know how a drive is
> being maintained. a drive will fail sooner if maintenance
> is not performed regularly.

I change the oil and filter every 3 months, and the spark plugs twice a year. Does not everyone maintain their hard drives this way?

But seriously, what maintenance are you supposed to do to a hard drive? Manufacturers are pretty emphatic about not opening the drives up. There are no oil cups to fill, no air filters to change, etc. My SCSI hard drives have two fans blowing cool air into the compartment where they are and two exhaust fans sucking the warm air out. Here is one of them: not too hot.

    # /usr/sbin/smartctl -a /dev/sda
    smartctl version 5.1-11 Copyright (C) 2002-3 Bruce Allen
    Home page is http://smartmontools.sourceforge.net/

    Device: MAXTOR ATLASU320_18_WLS Version: B120
    Serial number: 342218152178
    Device type: disk
    Local Time is: Tue Jan 11 15:41:34 2005 EST
    Device supports SMART and is Enabled
    Temperature Warning Enabled
    SMART Sense: Ok!

    Current Drive Temperature:     37 C   <---<<<

    Manufactured in week 25 of year 2002
    Current start stop count:      46 times
    Recommended start stop count:  4294967295 times

    Error counter log:
              Errors Corrected    Total      Total   Correction     Gigabytes    Total
                  delay:       [rereads/    errors   algorithm      processed    uncorrec
                minor | major  rewrites]  corrected  invocations   [10^9 bytes]  errors
    read:       2149        0         0         0          0          8.253           0
    write:         0        0         0         0          0          4.081           0

    Non-medium error count:       26
    No self-tests have been logged
    Long (extended) Self Test duration: 672 seconds [11.2 minutes]

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 15:35:00 up 11 days, 4:54, 3 users, load average: 4.26, 4.02, 3.99
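
For anyone wanting to watch these counters over time, the error counter log in the smartctl output above could be pulled out with a few lines of Python. This is a rough sketch that assumes the seven-column SCSI layout shown in this particular smartctl 5.1 output; other versions label and arrange the columns differently. The column names and both function names are made up here from a reading of the three header lines and are not anything smartctl itself defines.

```python
# Column names are this sketch's own reading of the smartctl header lines.
COLUMNS = (
    "corrected_minor_delay",
    "corrected_major_delay",
    "rereads_rewrites",
    "total_corrected",
    "algorithm_invocations",
    "gigabytes_processed",
    "total_uncorrected",
)


def parse_error_counter_log(text):
    """Return {'read': {...}, 'write': {...}} from `smartctl -a` text."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.strip().partition(":")
        if not sep or label not in ("read", "write", "verify"):
            continue
        fields = rest.split()
        if len(fields) != len(COLUMNS):
            continue  # layout differs from the one assumed here
        rows[label] = dict(zip(COLUMNS, (float(f) for f in fields)))
    return rows


def has_uncorrected_errors(rows):
    """True if any operation reports errors the drive could not correct."""
    return any(row["total_uncorrected"] > 0 for row in rows.values())


# The two data rows from the smartctl output posted above.
SAMPLE = """\
Error counter log:
read:       2149        0         0         0          0          8.253           0
write:         0        0         0         0          0          4.081           0
Non-medium error count:       26
"""

if __name__ == "__main__":
    rows = parse_error_counter_log(SAMPLE)
    for op, row in rows.items():
        print(op, row)
    print("uncorrected errors:", has_uncorrected_errors(rows))
```

A growing uncorrected count between runs would be a stronger warning sign than any single snapshot, so saving each result with a date is probably more useful than looking at one reading.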

Jean-David Beyer wrote:
> But seriously, what maintenance are you supposed to do to a hard drive?

.... it's probably not AS critical as it used to be. a lot depends on the environment, ie, protecting the item containing the drive.

most maintenance is reactive in nature. IOW, when something is REPORTED as bad, then we tend to do maintenance. personally, i've been using the spinrite product for my maintenance chores in a proactive stance, which means i do maintenance BEFORE i'm informed of something amiss

for you, maybe it means running this on occasion:
http://www.maxtor.com/portal/site/Ma...&downloadID=27

--
<< http://michaeljtobler.homelinux.com/ >>
Mathematicians take it to the limit.

On Tue, 11 Jan 2005 15:07:35 -0500, Jean-David Beyer wrote:
> My hard drives were all new last March (2004), and they failed badblocks
> at the time. They do not seem to be failing me even though I run this
> machine 24/7 and reboot only when I get a new kernel, or make an addition
> to a partition table, or, most recently, when disaster struck.

In my case, when I created the filesystem on this drive I did an exhaustive (destructive) check via mkfs - and had no errors thrown up. (I *always* do such a test on any drive I get, whether new or not).

The drive which is reporting problems now has run continuously for maybe 6 months or so - however it's only thrown up problems in the last few days. I can't be *certain* of its history; it was in a corporate server from new so I expect it's just been run 24x7 throughout its life with no nasty start-stops.

Funnily enough we had a storm knock out power just before I started seeing problems; seems unlikely that some kind of surge would kill the drive and nothing else, but maybe it's possible...

I'm going to check cabling just in case too - in the last day I've seen some "parity error detected in Data-in phase" errors in dmesg output. I'm 99.9% convinced it's the drive of course, but I have seen cable faults before (on a different machine) which have been horribly temperature-related and took some tracking down.

>> [...]
>> Thus, from the perspective of the manufacturer, putting more
>> reliability into their drives is wasted money, since no one will buy
>> their drives for that reason.
>
> I suppose that is true for retail customers. But people making servers
> generally would, I imagine, be willing to pay more for a better MTBF and
> MTTR, don't you think. $50 more for a drive guaranteed to work for 5
> years instead of one? People put in SCSI drives when Ultra/133 EIDE or
> SATA drives would do as well just to get the greater reliability that
> SCSI drives are thought to provide (I am not saying that they do provide
> it, just commenting on the perception).

Personal experience: Historically I've nearly always paid more and put SCSI disks in machines I've built for that reason, and it *seems* to have always paid off - I've had IDE disks fail on me before but this is the first SCSI disk I've had trouble with out of maybe 50 drives (which have eventually been retired due to size constraints rather than faults).

Plus I like the greater flexibility of SCSI (the DAT drive, CD writer etc. for my desktop live in Sun external boxes on top of the desk, whilst the rest of the machine's nicely tucked away below).

I seem to be in the minority though; I prefer to do a job right the first time even if it's more expensive up-front, rather than do it on the cheap (and end up paying more in the long run that way!)

cheers

Jules