Re: FATAL: could not open relation xxx: No such file or directory
(pgsql Admins forum, PostgreSQL category)
On Tue, Apr 22, 2008 at 12:02 PM, Michael Monnerie <michael.monnerie@it-management.at> wrote:

> What I had twice (on different customers, once SCSI once SATA) is that a
> broken hard disk reports no errors, but delivers different data than
> what was written before. Very nasty, as the RAID controller doesn't see
> any problem, and destroys even the good hard disks' data after the next
> write, because the read data is already broken.

How did you recognize such a hard disk?

Regards

Mikko
On Thursday, 17 April 2008, Mikko Partio wrote:

> I ran fsck on the filesystem (gfs) -- no problems found. The disks
> are from a SAN and the diagnostic programs say there's nothing wrong.
> I also have other db clusters running on different filesystems (also
> gfs) and I have never had any problems with them.

A bit OT, but maybe related: I am seeing similar strangeness on a Linux box with an Areca controller. On this box, the reiserfs filesystem becomes seriously damaged after some time. Memtest showed no problems, and everything looks fine. Today we will replace the mainboard; it could have an internal fault (perhaps the transfer from memory to the controller is broken?).

What I had twice (on different customers, once SCSI once SATA) is that a broken hard disk reports no errors, but delivers different data than what was written before. Very nasty, as the RAID controller doesn't see any problem, and destroys even the good hard disks' data after the next write, because the read data is already broken.

HTH, good luck.

Best regards, zmi
-- 
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0676/846 914 666 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: www.keyserver.net Key-ID: 1C1209B4
On Tuesday, 22 April 2008, Mikko Partio wrote:

> How did you recognize such a hard disk?

With "badblocks", which writes some patterns and re-reads them. But it is of course annoyingly slow. On these servers I was lucky: both had "only" 73GB disks in a RAID-1, so there were only 2 small drives to check. With a RAID of 8x750GB disks, it will take a *long* time to check, if you cannot simply replace all disks at once. At today's customer, I would have to take one drive, check it, replace it, let the RAID rebuild the CRC, take the next... A new mainboard is less work, so I will try that first.

Best regards, zmi
-- 
// Michael Monnerie, Ing.BSc ----- http://it-management.at
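
For illustration, the "write some patterns and re-read them" idea can be sketched in Python. This is only a minimal sketch of the principle, not how badblocks itself is implemented. The function names, the block size, and the choice to run against an ordinary file are assumptions made for the example. The byte patterns are the ones the badblocks man page lists for its write-mode test (-w). Like that mode, the sketch is destructive: it overwrites whatever is stored in the target.

```python
import os
import tempfile

# Byte patterns the badblocks man page lists for write mode (-w).
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)
BLOCK_SIZE = 4096  # assumption: illustrative block size


def write_pattern(f, pattern, num_blocks, block_size):
    """Fill the first num_blocks blocks of f with one repeated byte."""
    block = bytes([pattern]) * block_size
    f.seek(0)
    for _ in range(num_blocks):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())  # ask the OS to push the data to the device


def find_mismatches(f, pattern, num_blocks, block_size):
    """Re-read every block and return the indices that differ."""
    expected = bytes([pattern]) * block_size
    f.seek(0)
    return [i for i in range(num_blocks) if f.read(block_size) != expected]


def pattern_test(path, num_blocks, block_size=BLOCK_SIZE, patterns=PATTERNS):
    """DESTRUCTIVE: overwrite path with each pattern, then verify it.

    Returns a sorted list of block indices that read back wrong
    for at least one pattern.
    """
    bad = set()
    with open(path, "r+b") as f:
        for pattern in patterns:
            write_pattern(f, pattern, num_blocks, block_size)
            bad.update(find_mismatches(f, pattern, num_blocks, block_size))
    return sorted(bad)


if __name__ == "__main__":
    # Run against a scratch file, never against a disk that holds data.
    fd, scratch = tempfile.mkstemp()
    os.close(fd)
    try:
        print("mismatching blocks:", pattern_test(scratch, num_blocks=16))
    finally:
        os.remove(scratch)
```

One caveat with this sketch: the re-read may be answered from the operating system's page cache instead of from the disk, so it shows the comparison logic but would not reliably catch a disk that silently returns wrong data. A real check needs reads that actually reach the device.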