This is a discussion on pg_dump or hardware? within the pgsql Bugs forums, part of the PostgreSQL category; --> Hi all, short summary: platform: i386 SMP (dual PIII) os: linux 2.6.8.1 vendor: debian (3.1, stable) pgsql ver: 7.4.7 ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi all, short summary: platform: i386 SMP (dual PIII) os: linux 2.6.8.1 vendor: debian (3.1, stable) pgsql ver: 7.4.7 (deb) disk: tech. SCSI vendor. IBM model. DDYS-T36950N rev. S96H controller: adaptec aic-7892a description: we're experiencing a weird problem trying to get a dump of our db for backup purposes, the executed command is: /usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2 the output: pg_dump: ERROR: could not open relation with OID 201327173 pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed. pg_dump: Error message from server: ERROR: could not open relation with OID 201327173 pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov, prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato, cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout; so it seems that we've got some problems with the "nominativi" table (a 20 million-row table), in fact the following command also fails: pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > nominativi.bz2 with the same err msg as before. Before the erros occurs we're are able to get a partial backup, see: #> ls -l nominativi.bz2 -rw-r--r-- 1 sickpig users 2.5M apr 19 12:35 nominativi.bz2 #> wc -l nominativi 145904 nominativi We're trying to understand whether this is due to data corruption or hardware failure. We run long self-tests on our SCSI disk through smartmontools on a regular basis. see attached file for "smartctl -a /dev/sda" output. All suggestions are welcome. Regards, Andrea smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen Home page is http://smartmontools.sourceforge.net/ Device: IBM DDYS-T36950N Version: S96H Serial number: 5FFL3272 Device type: disk Transport protocol: Fibre channel (FCP-2) Local Time is: Wed Apr 19 13:14:01 2006 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 41 C Drive Trip Temperature: 85 C Manufactured in week 06 of year 2001 Current start stop count: 147 times Recommended maximum start stop count: 10000 times Error counter log: Errors Corrected Total Total Correction Gigabytes Total delay: [rereads/ errors algorithm processed uncorrected minor | major rewrites] corrected invocations [10^9 bytes] errors read: 0 0 0 5 5 6628.657 0 write: 0 0 0 0 0 4231.306 0 Non-medium error count: 0 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background long Completed - 22800 - [- - -] # 2 Background long Completed - 22631 - [- - -] # 3 Background long Completed - 22463 - [- - -] # 4 Background long Completed - 22294 - [- - -] # 5 Background long Completed - 22126 - [- - -] # 6 Background long Completed - 21958 - [- - -] # 7 Background long Completed - 21789 - [- - -] # 8 Background long Completed - 21621 - [- - -] # 9 Background long Completed - 21452 - [- - -] #10 Background long Completed - 21284 - [- - -] #11 Background long Completed - 21115 - [- - -] #12 Background long Completed - 20947 - [- - -] #13 Background long Completed - 20801 - [- - -] #14 Background long Completed - 20633 - [- - -] #15 Background long Completed - 20464 - [- - -] #16 Background long Completed - 20296 - [- - -] #17 Background long Completed - 20127 - [- - -] #18 Background long Completed - 19959 - [- - -] #19 Background long Completed - 19790 - [- - -] #20 Background long Completed - 19622 - [- - -] Long (extended) Self Test duration: 1340 seconds [22.3 minutes] ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| andrea suisani <andrea.suisani@opinioni.net> writes: > pg_dump: ERROR: could not open relation with OID 201327173 > pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed. > pg_dump: Error message from server: ERROR: could not open relation with OID 201327173 Hmm ... what do you get from select oid from pg_class where relname = 'nominativi'; select relname from pg_class where oid = 201327173; If the first returns 201327173 while the second doesn't return anything, then I'd wonder about corruption of pg_class's OID index. REINDEXing it might help. I don't remember whether 7.4 makes you use a standalone backend to reindex system catalogs --- see its REINDEX man page for details. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| Thanks for the quick reply Tom Lane wrote: > andrea suisani <andrea.suisani@opinioni.net> writes: [cut] > Hmm ... what do you get from > > select oid from pg_class where relname = 'nominativi'; oid -------- 561644 (1 row) afaics it seems weird.... does this mean that another postgresql object screw up? this is the "\d nominativi" output: Table "public.nominativi" Column | Type | Modifiers ------------+------------------+----------- nome_cogno | character(70) | indirizzo | character(40) | cap | character(5) | citta | character(35) | prov | character(2) | prefisso | character(4) | telefono1 | character(13) | telefono2 | character(13) | note | character(100) | idpersona | integer | estrazione | date | num_estra | integer | occupato | boolean | cod_prov | integer | cod_com | integer | cod_reg | integer | capoluo | integer | rand | double precision | Indexes: "citta1_idx" btree (prov, citta) WHERE (((prov = 'BL'::bpchar) OR (prov = 'PD'::bpchar) OR (prov = 'RO'::bpchar) OR (prov = 'VE'::bpchar) OR (prov = 'TV'::bpchar) OR (prov = 'VR'::bpchar)) AND (cod_com IS NULL)) "cod_com_ndx1" btree (cod_com) "codprov_capo_rand1" btree (cod_prov, capoluo, rand) "pre_tel_index1" btree (prefisso, telefono1) and none of nominativi's indexes has an oid like 201327173 select oid from pg_class where relname in ('citta1_idx','cod_com_ndx1','codprov_capo_rand1', 'pre_tel_index1'); oid ---------- 46788374 40916657 40916656 40916658 (4 rows) > select relname from pg_class where oid = 201327173; relname --------- (0 rows) while if I "ask" for oid 561644 this is what I get: select relname from pg_class where oid = 561644; relname ------------ nominativi (1 row) > If the first returns 201327173 while the second doesn't return anything, > then I'd wonder about corruption of pg_class's OID index. REINDEXing it > might help. > I don't remember whether 7.4 makes you use a standalone > backend to reindex system catalogs --- see its REINDEX man page for > details. from what I can see from REINDEX man page, I can reindex shared system catalogs in stand-alone mode [cut] Regards, Andrea ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |