Andrej Ricnik-Bay, 04-11-2008, 06:26 AM
Re: Differences in UTF8 between 8.0 and 8.1

> does strip out the invalid characters. However, iconv reads the
> entire file into memory before it writes out any data. This is not so
> good for multi-gigabyte dump files and doesn't allow for it to be used
> in a pipe between pg_dump and psql.
>
> Anyone have any other recommendations? GNU recode might do it, but
> I'm a bit stymied by the syntax. A quick perl script using
> Text::Iconv didn't work either. I'm off to look at some other perl
> modules and will try to create a script so I can strip out the invalid
> characters.

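How about an ugly kludge: split the dump into pieces small enough for
iconv, clean each piece, and glue the cleaned output back together?

# Split on newlines, not bytes: -b can cut a multibyte character in half,
# and iconv -c would then drop both halves. -a 3 at 1 MB per piece also
# runs out of suffixes after about 1 GB.
split -a 6 -d -l 100000 ../path/to/dumpfile dumpfile.
# iconv writes to stdout, so collect the cleaned pieces as they come out.
for i in dumpfile.*; do iconv -c -f UTF-8 -t UTF-8 "$i"; done > new_dump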
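
If the temporary files are a nuisance, the same split-and-clean trick might also work inside a pipe. What follows is only a sketch, not something tested against a real dump: it assumes a GNU split that supports --filter, and strip_invalid_utf8 is a made-up name. The "|| true" is there because glibc's iconv exits non-zero when -c actually discards something, and split would otherwise stop at the first dirty piece.

```shell
# Hypothetical helper: read stdin, drop invalid UTF-8, write to stdout.
# $1 is the number of lines handed to each iconv run (default 100000).
strip_invalid_utf8() {
    # split cuts stdin into pieces on line boundaries and pipes each piece
    # through the filter command; the filter's stdout is split's stdout,
    # so the cleaned pieces come out in their original order.
    split -l "${1:-100000}" \
          --filter='iconv -c -f UTF-8 -t UTF-8 || true' -
}
```

Something like "pg_dump olddb | strip_invalid_utf8 | psql newdb" (olddb and newdb are placeholders) would then skip the intermediate files. iconv still buffers each piece, so memory use depends on how large 100000 lines of the dump turn out to be; lower the count if individual lines are huge.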


Cheers,
Andrej
