Unix Technical Forum

8.3 can't convert cyrillic text from 'iso-8859-5' to other cyrillic 8-bit encoding



04-10-2008, 12:15 PM
Sergey Burladyan
 

Hi, all!

I can't convert text with convert(bytea, name, name)::bytea from 'iso-8859-5'
to 'windows-1251' or any other Cyrillic 8-bit encoding.

seb=> show client_encoding ;
client_encoding
-----------------
UTF8

seb=> show server_encoding;
server_encoding
-----------------
UTF8

seb=> select version();
version
----------------------------------------------------------------------------------------
PostgreSQL 8.3.0 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.3 (Debian 4.2.3-1)

lc_collate | ru_RU.UTF-8
lc_ctype | ru_RU.UTF-8
lc_messages | ru_RU.UTF-8
lc_monetary | ru_RU.UTF-8
lc_numeric | ru_RU.UTF-8
lc_time | ru_RU.UTF-8

seb=> select
convert(convert('абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ', 'utf-8', 'iso-8859-5'), 'iso-8859-5', 'windows-1251');
ERROR: character 0xf1 of encoding "ISO_8859_5" has no equivalent in "MULE_INTERNAL"

The first (inner) convert turns the string from my console locale encoding
(ru_RU.UTF-8) into iso-8859-5 (a Cyrillic 8-bit character encoding); the second
(outer) convert is there to show the problem.

windows-1251 is another Cyrillic 8-bit character encoding; converting to koi8-r
does not work either.
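
The byte named in the error can also be looked at outside the database. This is only a minimal sketch, assuming Python 3 and its bundled codecs (iso8859_5, cp1251, koi8_r); none of this is in the original report, and the helper name reencode is hypothetical. It decodes 0xf1 as ISO-8859-5 and re-encodes the character in the two target encodings.

```python
# Assumption: Python's bundled codec tables stand in for the server's
# conversion tables; this says nothing about PostgreSQL's internals.

def reencode(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes between two 8-bit encodings by pivoting through Unicode."""
    return data.decode(src).encode(dst)

problem_byte = bytes([0xF1])              # the byte from the ERROR line
char = problem_byte.decode("iso8859_5")   # which character is it?
print(repr(char), hex(ord(char)))

# Re-encode the same character in the encodings the report tried.
for target in ("cp1251", "koi8_r"):
    print(target, reencode(problem_byte, "iso8859_5", target).hex())
```

Here 0xf1 decodes to ё (U+0451), which both windows-1251 and koi8-r can represent, so the input data itself does not look like the problem.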

I wrote the output of convert(..., 'utf-8', 'iso-8859-5') to a file and read it
with iconv -f iso-8859-5: all characters were read correctly (see the programs in the attachment).
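
The same file check can be sketched without iconv. This assumes Python 3; the temporary file and the way the alphabet string is built are illustrative only and are not the attached programs.

```python
import os
import tempfile

# Russian alphabet, lower case then upper case, with ё/Ё placed after е/Е.
lower = (
    "".join(chr(c) for c in range(0x430, 0x436))    # а..е
    + "ё"
    + "".join(chr(c) for c in range(0x436, 0x450))  # ж..я
)
alphabet = lower + lower.upper()

# Write the ISO-8859-5 bytes to a file, as was done with the convert() output.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(alphabet.encode("iso8859_5"))
    path = f.name

# Read the file back and decode it the way `iconv -f iso-8859-5` would.
with open(path, "rb") as f:
    raw = f.read()
os.remove(path)

decoded = raw.decode("iso8859_5")
print(len(raw), decoded == alphabet)
```

Every letter takes exactly one byte in the file, and decoding gives back the original string, which matches what iconv reported.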

convert(..., 'iso-8859-5', 'utf-8') looks fine; I checked it like this:
seb=> set standard_conforming_strings TO on; --- do not escape bytea
SET
seb=> select
convert('\320\321\322\323\324\325\361\326\327\330\331\332\333\334\335\336\337\340\341\342\343\344\345\346\347\350\351\352\353\354\355\356\357\260\261\262\263\264\265\241\266\267\270\271\272\273\274\275\276\277\300\301\302\303\304\305\306\307\310\311\312\313\314\315\316\317', 'iso-8859-5', 'utf-8');

convert
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

\320\260\320\261\320\262\320\263\320\264\320\265\321\221\320\266\320\267\320\270\320\271\320\272\320\273\320\274\320\275\320\276\320\277\321\200\321\201\321\202\321\203\321\204\321\205\321\206\321\207\321\210\321\211\321\212\321\213\321\214\321\215\321\216\321\217\320\220\320\221\320\222\320\223\320\224\320\225\320\201\320\226\320\227\320\230\320\231\320\232\320\233\320\234\320\235\320\236\320\237\320\240\320\241\320\242\320\243\320\244\320\245\320\246\320\247\320\250\320\251\320\252\320\253\320\254\320\255\320\256\320\257
(1 row)

seb=> set standard_conforming_strings TO off; --- now the bytea must be written as an escape string to show it as text
SET
seb=> select
E'\320\260\320\261\320\262\320\263\320\264\320\265\321\221\320\266\320\267\320\270\320\271\320\272\320\273\320\274\320\275\320\276\320\277\321\200\321\201\321\202\321\203\321\204\321\205\321\206\321\207\321\210\321\211\321\212\321\213\321\214\321\215\321\216\321\217\320\220\320\221\320\222\320\223\320\224\320\225\320\201\320\226\320\227\320\230\320\231\320\232\320\233\320\234\320\235\320\236\320\237\320\240\320\241\320\242\320\243\320\244\320\245\320\246\320\247\320\250\320\251\320\252\320\253\320\254\320\255\320\256\320\257';
?column?
--------------------------------------------------------------------
абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
(1 row)

This is OK.
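
The octal dumps above can be cross-checked mechanically. This is a rough sketch, assuming Python 3; the helper names to_octal and from_octal are hypothetical and only imitate the backslash-octal form that psql prints for these bytea values.

```python
import re

def to_octal(data: bytes) -> str:
    """Render every byte as a backslash followed by three octal digits."""
    return "".join("\\%03o" % b for b in data)

def from_octal(text: str) -> bytes:
    """Parse a string made only of backslash-octal escapes back into bytes."""
    return bytes(int(m, 8) for m in re.findall(r"\\([0-7]{3})", text))

# Four bytes taken from the ISO-8859-5 input: а, б, в and ё.
iso_dump = r"\320\321\322\361"

# Same pivot the server is asked to do: ISO-8859-5 bytes -> UTF-8 bytes.
utf8_bytes = from_octal(iso_dump).decode("iso8859_5").encode("utf-8")
print(to_octal(utf8_bytes))

# Reading the UTF-8 dump back as text, as the E'...' literal does.
print(from_octal(to_octal(utf8_bytes)).decode("utf-8"))
```

For these four letters the sketch yields \320\260\320\261\320\262\321\221, the same two-byte sequences that appear for а, б, в and ё in the convert output.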

The text string parameter is the Russian alphabet from the first letter to the last in lower
case, followed by the same letters in UPPER case.

Maybe I am doing something wrong?

---


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
