Unix Technical Forum

Unicode stuff

#1
02-23-2008, 10:47 AM
Howard J. Rogers
Unicode stuff

Can anyone help me pick my way through this? I thought I had all the Unicode
stuff more or less sussed out, but Windows is killing me.

Suppose I want to enter the word "größ" (that's g, r, o-umlaut, eszett...
never quite sure whether newsreaders make sense of this stuff).

Using the Windows Character Map, I find the following Unicode code points for
that:

g=0067
r=0072
ö=00F6
ß=00DF

I therefore try:

select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
from dual;

And get this result:

UNIS
----
gr÷¯

(g, r, the division symbol and something weird)
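
As a quick cross-check of the Character Map values, the short Python sketch below turns the four hex code points into characters and prints their official Unicode names. Python is used here only for illustration and is not part of the original exchange; the helper name `from_code_points` is made up for this example.

```python
import unicodedata

# Hex code points as read from the Windows Character Map in the post above.
CODE_POINTS = ["0067", "0072", "00F6", "00DF"]


def from_code_points(hex_values):
    """Build a string from hexadecimal Unicode code points (hypothetical helper)."""
    return "".join(chr(int(h, 16)) for h in hex_values)


word = from_code_points(CODE_POINTS)

for ch in word:
    # unicodedata.name() returns the official Unicode character name.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The names confirm that 00F6 and 00DF are o-umlaut and sharp s, so the values themselves look right; whatever goes wrong would have to happen later, when the result is converted and displayed.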

Obviously I'm doing my select in SQL*Plus, so font display could be an
issue. But I can't see why, because I can do this:

select 'größ' from dual; and it works just fine:

SQL> select 'größ' from dual;

'GRö-'
-------
größ

(Although the column heading is a bit suspect, the characters in the string
are displaying fine, so I don't see that it can be a font/display issue).

I suspect a mismatch between Windows' Unicode encoding and Oracle's (I'm
using AL32UTF8 as my database character set when I'm doing this. I thought
Windows was UTF8, too, but maybe not. Besides, I get the same sort of mess
in Linux). If that's the case (and I'm open to other offers) where do I find
out what Unicode codes to supply to get the required characters? If not
Windows Character Map, where? Or is my methodology entirely dodgy from the
word go?

All answers gratefully received, though the KISS principle wins plaudits.

I would also be interested in receiving any suggestions for interesting
(possibly mildly racy) foreign words I could impress (but not insult)
students with if I ever get the chance! The German for a motorway speed
limit sign is too long to type in a hurry, however. Pithy is better. Funny
is better still.

Regards
HJR


#2
02-23-2008, 10:47 AM
Michel Cadot
Re: Unicode stuff


"Howard J. Rogers" <hjr@dizwell.com> wrote in message
news:40bc1ea1$0$8987$afc38c87@news.optusnet.com.au ...
> Can anyone help me pick my way through this? I thought I had all the Unicode
> stuff more or less sussed out, but Windows is killing me.
>
> Suppose I want to enter the word "größ" (that's g, r, o-umlaut, eszett...
> never quite sure whether newsreaders make sense of this stuff).
>
> Using the Windows Character Map, I find the following Unicode code points for
> that:
>
> g=0067
> r=0072
> ö=00F6
> ß=00DF
>
> I therefore try:
>
> select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
> from dual;
>
> And get this result:
>
> UNIS
> ----
> gr÷¯
>
> (g, r, the division symbol and something weird)
>
> Obviously I'm doing my select in SQL*Plus, so font display could be an
> issue. But I can't see why, because I can do this:
>
> select 'größ' from dual; and it works just fine:
>
> SQL> select 'größ' from dual;
>
> 'GRö-'
> -------
> größ
>
> (Although the column heading is a bit suspect, the characters in the string
> are displaying fine, so I don't see that it can be a font/display issue).
>
> I suspect a mismatch between Windows' Unicode encoding and Oracle's (I'm
> using AL32UTF8 as my database character set when I'm doing this. I thought
> Windows was UTF8, too, but maybe not. Besides, I get the same sort of mess
> in Linux). If that's the case (and I'm open to other offers) where do I find
> out what Unicode codes to supply to get the required characters? If not
> Windows Character Map, where? Or is my methodology entirely dodgy from the
> word go?
>
> All answers gratefully received, though the KISS principle wins plaudits.
>
> I would also be interested in receiving any suggestions for interesting
> (possibly mildly racy) foreign words I could impress (but not insult)
> students with if I ever get the chance! The German for a motorway speed
> limit sign is too long to type in a hurry, however. Pithy is better. Funny
> is better still.
>
> Regards
> HJR
>


It works for me:

SQL> select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
2 from dual;

UNIS
----
größ

1 row selected.

It's an Oracle 9.2.0.4 on French WinNT4 SP6 (code page 1252).
DB charset is WE8MSWIN1252 and ncharset UTF8.

AFAIK, Windows uses UCS2 and not UTF8.
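
To illustrate the difference between a code point and its encoded bytes, here is a small Python sketch (an illustration only, not something that was run in this thread). It encodes the same four characters as UTF-8, as UTF-16 little-endian (which gives the same bytes as UCS-2 for these characters), and as Windows code page 1252.

```python
word = "gr\u00f6\u00df"  # g, r, o-umlaut, sharp s

# Same code points, three different byte representations.
utf8_bytes = word.encode("utf-8")
utf16_bytes = word.encode("utf-16-le")
cp1252_bytes = word.encode("cp1252")

print("UTF-8    :", utf8_bytes.hex())
print("UTF-16LE :", utf16_bytes.hex())
print("CP1252   :", cp1252_bytes.hex())
```

The Character Map values 00F6 and 00DF are code points, so they stay the same whichever encoding is in play; only the bytes on the wire differ.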

--
Regards
Michel Cadot


#3
02-23-2008, 10:47 AM
Howard J. Rogers
Re: Unicode stuff


"Michel Cadot" <micadot{at}altern{dot}org> wrote in message
news:40bc3f76$0$12086$626a14ce@news.free.fr...

> It works for me:
>
> SQL> select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
> 2 from dual;
>
> UNIS
> ----
> größ
>
> 1 row selected.
>
> It's an Oracle 9.2.0.4 on French WinNT4 SP6 (code page 1252).
> DB charset is WE8MSWIN1252 and ncharset UTF8.
>
> AFAIK, Windows uses UCS2 and not UTF8.



Right.... that's nice for you. But it doesn't exactly help me a great deal,
does it?

I have tried it with precisely that combination of character sets, and no
joy. So do I have to apply for a French passport for it to work, or what?

I am unaware of anything to do with settings for code pages in Win2KAS, but
I'm sure there must be. Somewhere. But that doesn't explain Linux not
playing ball either. But leave that out of it for the moment. Do I have to
do something to Windows to make it display properly, or what? Do you, or
indeed anyone else, have anything perhaps a tad more concrete by way of
suggestions for a fix?

But thank you for taking the time to check my original syntax was fine. And
also proving that I wasn't completely loopy using Character Map to determine
the Unicode values. That was very valuable indeed.

Regards
HJR


#4
02-23-2008, 10:47 AM
Chris O
Re: Unicode stuff

"Howard J. Rogers" <hjr@dizwell.com> wrote in message
news:40bc54a0$0$3035$afc38c87@news.optusnet.com.au ...
>
> "Michel Cadot" <micadot{at}altern{dot}org> wrote in message
> news:40bc3f76$0$12086$626a14ce@news.free.fr...
>
> > It works for me:
> >
> > SQL> select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
> > 2 from dual;
> >
> > UNIS
> > ----
> > größ
> >
> > 1 row selected.
> >
> > It's an Oracle 9.2.0.4 on French WinNT4 SP6 (code page 1252).
> > DB charset is WE8MSWIN1252 and ncharset UTF8.
> >
> > AFAIK, Windows uses UCS2 and not UTF8.
>
> Right.... that's nice for you. But it doesn't exactly help me a great deal,
> does it?
>
> I have tried it with precisely that combination of character sets, and no
> joy. So do I have to apply for a French passport for it to work, or what?
>
> I am unaware of anything to do with settings for code pages in Win2KAS, but
> I'm sure there must be. Somewhere. But that doesn't explain Linux not
> playing ball either. But leave that out of it for the moment. Do I have to
> do something to Windows to make it display properly, or what? Do you, or
> indeed anyone else, have anything perhaps a tad more concrete by way of
> suggestions for a fix?
>
> But thank you for taking the time to check my original syntax was fine. And
> also proving that I wasn't completely loopy using Character Map to determine
> the Unicode values. That was very valuable indeed.
>
> Regards
> HJR

Hi Howard.

I'll have a stab at it.

What is NLS_LANG set to? Hopefully its character-set part is not AL32UTF8? If
it is, then your problem is that you have told Oracle "not to perform any
character-set conversions" for you. Consequently all UTF-8 chars are passed
thru "as is", which is going to confuse the hell out of SQL*Plus, which would
be expecting something in a Windows character set and not UTF-8 sequences.
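
The pass-through effect described above can be sketched in Python (illustration only). It assumes the receiving program interprets the bytes as Windows code page 1252, which is an assumption and not something confirmed in this thread.

```python
word = "gr\u00f6\u00df"

# With no conversion, the client receives the raw UTF-8 bytes ...
raw = word.encode("utf-8")

# ... and a program expecting code page 1252 reads each byte as one character.
misread = raw.decode("cp1252")

print(misread)
```

Each accented character turns into two unrelated ones, which is the usual signature of unconverted UTF-8. The original report shows one wrong character per letter instead, so this may not be the whole story here.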

Cheers Chris

PS. Go easy on the French.


#5
02-23-2008, 10:47 AM
VC
Re: Unicode stuff

Hello Howard,

"Howard J. Rogers" <hjr@dizwell.com> wrote in message
news:40bc54a0$0$3035$afc38c87@news.optusnet.com.au ...
>
> "Michel Cadot" <micadot{at}altern{dot}org> wrote in message
> news:40bc3f76$0$12086$626a14ce@news.free.fr...
>
> > It works for me:
> >
> > SQL> select unistr('\0067')||unistr('\0072')||unistr('\00F6')||unistr('\00DF')
> > 2 from dual;
> >
> > UNIS
> > ----
> > größ
> >
> > 1 row selected.
> >
> > It's an Oracle 9.2.0.4 on French WinNT4 SP6 (code page 1252).
> > DB charset is WE8MSWIN1252 and ncharset UTF8.
> >
> > AFAIK, Windows uses UCS2 and not UTF8.
>
> Right.... that's nice for you. But it doesn't exactly help me a great deal,
> does it?
>
> I have tried it with precisely that combination of character sets, and no
> joy. So do I have to apply for a French passport for it to work, or what?
>
> I am unaware of anything to do with settings for code pages in Win2KAS, but
> I'm sure there must be. Somewhere. But that doesn't explain Linux not
> playing ball either. But leave that out of it for the moment. Do I have to
> do something to Windows to make it display properly, or what? Do you, or
> indeed anyone else, have anything perhaps a tad more concrete by way of
> suggestions for a fix?
>
> But thank you for taking the time to check my original syntax was fine. And
> also proving that I wasn't completely loopy using Character Map to determine
> the Unicode values. That was very valuable indeed.
>
> Regards
> HJR


I assume you are using Windows 2k/XP.

The GUI sqlplus (sqlplusw) should work without any changes as its code page
already includes 'pictures' for the Unicode characters.

The console sqlplus version needs some tweaks:

1. Change the code page to 1252:

chcp 1252

2. Change the font to "Lucida Console" (via the console window's 'Properties'
menu).

3. Run your SQL.
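
Why `chcp 1252` matters can be sketched in Python (illustration only). The sketch assumes the client hands SQL*Plus bytes in code page 1252 and that the console was previously on OEM code page 437; both are assumptions about the setup, not facts stated at this point in the thread.

```python
word = "gr\u00f6\u00df"

# Bytes as a client working in Windows code page 1252 would receive them.
data = word.encode("cp1252")

# The same bytes drawn by a console using OEM code page 437 ...
shown_437 = data.decode("cp437")
# ... and by a console switched to code page 1252.
shown_1252 = data.decode("cp1252")

print("cp437 :", shown_437)
print("cp1252:", shown_1252)
```

Under 437 the byte for ö is drawn as the division sign, which matches the symptom in the first post, and the byte for ß is drawn as a block character. Under 1252 both come out as intended.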


Regards.

VC



#6
02-23-2008, 10:47 AM
Michel Cadot
Re: Unicode stuff


"Howard J. Rogers" <hjr@dizwell.com> wrote in message
news:40bc54a0$0$3035$afc38c87@news.optusnet.com.au ...
>
> Right.... that's nice for you. But it doesn't exactly help me a great deal,
> does it?
>
> I have tried it with precisely that combination of character sets, and no
> joy. So do I have to apply for a French passport for it to work, or what?
>
> I am unaware of anything to do with settings for code pages in Win2KAS, but
> I'm sure there must be. Somewhere. But that doesn't explain Linux not
> playing ball either. But leave that out of it for the moment. Do I have to
> do something to Windows to make it display properly, or what? Do you, or
> indeed anyone else, have anything perhaps a tad more concrete by way of
> suggestions for a fix?
>
> But thank you for taking the time to check my original syntax was fine. And
> also proving that I wasn't completely loopy using Character Map to determine
> the Unicode values. That was very valuable indeed.
>
> Regards
> HJR
>
>


It seems to me you are being a little sarcastic, but my English is too basic to catch
the finer points of your message.
The purpose of my post was just to say that nothing inherent in Windows prevents your
example from working, so there is hope of making it work in your environment.
I specified that I use WinNT4 and French to point out the differences between my
configuration and yours and to suggest some directions to investigate, but it is beyond
my skills to tell you exactly where to look. I just gave you some facts.
In summary, I tried to help you as best I can, even if it is not much.

--
Regards
Michel Cadot


#7
02-23-2008, 10:47 AM
Paul Moore
Re: Unicode stuff

"VC" <boston103@hotmail.com> writes:

> I assume you are using Windows 2k/XP.
>
> The GUI sqlplus (sqlplusw) should work without any changes as its code page
> already includes 'pictures' for the Unicode characters.
>
> The console sqlplus version needs some tweaks:
>
> 1. Change the code page to 1252:
>
> chcp 1252


I can confirm that this is the change that does the trick. I get the
same results as you (chcp shows my codepage is 437 by default).
Changing the codepage to 1252 works fine.

To check your codepage, type "chcp" into the DOS prompt, and it will
tell you what your current codepage is.
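
With 437 as the console default, one plausible explanation (a guess, not confirmed by anyone in this thread) for why the typed literal 'größ' displayed correctly while its column heading did not is a round trip: the console encodes the typed text in code page 437, the client treats those bytes as code page 1252, and the console decodes whatever comes back with 437 again. The Python sketch below walks through that, assuming the heading is upper-cased along the way.

```python
typed = "gr\u00f6\u00df"

# Bytes produced by a console in code page 437 for the typed literal.
sent = typed.encode("cp437")

# A client assuming code page 1252 sees different characters in those bytes.
as_client_sees = sent.decode("cp1252")

# Data row: the bytes come back unchanged, so the console shows the original.
row = sent.decode("cp437")

# Heading (assumed to be upper-cased as cp1252 text): the last byte changes,
# so the console draws a different character in that position.
heading = as_client_sees.upper().encode("cp1252").decode("cp437")

print("row    :", row)
print("heading:", heading)
```

The heading comes out as GRö followed by a box-drawing character, which resembles the odd 'GRö-' heading in the first post, while the data row survives the round trip intact.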

Paul.

PS I agree - you were a bit harsh on Michel Cadot. "It works for me"
can often be useful input - in this case, he specifically mentioned
he is using codepage 1252, it's just that you needed the additional
help to explain how to check your codepage and change it as
needed...
--
This signature intentionally left blank
#8
02-23-2008, 10:48 AM
Howard J. Rogers
Re: Unicode stuff


"Michel Cadot" <micadot{at}altern{dot}org> wrote in message
news:40bc7b46$0$12112$626a14ce@news.free.fr...
>
> "Howard J. Rogers" <hjr@dizwell.com> wrote in message
> news:40bc54a0$0$3035$afc38c87@news.optusnet.com.au ...
> >
> > Right.... that's nice for you. But it doesn't exactly help me a great deal,
> > does it?
> >
> > I have tried it with precisely that combination of character sets, and no
> > joy. So do I have to apply for a French passport for it to work, or what?
> >
> > I am unaware of anything to do with settings for code pages in Win2KAS, but
> > I'm sure there must be. Somewhere. But that doesn't explain Linux not
> > playing ball either. But leave that out of it for the moment. Do I have to
> > do something to Windows to make it display properly, or what? Do you, or
> > indeed anyone else, have anything perhaps a tad more concrete by way of
> > suggestions for a fix?
> >
> > But thank you for taking the time to check my original syntax was fine. And
> > also proving that I wasn't completely loopy using Character Map to determine
> > the Unicode values. That was very valuable indeed.
> >
> > Regards
> > HJR
>
> It seems to me you are being a little sarcastic, but my English is too basic
> to catch the finer points of your message.


It is quite possible that some sarcasm spilled over into my reply to you,
and in that case I apologise, although the comment about a French passport
was supposed to be humour not sarcasm. But the last paragraph was from the
heart and meant well. I had no confidence that 'pull codes from Windows
Character Map' was a valid method, and you confirmed that it was. And I had
lost all confidence about the syntax used to specify Unicode characters in
the first place, and you confirmed I had the syntax right. Those were two
important things I needed to know, so as I said "that was very valuable
indeed" and I meant it.

> The purpose of my post was just to say that nothing inherent in Windows
> prevents your example from working, so there is hope of making it work in
> your environment.

And it was *that* which I said was very valuable, and I meant it.

> I specified that I use WinNT4 and French to point out the differences between
> my configuration and yours and to suggest some directions to investigate, but
> it is beyond my skills to tell you exactly where to look. I just gave you
> some facts.


And I said the facts (that Character Map codes work, that my syntax was
fine) were very valuable, and I meant it. But I also needed something rather
more concrete to go on, or 'practical' if you prefer, as you point out.

> In summary, I tried to help you as best I can, even if it is not much.


And, in summary, as far as you went was very valuable; I said it; and I
meant it.

I can only say it was very valuable indeed again, and thank you for it.

Regards
HJR




>
> --
> Regards
> Michel Cadot
>
>



#9
02-23-2008, 10:48 AM
Howard J. Rogers
Re: Unicode stuff


"VC" <boston103@hotmail.com> wrote in message
news:W9_uc.30724$eY2.28332@attbi_s02...
> I assume you are using Windows 2k/XP.
>
> The GUI sqlplus (sqlplusw) should work without any changes as its code page
> already includes 'pictures' for the Unicode characters.
>
> The console sqlplus version needs some tweaks:
>
> 1. Change the code page to 1252:
>
> chcp 1252



Where do you find this stuff? I've been administering Windows for many
years, and I can swear on a stack of Bibles that not once have I ever typed
such a command or known that it could be done!!!

All I can say is 'thank you very much' because it works 100% and I am now a
happy man.

Now... do you know what the equivalent on Linux would be?? :-)

Regards
HJR


#10
02-23-2008, 10:48 AM
Howard J. Rogers
Re: Unicode stuff


"Paul Moore" <pf_moore@yahoo.co.uk> wrote in message
news:hdtvdz06.fsf@yahoo.co.uk...

> To check your codepage, type "chcp" into the DOS prompt, and it will
> tell you what your current codepage is.
>
> Paul.


Well, that's a neat trick too. So thank you.

> PS I agree - you were a bit harsh on Michel Cadot. "It works for me"
> can often be useful input - in this case, he specifically mentioned
> he is using codepage 1252, it's just that you needed the additional
> help to explain how to check your codepage and change it as
> needed...


Well, as I've said elsewhere: he told me the Character Map technique was
fine, and my syntax was fine. And I thanked him for that, and told him that
knowing those things was very valuable to me. Which bit of that was 'harsh'?
It is possible I grew up on too much Blackadder as a child, and sarcasm
drips into everything I write, but if so it wasn't intended as such. The
only line in my post which I recall being a bit peeved as I wrote it was the
very first one. After I wrote it, I made a cup of tea, came back and
finished the post off with what I hoped was a joke about the French
passport, and wrote nothing (I thought) very exceptionable after that.

If you, or indeed Michel, or indeed anyone else has taken it another way,
then I can only assure you: it wasn't written in that frame of mind, and
wasn't intended to be read that way, with the acknowledged exception of the
first line, which I should have cut.

Regards
HJR









All times are GMT.

