Unix Technical Forum

Problem with Unicode Strings

#1
04-15-2008, 08:33 PM
André Hartmann

Hi,
I am having trouble storing Unicode strings in my database and reading them
back. I will briefly describe what I am doing, hoping that someone can point
out my mistake and end my weeks of trying...

* Oracle 9i client and server, connecting via OCI from a C++ (MS Dev Studio
2005) application
* Client OS: German Windows XP, SP2
* Server: Oracle 9.2.0.1 Enterprise on Windows XP SP2, NLS_LANGUAGE=AMERICAN
* Server character set (determined via "select value from
nls_database_parameters where parameter='NLS_CHARACTERSET';"): WE8MSWIN1252
* Client character set GERMAN_GERMANY.WE8MSWIN1252 (determined via "sqlplus
/nolog" and @.[%NLS_LANG%])

This is what my program does to insert the Unicode string:

std::wstring strUnicode(L"Âbç©¾ü??déñf??"); // not correctly displayed here because this message is not unicode
....
// Duplicate the wide string and compute its size in bytes, terminator included.
wchar_t* pValue = (pWS_ == NULL) ? NULL : _wcsdup( strUnicode.c_str() );
size_t nSize = (pValue == NULL) ? 0 : (sizeof(wchar_t) * (1 + wcslen(pValue)));
....
// Bind the buffer by position as a null-terminated string.
swdReturnCode = OCIBindByPos (
    (OCIStmt*) m_pOCIStatement,
    (OCIBind**) &pBindHandle,
    (OCIError*) m_pConnection->_getOCIError(),
    (ub4) columnIndex_,
    (dvoid*) pValue,
    (sb4) nSize,
    (ub2) SQLT_STR,
    (dvoid*) NULL,
    (ub2*) NULL,
    (ub2*) NULL,
    (ub4) 0,
    (ub4*) NULL,
    (ub4) OCI_DEFAULT
);
// Declare the bound buffer to be UTF-16.
ub2 csid = OCI_UTF16ID;
swdReturnCode = OCIAttrSet(
    (void *) pBindHandle,
    (ub4) OCI_HTYPE_BIND,
    (void *) &csid,
    (ub4) 0,
    (ub4) OCI_ATTR_CHARSET_ID,
    m_pConnection->_getOCIError()
);
....

The string that I am inserting corresponds to the following sequence of
integers in my program's main memory:

194,98,231,169,190,252,1046,950,100,233,241,102,1715,1492

When I select I get the following back (displayed as "Âbç©¾ü¿¿déñf¿¿"):

194,98,231,169,190,252,191,191,100,233,241,102,191,191

So it turns out that some characters have been turned into upside-down
question marks. I see the same thing when I fetch not with my application
but with the Oracle Enterprise Manager.

What is wrong here? Am I missing some important conversion along the way?
Thanks in advance,
André





#2
04-15-2008, 08:33 PM
Carlos
I cannot see the point of storing UNICODE (UTF8? UTF16?) in a
WE8MSWIN1252 database... (you don't specify the NLSupport codepage for
NVARCHARS/NCHARS)

The DB codepage should be unicode (AL32UTF8).
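
To illustrate why a Unicode database character set would help: AL32UTF8 is Oracle's name for UTF-8, and every code point from the first post has a UTF-8 encoding. Below is a minimal sketch in plain C++ (no OCI involved). The helper names `encodeUtf8` and `encodeAll` are hypothetical and only serve the illustration; they are not part of any Oracle API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// Surrogates and values above U+10FFFF are not valid scalar values;
// this sketch substitutes U+FFFD for them.
std::string encodeUtf8(std::uint32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    std::string out;
    if (cp < 0x80) {                       // 1 byte: ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encode a whole sequence of code points.
std::string encodeAll(const std::vector<std::uint32_t>& codePoints) {
    std::string out;
    for (std::uint32_t cp : codePoints) {
        out += encodeUtf8(cp);
    }
    return out;
}
```

Feeding `encodeAll` the 14 values from the first post gives 25 bytes: the three ASCII characters take one byte each and the other eleven take two, so nothing has to be replaced.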

Cheers.

Carlos.
#3
04-15-2008, 08:33 PM
André Hartmann

"Carlos" <miotromailcarlos@netscape.net> wrote in message
news:c03f11e2-8cd4-41ff-bba6-b0a0f6c3c7e7@m3g2000hsc.googlegroups.com...
> I cannot see the point of storing UNICODE (UTF8? UTF16?) in a
> WE8MSWIN1252 database... (you don't specify the NLSupport codepage for
> NVARCHARS/NCHARS)
>
> The DB codepage should be unicode (AL32UTF8).
>

Hi, if that is so (Unicode cannot be stored in databases that do not
have a Unicode character set), then why can I create tables with Unicode
columns (NCHAR, NVARCHAR2, NCLOB) in such databases? Wouldn't it be more
appropriate for Oracle to raise an error when I try? To put it the
other way round, the sheer fact that it is possible to declare Unicode
columns in the database implied to me that it is possible to store such
values. Am I wrong here?

André



#4
04-17-2008, 04:09 PM
Carlos

If you are using NVARCHAR2/NCHAR columns, you are probably using
AL16UTF16 as the national character set for them.

Can you confirm?

Cheers.

Carlos.
#5
04-17-2008, 04:09 PM
joel garry

On Apr 15, 8:26 am, "André Hartmann" <andrehartm...@hotmail.com>
wrote:
> Hi, if that is so (unicode cannot be stored into databases that do not
> have a unicode character set) then why can I create tables with unicode
> columns (NCHAR, NVARCHAR2, NCLOB) in such databases? Wouldn't it be more
> appropriate then for Oracle to raise an error when I try? To put it the
> other way round, the sheer fact that it is possible to declare unicode
> columns in the database implied to me that it is possible to store such
> values. Am I wrong here?

You are missing the point about Oracle being helpful and friendly and
way, way helpful and way, way, way friendly about converting from one
character set to another. Most tools honor that, but some don't. But
the general way to not get the conversion is to have the proper NLS
environment, as well as the proper character set. In general, using a
character set that does not support what you are putting into it will
cause you grief one way or another. Unicode is designed to cover the
characters of all of these sets (subject to which Unicode version you
are using; there are many version-dependent issues about that).
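
The 191s in André's result look like that kind of grief: each input value that has no slot in a single-byte Western European set came back as 191 (0xBF, the inverted question mark). Here is a minimal sketch of such a lossy conversion in plain C++. It assumes, as a simplification, that only ASCII and the range 0xA0 to 0xFF survive; real Windows-1252 also maps some characters (the euro sign, for example) into 0x80 to 0x9F, which is left out. The names `toSingleByte` and `convertLossy` are hypothetical, and the code models the observed behaviour, not Oracle's actual conversion routine.

```cpp
#include <cstdint>
#include <vector>

// Replacement byte seen in the thread's output: 191 (inverted question mark).
constexpr std::uint8_t kReplacement = 0xBF;

// Hypothetical helper: map one code point to a single byte, or to the
// replacement byte when the simplified target set cannot represent it.
std::uint8_t toSingleByte(std::uint32_t cp) {
    const bool isAscii = cp < 0x80;
    const bool isLatin1Upper = cp >= 0xA0 && cp <= 0xFF;  // same values in Windows-1252
    return (isAscii || isLatin1Upper) ? static_cast<std::uint8_t>(cp)
                                      : kReplacement;
}

// Hypothetical helper: convert a whole sequence, losing what does not fit.
std::vector<std::uint8_t> convertLossy(const std::vector<std::uint32_t>& codePoints) {
    std::vector<std::uint8_t> out;
    out.reserve(codePoints.size());
    for (std::uint32_t cp : codePoints) {
        out.push_back(toSingleByte(cp));
    }
    return out;
}
```

Run on the 14 input values from the first post, `convertLossy` reproduces the sequence André selected back, with 1046, 950, 1715 and 1492 each turned into 191. That suggests the bound data is being converted to the WE8MSWIN1252 database character set somewhere along the way. For NCHAR-type columns, the OCI documentation describes a separate bind attribute, OCI_ATTR_CHARSET_FORM (value SQLCS_NCHAR); whether that is what is missing here is a guess worth checking.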

It is not an error because it is a feature for Oracle to be able to
handle different character sets. That puts it upon you to understand
the implications. Please read the docs about globalization, as well
as the metalink docs that help you understand NLS. It can get quite
involved, though usually the answer for a particular situation winds
up being simple.

jg
--
@home.com is bogus.
http://paulschreiber.com/blog/2008/0...n-translation/

