Re: Patch for collation using ICU (pgsql Hackers forum, PostgreSQL category)
Useful if it's going to support earlier releases of ICU.

Not all OSes come with ICU 3.2. Debian, for example, currently has 2.1 in testing and 2.6 in unstable.

... John

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto
> Palle Girgensohn
> Sent: Friday, March 25, 2005 10:40 AM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Patch for collation using ICU
>
> Hi!
>
> I've put together a patch for using IBM's ICU package for collation.
>
> If your OS does not have full support for collation or
> uppercase/lowercase in multibyte locales, this might be
> useful. If you are using a multibyte character encoding in
> your database and want collation, i.e. ORDER BY, and also
> lower(), upper() and initcap() to work properly, this patch
> will do just that.
>
> This patch is needed for FreeBSD, since this OS has no
> support for collation of, for example, Unicode locales (that
> is, wcscoll(3) does not do what you expect if you set
> LC_ALL=sv_SE.UTF-8). AFAIK the patch is *not*
> necessary for Linux, although IBM claims ICU collation to be
> about twice as fast as glibc for simple western locales.
>
> It adds a configure switch, `--with-icu', which will set up
> the code to use ICU instead of wchar_t and wcscoll.
>
> This has been tested only on FreeBSD-4.11 & FreeBSD-5-stable,
> where it seems to run well. I've not had the time to do any
> comparative performance tests yet, but it seems it is at
> least not slower than using LATIN1 with the
> sv_SE.ISO8859-1 locale, perhaps even faster.
>
> I'd be delighted if some more experienced postgresql hackers
> would review this stuff. The patch is pretty compact, so it's
> fast reading [...]
> (tagged "experimental") to FreeBSD's postgresql port. Any
> ideas about whether this is a good idea or not?
>
> Any thoughts or ideas are welcome!
>
> Cheers,
> Palle
>
> Patch at:
> <http://people.freebsd.org/~girgen/po...u/pg-801-icu-2005-03-14.diff>
>
> ICU at sourceforge: <http://icu.sf.net/>
--On Friday, March 25, 2005 16.34.41 +1100 John Hansen <john@geeknet.com.au> wrote:

> Useful if it's going to support earlier releases of ICU....
>
> Not all os's come with ICU3.2, debian for example, currently has 2.1 in
> testing, and 2.6 in unstable.

Oh, OK. FreeBSD has only 3.2 as a port. I can check the older versions; I doubt it would make too much difference. Some autoconf sorcery needed, perhaps.

/Palle