Re: [GENERAL] plperl and regexps with accented characters- incompatible? (pgsql Hackers forum, PostgreSQL category)
Andrew Dunstan wrote:
>
> Greg Sabino Mullane wrote:
>>
>> Yes, we might want to consider making utf8 come pre-loaded for
>> plperl. There is no direct or easy way to do it (we don't have
>> finer-grained control than the 'require' opcode), but we could
>> probably dial back restrictions, 'use' it, and then reset the Safe
>> container to its defaults. Not sure what other problems that may
>> cause, however. CCing to hackers for discussion there.
>>
>
> UTF8 is automatically on for strings passed to plperl if the db
> encoding is UTF8. That includes the source text. Please be more
> precise about what you want.
>
> BTW, the perl docs say this about the utf8 pragma:
>
>     Do not use this pragma for anything else than telling Perl that
>     your script is written in UTF-8.
>
> There should be no need to do that - we will have done it for you. So
> any attempt to use the utf8 pragma in plperl code is probably broken
> anyway.
>

Ugh, in testing I see some nastiness here without any explicit require. It looks like there's an implicit require if the text contains certain chars. I'll see what I can do to fix the bug, although I'm not sure if it's possible.

cheers

andrew

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
| |||
Andrew Dunstan wrote:
>
> Ugh, in testing I see some nastiness here without any explicit
> require. It looks like there's an implicit require if the text
> contains certain chars. I'll see what I can do to fix the bug,
> although I'm not sure if it's possible.
>

Looks like it's going to be very hard, unless someone has some brilliant insight I'm missing :-( Maybe we need to consult the perl coders.

cheers

andrew
| |||
> Ugh, in testing I see some nastiness here without any explicit
> require. It looks like there's an implicit require if the text
> contains certain chars.

Exactly.

> Looks like it's going to be very hard, unless someone has some
> brilliant insight I'm missing :-(

The only way I see around it is to do:

    $PLContainer->permit('require');
    ...
    $PLContainer->reval('use utf8;');
    ...
    $PLContainer->deny('require');

Not ideal. Part of me says we do this because something like //i shouldn't suddenly fail just because you added an accented character. The other part of me says to just have people use plperlu. At the very least, we should probably mention it in the docs as a gotcha.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200711132155
http://biglumber.com/x/web?pk=2529DF...9B906714964AC8
| |||
Greg Sabino Mullane wrote:
>> Ugh, in testing I see some nastiness here without any explicit
>> require. It looks like there's an implicit require if the text
>> contains certain chars.
>
> Exactly.
>
>> Looks like it's going to be very hard, unless someone has some
>> brilliant insight I'm missing :-(
>
> The only way I see around it is to do:
>
>     $PLContainer->permit('require');
>     ...
>     $PLContainer->reval('use utf8;');
>     ...
>     $PLContainer->deny('require');
>
> Not ideal.

I tried something like that briefly and it failed. The trouble is, I think, that since the engine tries a require it fails on the op test before it even looks to see if the module is already loaded. If you have made something work then please show me, no matter how grotty.

> Part of me says we do this because something like //i
> shouldn't suddenly fail just because you added an accented
> character. The other part of me says to just have people use plperlu.
> At the very least, we should probably mention it in the docs as
> a gotcha.

I think we should search harder for a solution, but I don't have time right now. If you want to submit a warning for the docs in a patch we can get that in.

cheers

andrew
| |||
Andrew Dunstan <andrew@dunslane.net> writes:
> I tried something like that briefly and it failed. The trouble is, I
> think, that since the engine tries a require it fails on the op test
> before it even looks to see if the module is already loaded.

I think we have little choice but to report this as a Perl bug. It essentially means that a "safe" interpreter cannot decide to preload modules that it thinks are safe; and to add insult to injury, the engine is apparently trying to require utf8 in some very low-level, hidden-behind-the-scenes place, yet using high-level trappable operations to do that. Maybe those are two different bugs. Either utf8 is part of the Perl core or it isn't; you can't have it both ways.

regards, tom lane
| |||
Just as a followup, I reported this as a bug and it is being looked at and discussed:

http://rt.perl.org/rt3//Public/Bug/D....html?id=47576

Appears there is no easy resolution yet.

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200711281358
http://biglumber.com/x/web?pk=2529DF...9B906714964AC8
| |||
Andrew Dunstan <andrew@dunslane.net> writes:
> + * Fill in just enough information to set up this perl
> + * function in the safe container and call it.
> + * For some reason not entirely clear, it prevents errors that
> + * can arise from the regex code later trying to load
> + * utf8 modules.

How many versions of Perl have you tried this against?

regards, tom lane
| ||||
On Thu, 29 Nov 2007, Andrew Dunstan wrote:

> The version I tested against is 5.8.8 - the latest stable release. The
> 5.8 series started in 2003 from what I can see - if anyone has a
> sufficiently old system that they can test on 5.6.2 that will be useful.

I've got a 5.6.1 perl here, but it wasn't built shared, so I can't test plperl. I ran the test case Greg posted to the perl bug tracker and it doesn't fail, so unless you're concerned that your change will break 5.6, then it doesn't look like 5.6 needs a fix.

Kris Jurka