I'm noticing that some of my data has been imported as junk text. For instance: klciã«"

What would be the SQL to find data of this nature? My column can only have alphanumeric data, and the only symbols allowed are "-" and "_", so I tried this regexp query:

select id, t_code
from traders
where t_code ~ '[^A-Za-z1-9\-]'
limit 100;

But this starts to return values such as "181xn-807199", which is valid as per the above regexp?

Also, when I try to include the underscore, as follows...

select id, t_code
from traders
where t_code ~ '[^A-Za-z1-9\-\_]'
limit 100;

This gives me an error: "ERROR: invalid regular expression: invalid character range".

What am I missing? Does this have something to do with erroneous encodings? I want my data to be utf-8 but I do want to find it with latin1 queries when the text in columns is supposed to be only latin1 characters! Or is "a-z" in utf-8 considered different from "a-z" in latin1?

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
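On the closing question about encodings: the ASCII letters a-z have the same byte values in UTF-8 and Latin-1, so the range itself should not behave differently between the two. The junk in the example resembles what appears when UTF-8 bytes are read as Latin-1, although the thread does not confirm that this is what happened to the data. A minimal Python sketch of both points (Python is used here only for illustration; the variable names are hypothetical):

```python
# ASCII letters encode to the same bytes in UTF-8 and Latin-1.
ascii_text = "abcxyz"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("latin-1")

# A non-ASCII character does not: "ë" takes two bytes in UTF-8.
# Decoding those two bytes as Latin-1 yields two unrelated characters.
original = "ë"
misread = original.encode("utf-8").decode("latin-1")

print(same_bytes)
print(repr(misread), len(misread))
```

If the data was damaged this way, the junk characters fall outside the allowed set, so a negated character class is a reasonable way to look for them.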
| |||
"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:

> select id, t_code
> from traders
> where t_code ~ '[^A-Za-z1-9\-\_]'
> limit 100;
>
> This gives me an error: "ERROR: invalid regular expression: invalid
> character range".

Put the dash at the start of the character class: [^-A-Za-z1-9_]

> What am I missing?

In a character class expression the dash has a special meaning: it separates the endpoints of a range. If you need to match a literal dash, put it first in the class (right after the ^ when the class is negated).

Regards,
Manuel.
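To illustrate the suggestion above, here is a minimal sketch in Python. It uses Python's re module rather than PostgreSQL's regex engine, so it only shows how a leading dash is read as a literal character inside a class; the helper name has_invalid_char is made up for this example.

```python
import re

# Negated class with the literal dash placed first, as suggested above.
# Note: this is Python's re module, not PostgreSQL's regex engine.
invalid_char = re.compile(r"[^-A-Za-z1-9_]")

def has_invalid_char(code: str) -> bool:
    """Return True if code contains a character outside the allowed set."""
    return invalid_char.search(code) is not None

print(has_invalid_char("abc-def_12"))  # dash, underscore, letters, digits 1-9
print(has_invalid_char("klciã«"))      # contains non-ASCII junk characters
```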
| |||
On Aug 17, 2007, at 10:58, Phoenix Kiula wrote:

> What would be the SQL to find data of this nature? My column can only
> have alphanumeric data, and the only symbols allowed are "-" and "_",
> so I tried this regexp query:
>
> select id, t_code
> from traders
> where t_code ~ '[^A-Za-z1-9\-]'

If you're including - in a range as a character, doesn't it have to go first? Try this:

WHERE t_code ~ $re$[^-A-Za-z1-9_]$re$

Michael Glaesemann
grzm seespotcode net
| |||
On Fri, 17 Aug 2007, Michael Glaesemann wrote:

> On Aug 17, 2007, at 10:58, Phoenix Kiula wrote:
>
>> What would be the SQL to find data of this nature? My column can only
>> have alphanumeric data, and the only symbols allowed are "-" and "_",
>> so I tried this regexp query:
>>
>> select id, t_code
>> from traders
>> where t_code ~ '[^A-Za-z1-9\-]'
>
> If you're including - in a range as a character, doesn't it have to go first?
> Try this:
>
> WHERE t_code ~ $re$[^-A-Za-z1-9_]$re$
>
> Michael Glaesemann
> grzm seespotcode net

How about

WHERE t_code ~ $re$[^-A-Za-z0-9_]$re$

so that zeros are allowed?

Belinda
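The digit range also explains why "181xn-807199" was returned by the original query: the class 1-9 does not include 0, so the negated class matches the zero. A small sketch, again in Python's re module rather than PostgreSQL, with hypothetical variable names:

```python
import re

with_1_9 = re.compile(r"[^-A-Za-z1-9_]")  # digit range from the earlier replies
with_0_9 = re.compile(r"[^-A-Za-z0-9_]")  # digit range that includes zero

code = "181xn-807199"

# search() returns a match for the first character outside the class,
# or None when every character is allowed.
hit_1_9 = with_1_9.search(code)
hit_0_9 = with_0_9.search(code)

print(hit_1_9.group() if hit_1_9 else None)
print(hit_0_9)
```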
| ||||
[Please reply to the list so that others may benefit from and participate in the discussion.]

On Aug 17, 2007, at 12:50, Phoenix Kiula wrote:

> On 18/08/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
>> On Aug 17, 2007, at 10:58, Phoenix Kiula wrote:
>>
>>> What would be the SQL to find data of this nature? My column can
>>> only have alphanumeric data, and the only symbols allowed are "-"
>>> and "_", so I tried this regexp query:
>>>
>>> select id, t_code
>>> from traders
>>> where t_code ~ '[^A-Za-z1-9\-]'
>>
>> If you're including - in a range as a character, doesn't it have to
>> go first?
>> Try this:
>>
>> WHERE t_code ~ $re$[^-A-Za-z1-9_]$re$
>
> Thanks, yes, this is sweet!
>
> If I include this into a check constraint on the table, would that be
> very resource intensive for INSERTs and UPDATEs?

Maybe. I don't know. What's very? Measure, change, and measure again. Premature optimization and all that.

Michael Glaesemann
grzm seespotcode net