Unix Technical Forum

REGEXP Clarification

This is a discussion on REGEXP Clarification within the MySQL forums, part of the Database Server Software category; --> Greetings: I am trying to locate incorrect data in a column using REGEXP. Only numeric data is allowed but ...


Go Back   Unix Technical Forum > Database Server Software > MySQL

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-28-2008, 08:45 AM
Jeff Gardner
 
Posts: n/a
Default REGEXP Clarification

Greetings:

I am trying to locate incorrect data in a column using REGEXP. Only
numeric data is allowed but I am importing data, some of which is not
properly formatted. I am using:

SELECT *
FROM `table`
WHERE `column` REGEXP "[^[:digit:]]";

to locate any value that is not a number. I've tried several variations
and I get an empty set back. I know that there are illegal characters
in some of the data but I can't seem to locate it using this query. The
field can contain anywhere from 1 to 6 digits, if that is relevant.
Advice is much appreciated.
--

Regards,

Jeff Gardner
___________________________

"Contrary to popular belief, Unix is user friendly. It just happens
to be very selective about who its friends are." --Kyle Hearn
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-28-2008, 08:45 AM
Radoulov, Dimitre
 
Posts: n/a
Default Re: REGEXP Clarification


"Jeff Gardner" wrote...
> Greetings:
>
> I am trying to locate incorrect data in a column using REGEXP. Only
> numeric data is allowed but I am importing data, some of which is not
> properly formatted. I am using:
>
> SELECT *
> FROM `table`
> WHERE `column` REGEXP "[^[:digit:]]";
>
> to locate any value that is not a number. I've tried several variations
> and I get an empty set back. I know that there are illegal characters in
> some of the data but I can't seem to locate it using this query. The
> field can contain anywhere from 1 to 6 digits, if that is relevant. Advice
> is much appreciated.


Could you post an example of the "invalid" data,
on my 5.1.12 you query works.


Regards
Dimitre


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-28-2008, 08:45 AM
Jeff Gardner
 
Posts: n/a
Default Re: REGEXP Clarification

Radoulov, Dimitre wrote:
> "Jeff Gardner" wrote...
>> Greetings:
>>
>> I am trying to locate incorrect data in a column using REGEXP. Only
>> numeric data is allowed but I am importing data, some of which is not
>> properly formatted. I am using:
>>
>> SELECT *
>> FROM `table`
>> WHERE `column` REGEXP "[^[:digit:]]";
>>
>> to locate any value that is not a number. I've tried several variations
>> and I get an empty set back. I know that there are illegal characters in
>> some of the data but I can't seem to locate it using this query. The
>> field can contain anywhere from 1 to 6 digits, if that is relevant. Advice
>> is much appreciated.

>
> Could you post an example of the "invalid" data,
> on my 5.1.12 you query works.
>
>
> Regards
> Dimitre
>
>

The program that I am importing into tells me that one of the fields
contains a period. I am using mysql from the command line to "sanitize"
the data before final import. If I find it, I'll post it.

--

Regards,

Jeff Gardner
___________________________

"Contrary to popular belief, Unix is user friendly. It just happens
to be very selective about who its friends are." --Kyle Hearn
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-28-2008, 08:45 AM
Jeff Gardner
 
Posts: n/a
Default Re: REGEXP Clarification

Jeff Gardner wrote:
> Radoulov, Dimitre wrote:
>> "Jeff Gardner" wrote...
>>> Greetings:
>>>
>>> I am trying to locate incorrect data in a column using REGEXP. Only
>>> numeric data is allowed but I am importing data, some of which is not
>>> properly formatted. I am using:
>>>
>>> SELECT *
>>> FROM `table`
>>> WHERE `column` REGEXP "[^[:digit:]]";
>>>
>>> to locate any value that is not a number. I've tried several
>>> variations and I get an empty set back. I know that there are
>>> illegal characters in some of the data but I can't seem to locate it
>>> using this query. The field can contain anywhere from 1 to 6 digits,
>>> if that is relevant. Advice is much appreciated.

>>
>> Could you post an example of the "invalid" data,
>> on my 5.1.12 you query works.
>>
>>
>> Regards
>> Dimitre
>>

> The program that I am importing into tells me that one of the fields
> contains a period. I am using mysql from the command line to "sanitize"
> the data before final import. If I find it, I'll post it.
>

It turns out that there were 13 rows with a NULL value in the field in
question. NULL is not whitespace, nor is it digits. Any explanation as
to why the above query returned no rows?

--

Regards,

Jeff Gardner
___________________________

"Contrary to popular belief, Unix is user friendly. It just happens
to be very selective about who its friends are." --Kyle Hearn
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-28-2008, 08:45 AM
Radoulov, Dimitre
 
Posts: n/a
Default Re: REGEXP Clarification


"Jeff Gardner" wrote...
> Jeff Gardner wrote:
>> Radoulov, Dimitre wrote:
>>> "Jeff Gardner" wrote...
>>>> Greetings:
>>>>
>>>> I am trying to locate incorrect data in a column using REGEXP. Only
>>>> numeric data is allowed but I am importing data, some of which is not
>>>> properly formatted. I am using:
>>>>
>>>> SELECT *
>>>> FROM `table`
>>>> WHERE `column` REGEXP "[^[:digit:]]";
>>>>
>>>> to locate any value that is not a number. I've tried several
>>>> variations and I get an empty set back. I know that there are illegal
>>>> characters in some of the data but I can't seem to locate it using this
>>>> query. The field can contain anywhere from 1 to 6 digits, if that is
>>>> relevant. Advice is much appreciated.

[...]
> It turns out that there were 13 rows with a NULL value in the field in
> question. NULL is not whitespace, nor is it digits. Any explanation as
> to why the above query returned no rows?


Because of its special meaning, I suppose.

"Conceptually, NULL means "a missing unknown value" and it is treated
somewhat differently from other values. "

Unknown means that you cannot _know_ whether it's a digit or something else
(btw, whitespaces are handled correctly with your example query).

So,
a) Use constraints (not null)
or
b) Add "column is not null" in your query


Regards
Dimitre


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-28-2008, 08:45 AM
Radoulov, Dimitre
 
Posts: n/a
Default Re: REGEXP Clarification


"Radoulov, Dimitre" wrote...
>
> "Jeff Gardner" wrote...
>> Jeff Gardner wrote:
>>> Radoulov, Dimitre wrote:
>>>> "Jeff Gardner" wrote...
>>>>> Greetings:
>>>>>
>>>>> I am trying to locate incorrect data in a column using REGEXP. Only
>>>>> numeric data is allowed but I am importing data, some of which is not
>>>>> properly formatted. I am using:
>>>>>
>>>>> SELECT *
>>>>> FROM `table`
>>>>> WHERE `column` REGEXP "[^[:digit:]]";
>>>>>
>>>>> to locate any value that is not a number. I've tried several
>>>>> variations and I get an empty set back. I know that there are illegal
>>>>> characters in some of the data but I can't seem to locate it using
>>>>> this query. The field can contain anywhere from 1 to 6 digits, if
>>>>> that is relevant. Advice is much appreciated.

> [...]
>> It turns out that there were 13 rows with a NULL value in the field in
>> question. NULL is not whitespace, nor is it digits. Any explanation as
>> to why the above query returned no rows?

>
> Because of its special meaning, I suppose.
>
> "Conceptually, NULL means "a missing unknown value" and it is treated
> somewhat differently from other values. "




> Unknown means that you cannot _know_ whether it's a digit or something
> else (btw, whitespaces are handled correctly with your example query).



I ment "as you very likely expect", not "correctly"


Regards
Dimitre


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 11:03 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com