View Single Post

   
  #7 (permalink)  
Old 02-28-2008, 10:20 AM
Jerry Stuckle
 
Posts: n/a
Default Re: Importing from non-delineated text?

lyonnyte wrote:
> On Jul 31, 11:52 am, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Peter H. Coffin wrote:
>>> On Tue, 31 Jul 2007 12:56:50 -0400, Jerry Stuckle wrote:
>>>> lyonnyte wrote:
>>>>> Is there a way to import using LOAD DATA INFILE when the file is non-
>>>>> delineated?
>>>> How do you know where one field ends and another one begins?
>>> Often, in general, this kind of thing is done by column position. Which
>>> makes the answer, I believe, no.

>> Yes, that's a possibility (and one I'm surprised MySQL doesn't support).
>> But are there others?
>>
>> For instance, I've seen cases where you have a mix of numbers and text.
>> Numbers are separated by blanks, while text is enclosed in quotes.
>> Similar to CSV, except there is no comma between the values. What a
>> pain to parse :-)
>>
>> --
>> ==================
>> Remove the "x" from my email address
>> Jerry Stuckle
>> JDS Computer Training Corp.
>> jstuck...@attglobal.net
>> ==================

>
> Running into the same problem as Jerry, it's a mix of numbers and
> text. Let's say the file contains these 2 records.
>
> "John Doe" 38 "123 Spooner Street" 126-67-8888 "Jane Doe" 36 "124
> Spooner Street" 121-52-9982
>


Yep, this makes it more difficult.

Can you ever have embedded quote marks? If not, your work is much
easier - you can use a regex to do it.

If you may have embedded quotes, how are they identified? Are they
doubled (i.e. ""), or escaped (i.e. \")?

If they're doubled (like mine were) it's a bit harder. I wasn't able to
come up with a decent regex to handle it, but then I'm not a regex
expert, either. I ended up writing a short PHP program which parsed the
data, and since my files were small (a few thousand rows max), I just
inserted them right then. But if the files are larger, you could write
it out in CSV format and use LOAD DATA INFILE.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Reply With Quote