roch77@gmail.com wrote:
> Hi,
> I just loaded 71,000 records using "load data infile" and when done it
> gave no warnings, but said it skipped 4.
The docs refer to two cases where input rows may be "skipped." It's not
clear which case contributes to the "skipped" total in the load report,
or if both do.
In one case, if you specify the IGNORE keyword in your LOAD DATA
statement, rows are skipped if they contain values that would conflict
with existing primary key values in the table.
The docs also use the term "skipped" for the case when a row of the
input file doesn't match a prefix string you specify in your LOAD DATA
statement. Here's an excerpt from the doc page
http://dev.mysql.com/doc/refman/5.0/en/load-data.html:
----
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test
FIELDS TERMINATED BY ',' LINES STARTING BY 'xxx';
If the data file looks like this:
xxx"abc",1
something xxx"def",2
"ghi",3
The resulting rows will be ("abc",1) and ("def",2). The third row in the
file is skipped because it does not contain the prefix.
----
To test which rows were not imported, I can suggest creating a new table
`T2` with identical structure to the table `T` into which you just
loaded the file, but without any existing data, or primary key or
foreign key constraints. Then load your file into that table, and look
for "skipped: 0" in the report. Then run a query like the following to
find mismatches:
SELECT T2.*
FROM T RIGHT OUTER JOIN T2 ON T.primarykey = T2.primarykey
WHERE T.primarykey IS NULL;
Note this will not find cases where the rows loaded into T were skipped
because of the primary key conflict case I mentioned above. There will
be records in both tables with the same primary key, but the attribute
columns may have different values. So if the query above finds no
differences, try this:
SELECT T2.*
FROM T INNER JOIN T2 ON T.primarykey = T2.primarykey
WHERE T.othercolumn <> T2.othercolumn OR T.thirdcolumn <> T2.thirdcolumn
OR ...etc...;
Regards,
Bill K.