Unix Technical Forum

find out data types using sql or php

This is a discussion on find out data types using sql or php within the Pgsql General forums, part of the PostgreSQL category; --> Hey guys. This is probably a n00b question, but here goes anyway. I have a set of csv files ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql General

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-09-2008, 11:08 PM
Tom Hart
 
Posts: n/a
Default find out data types using sql or php

Hey guys. This is probably a n00b question, but here goes anyway.

I have a set of csv files that I COPY t o a number of import tables
(same field layout as live tables, but with all datatypes 'text') then
use an INSERT INTO ... SELECT FROM statement to transfer the rows over
to the live table (at times filtering out, at times not). Unfortunately
if any of the data is not type perfect (letters in number fields, etc.)
then the entire query bombs and nothing gets loaded. What I'd like to do
is add a field is_ok and then use sql or php (or whatever else, if
there's an easier way) to determine the field datatype (text, numeric,
bool, etc.) and then use some regex or something along those lines to
attempt to verify that the data is good, and then mark the is_ok field
(obviously a bool) as true, and use is_ok = TRUE in the insert/select
statement. Can somebody give me a push in the right direction?

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-09-2008, 11:08 PM
Webb Sprague
 
Posts: n/a
Default Re: find out data types using sql or php

> ... to determine the field datatype (text, numeric,
> bool, etc.)


I am not sure if this helps, but you can dig around in the system
files (pg_catalog.*), and probably write a query that gets the types
of every column in the data table you want to insert to.

Not a big push, but maybe it will get you started. I don't have links
at hand -- sorry.

> and then use some regex or something along those lines to
> attempt to verify that the data is good, and then mark the is_ok field
> (obviously a bool) as true, and use is_ok = TRUE in the insert/select
> statement. Can somebody give me a push in the right direction?


This sounds like a lot of programming, but that you have the right
idea. I am not sure how you would use the "is_ok" within SQL; I would
probably only try to insert data that is ok, and filter that in the
application.

I might also think in terms of wrapping everything in a transaction,
assuming it is all good, and then rolling back and catching your
exception and giving the user a decent error -- such as "reformat your
spreadsheet, doofus, and try uploading again"... Very doable with
psycopg2 and python.

-W
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-09-2008, 11:08 PM
Tom Hart
 
Posts: n/a
Default Re: find out data types using sql or php

Replying to yourself is so depressing...

Anyway, I managed to google myself into a solution, I just wanted to
share it with the list in case anybody else was interested.

Using the INFORMATION SCHEMA and a query like
SELECT column_name, data_type FROM information_schema.columns WHERE
table_name = 'table';

I get results similar to

column_name | data_type

-------------------+-----------

atm_acct_mess | text

atm_acct_num | numeric

atm_acct_tp1 | text

atm_acct_tp2 | text

atm_acct_tp3 | text

atm_acct_tp4 | text

atm_acct_tp5 | text

atm_acct_tp6 | text

atm_acct1_stcd | text

atm_acct2_stcd | text

atm_acct3_stcd | text

atm_acct4_stcd | text

atm_acct5_stcd | text

atm_acct6_stcd | text

atm_atm/ach_cd | integer

atm_atm/ach_id | numeric

atm_atm/ach_tp | integer

atm_cn_num | integer

atm_date_opened | date

atm_id1 | text

atm_id2 | text

atm_id3 | text

atm_id4 | text

atm_id5 | text

atm_id6 | text

atm_last_act_date | date

atm_next_rec | integer

atm_stat_cd | integer

atm_trn_acct_id | text

atm_trn_acct_num | numeric

atm_trn_acct_tp | text

atm_trn_cn_num | integer

atm_trn_date | date

atm_trn_reg_e | integer

atm_trn_term_id | text

atm_trn_trace | text

atm_trn_trn_num | integer

(37 rows)


Which I can then of course parse with php and do some testing from there.

I hope this helps somebody, I know I could have used this information
about 20 minutes ago :-)

Tom Hart wrote:
> Hey guys. This is probably a n00b question, but here goes anyway.
>
> I have a set of csv files that I COPY t o a number of import tables
> (same field layout as live tables, but with all datatypes 'text') then
> use an INSERT INTO ... SELECT FROM statement to transfer the rows over
> to the live table (at times filtering out, at times not).
> Unfortunately if any of the data is not type perfect (letters in
> number fields, etc.) then the entire query bombs and nothing gets
> loaded. What I'd like to do is add a field is_ok and then use sql or
> php (or whatever else, if there's an easier way) to determine the
> field datatype (text, numeric, bool, etc.) and then use some regex or
> something along those lines to attempt to verify that the data is
> good, and then mark the is_ok field (obviously a bool) as true, and
> use is_ok = TRUE in the insert/select statement. Can somebody give me
> a push in the right direction?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-09-2008, 11:08 PM
Michael Glaesemann
 
Posts: n/a
Default Re: find out data types using sql or php


On Oct 26, 2007, at 16:06 , Tom Hart wrote:

> What I'd like to do is add a field is_ok and then use sql or php
> (or whatever else, if there's an easier way) to determine the field
> datatype (text, numeric, bool, etc.) and then use some regex or
> something along those lines to attempt to verify that the data is
> good, and then mark the is_ok field (obviously a bool) as true, and
> use is_ok = TRUE in the insert/select statement.


This is one way to do it. Another option would be to process the csv
file beforehand using your favorite scripting language and COPY into
tables with the appropriate, expected datatypes. You know you haven't
processed the text file adequately if the COPY fails.

If you're going to do the cleaning inside the database (i.e., all
text fields), I'd stay inside the database using SQL and PL functions
rather than connecting to the database via some external script,
process the data, and stick it back in: that's just added overhead,
and if the process is automated enough to the point you can write a
script to handle it, pretty much anything you can do in an external
script you can do inside the database without the overhead of round-
tripping out and back.

However, if the cleaning is going to take significant user
interaction, you might consider using a simple web app that would
connect to the database so a user could view and clean the data. I
did this with some success for manually checking if names in an
imported file were properly split along given and family name lines.

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 12:36 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com