This is a discussion on aggregate query within the pgsql Sql forums, part of the PostgreSQL category; --> I have a table CREATE TABLE survey_load ( meter_id character(5) NOT NULL, number_of_bays integer NOT NULL, bay_1_use integer, bay_2_use ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have a table CREATE TABLE survey_load ( meter_id character(5) NOT NULL, number_of_bays integer NOT NULL, bay_1_use integer, bay_2_use integer, bay_3_use integer, bay_4_use integer, bay_5_use integer, date date NOT NULL, inspection_id integer NOT NULL DEFAULT, ) How do i present an aggregate query inspection_id | meter_id | bay_use 1 12345 (value of bay_1_use) 1 12345 (value of bay_2_use) 1 12345 (value of bay_3_use) 2 23456 (value of bay_1_use) 2 23456 (value of bay_2_use) 2 23456 (value of bay_3_use) 2 23456 (value of bay_4_use) 2 23456 (value of bay_5_use) Thanks ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Raj A wrote: > I have a table > > CREATE TABLE survey_load > ( > meter_id character(5) NOT NULL, > number_of_bays integer NOT NULL, > bay_1_use integer, > bay_2_use integer, > bay_3_use integer, > bay_4_use integer, > bay_5_use integer, > date date NOT NULL, > inspection_id integer NOT NULL DEFAULT, > ) > > How do i present an aggregate query > > inspection_id | meter_id | bay_use > 1 12345 (value of bay_1_use) > 1 12345 (value of bay_2_use) > 1 12345 (value of bay_3_use) > 2 23456 (value of bay_1_use) > 2 23456 (value of bay_2_use) > 2 23456 (value of bay_3_use) > 2 23456 (value of bay_4_use) > 2 23456 (value of bay_5_use) Firstly, the table definition you provided has an invalid specification for the inspection_id column. For my examples, I simply used a "NOT NULL" constraint, and did not implement any DEFAULT clause. If I understand your issue correctly, it seems like the denormalized nature of your table is causing you some problems. With the table definition and output you provided, I am assuming the following contents for the survey_load table: meter_id | number_of_bays | bay_1_use | bay_2_use | bay_3_use | bay_4_use | bay_5_use | date | inspection_id ----------+----------------+-----------+-----------+-----------+-----------+-----------+------------+--------------- 12345 | 3 | 11 | 12 | 13 | | | 2007-05-29 | 1 23456 | 5 | 21 | 22 | 23 | 24 | 25 | 2007-05-29 | 2 (2 rows) The value of the number_of_bays column is driving the number of results in your summary query, as opposed to allowing the data relations to drive that on their own with a normalized representation. You could normalize your data representation into 2 tables: tmp1=> \d+ new_survey_load Table "new_survey_load" Column | Type | Modifiers | Description ---------------+--------------+-----------+------------- survey_id | integer | not null | meter_id | character(5) | not null | date | date | not null | inspection_id | integer | not null | Indexes: "new_survey_load_pkey" PRIMARY KEY, btree (survey_id) Has OIDs: no tmp1=> \d+ new_bay_use Table "new_bay_use" Column | Type | Modifiers | Description ------------+---------+-----------+------------- survey_id | integer | not null | bay_number | integer | not null | bay_use | integer | not null | Indexes: "ind_new_bay_use__survey_id" btree (survey_id) Foreign-key constraints: "new_bay_use_survey_id_fkey" FOREIGN KEY (survey_id) REFERENCES new_survey_load(survey_id) Has OIDs: no This normalization also makes it easier in the future to track additional bays without having to alter the table structure. Following your example from above, those tables would contain the following values: tmp1=> select * from new_survey_load ; survey_id | meter_id | date | inspection_id -----------+----------+------------+--------------- 13 | 12345 | 2007-05-29 | 1 14 | 23456 | 2007-05-29 | 2 (2 rows) tmp1=> select * from new_bay_use ; survey_id | bay_number | bay_use -----------+------------+--------- 13 | 1 | 11 13 | 2 | 12 13 | 3 | 13 14 | 1 | 21 14 | 2 | 22 14 | 3 | 23 14 | 4 | 24 14 | 5 | 25 (8 rows) The query to get the summary results you are looking for would then be: select l.inspection_id, l.meter_id, u.bay_use from new_survey_load l join new_bay_use u on l.survey_id = u.survey_id order by l.inspection_id, l.meter_id, u.bay_number; You could also use a view to join the 2 new tables together to match the format of your original table definition: select l.meter_id, count(u.bay_number) as number_of_bays, u1.bay_use as bay_1_use, u2.bay_use as bay_2_use, u3.bay_use as bay_3_use, u4.bay_use as bay_4_use, u5.bay_use as bay_5_use, l.date, l.inspection_id from new_survey_load l join new_bay_use u on l.survey_id = u.survey_id left join new_bay_use u1 on l.survey_id = u1.survey_id and u1.bay_number = 1 left join new_bay_use u2 on l.survey_id = u2.survey_id and u2.bay_number = 2 left join new_bay_use u3 on l.survey_id = u3.survey_id and u3.bay_number = 3 left join new_bay_use u4 on l.survey_id = u4.survey_id and u4.bay_number = 4 left join new_bay_use u5 on l.survey_id = u5.survey_id and u5.bay_number = 5 group by 1,3,4,5,6,7,8,9; If you add more bays for tracking, you can simply add additional joins against new_bay_use to mimic what you would have had in your original table structure. Hope this helps. Andrew ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| "Andrew Kroeger" <andrew@sprocks.gotdns.com> writes: > Raj A wrote: >> I have a table >> >> CREATE TABLE survey_load >> ( >> meter_id character(5) NOT NULL, >> number_of_bays integer NOT NULL, >> bay_1_use integer, >> bay_2_use integer, >> bay_3_use integer, >> bay_4_use integer, >> bay_5_use integer, >> date date NOT NULL, >> inspection_id integer NOT NULL DEFAULT, >> ) >> >> How do i present an aggregate query >> >> inspection_id | meter_id | bay_use >> 1 12345 (value of bay_1_use) >> 1 12345 (value of bay_2_use) >> 1 12345 (value of bay_3_use) >> 2 23456 (value of bay_1_use) >> 2 23456 (value of bay_2_use) >> 2 23456 (value of bay_3_use) >> 2 23456 (value of bay_4_use) >> 2 23456 (value of bay_5_use) > > If I understand your issue correctly, it seems like the denormalized > nature of your table is causing you some problems. True. Normalizing the tables would make this query easier which is a good sign that that's probably the right direction. If for some reason you can't or won't change the table definition there are a number of possible tricky answers given the current definition. Something like this for example: SELECT inspection_id, meter_id, case when bay=1 then bay_1_use when bay=2 then bay_2_use when bay=3 then bay_3_use when bay=4 then bay_4_use when bay=5 then bay_5_use else null end AS bay_use FROM ( SELECT *, generate_series(1,number_of_bays) AS bay FROM survey_load ) as x -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| Thank you guys! I'm currently migrating an Oracle database to postgres and have created tables using the scripts that were readily available. Glad I can now improve this old system. On 29/05/07, Gregory Stark <stark@enterprisedb.com> wrote: > "Andrew Kroeger" <andrew@sprocks.gotdns.com> writes: > > > Raj A wrote: > >> I have a table > >> > >> CREATE TABLE survey_load > >> ( > >> meter_id character(5) NOT NULL, > >> number_of_bays integer NOT NULL, > >> bay_1_use integer, > >> bay_2_use integer, > >> bay_3_use integer, > >> bay_4_use integer, > >> bay_5_use integer, > >> date date NOT NULL, > >> inspection_id integer NOT NULL DEFAULT, > >> ) > >> > >> How do i present an aggregate query > >> > >> inspection_id | meter_id | bay_use > >> 1 12345 (value of bay_1_use) > >> 1 12345 (value of bay_2_use) > >> 1 12345 (value of bay_3_use) > >> 2 23456 (value of bay_1_use) > >> 2 23456 (value of bay_2_use) > >> 2 23456 (value of bay_3_use) > >> 2 23456 (value of bay_4_use) > >> 2 23456 (value of bay_5_use) > > > > > > If I understand your issue correctly, it seems like the denormalized > > nature of your table is causing you some problems. > > True. Normalizing the tables would make this query easier which is a good sign > that that's probably the right direction. > > If for some reason you can't or won't change the table definition there are a > number of possible tricky answers given the current definition. Something like > this for example: > > SELECT inspection_id, meter_id, > case when bay=1 then bay_1_use > when bay=2 then bay_2_use > when bay=3 then bay_3_use > when bay=4 then bay_4_use > when bay=5 then bay_5_use > else null > end AS bay_use > FROM ( > SELECT *, generate_series(1,number_of_bays) AS bay > FROM survey_load > ) as x > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > > ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |