This is a discussion on Find min year and min value within the Pgsql General forums, part of the PostgreSQL category; --> Hi there, I am trying to find in a table with different variables, countries and years the lowest year ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi there, I am trying to find in a table with different variables, countries and years the lowest year and within that year the lowest value The following SELECT works, but I wonder if it is "elegant". Can you recommend any other solution? SELECT MIN(value) AS minv FROM public_one_table.data WHERE year = ( SELECT MIN(year) AS min_year FROM public_one_table.data WHERE id_variable = 1 ) AND id_variable = 1 Thanks for any help! Stef ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Stefan Schwarzer wrote: > Hi there, > > I am trying to find in a table with different variables, countries and > years the > > lowest year > > and within that year the > > lowest value > > > The following SELECT works, but I wonder if it is "elegant". Can you > recommend any other solution? SELECT value AS minv FROM public_on_table.data WHERE id_variable = 1 ORDER BY year, value LIMIT 1 -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| >> Hi there, >> I am trying to find in a table with different variables, countries >> and years the >> lowest year >> and within that year the >> lowest value >> The following SELECT works, but I wonder if it is "elegant". Can >> you recommend any other solution? > > SELECT value AS minv FROM public_on_table.data > WHERE id_variable = 1 > ORDER BY year, value LIMIT 1 But that brings only the min value, not the min year. I need to know both of them, something like (min year = ) 1972, (min value = ) 20 Stef ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Stefan Schwarzer wrote: >>> Hi there, >>> I am trying to find in a table with different variables, countries >>> and years the >>> lowest year >>> and within that year the >>> lowest value >>> The following SELECT works, but I wonder if it is "elegant". Can you >>> recommend any other solution? >> >> SELECT value AS minv FROM public_on_table.data >> WHERE id_variable = 1 >> ORDER BY year, value LIMIT 1 > > But that brings only the min value, not the min year. I need to know > both of them, something like > > (min year = ) 1972, (min value = ) 20 SELECT year, value FROM ... -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| > SELECT year, value FROM ... I feel ashamed.... such a simple solution... gush.... Thanks for that! Unfortunately it doesn't stop there... If I want to find the "common smallest year" for two given variables (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and 1980, 1981,... 2003) for variable 2 (Fish Catch) ). It should come up with 1980 for a given country, if there is a value for that year in both variables. Otherwise 1981, etc... How would I do that? I really have no clue... (my table looks something like this: id_variable | year | value | id_country --------------------------------------- 1 | 2001 | 123 | 1 1 | 2002 | 125 | 1 1 | 2003 | 128 | 1 1 | 2004 | 132 | 1 1 | 2005 | 135 | 1 1 | 2001 | 412 | 2 1 | 2002 | 429 | 2 1 | 2003 | 456 | 2 1 | 2004 | 465 | 2 1 | 2005 | 477 | 2 ..... 2 | 1980 | 83 | 1 2 | 1981 | 89 | 1 ..... ) Thanks for any hints, Stef ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Stefan Schwarzer wrote: >> SELECT year, value FROM ... > > I feel ashamed.... such a simple solution... gush.... Thanks for that! Can be easy to over-complicate things when you've been thinking about them too long. > Unfortunately it doesn't stop there... > > If I want to find the "common smallest year" for two given variables > (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and 1980, > 1981,... 2003) for variable 2 (Fish Catch) ). It should come up with > 1980 for a given country, if there is a value for that year in both > variables. Otherwise 1981, etc... In that case you will need two subqueries, but it's just a matter of converting your description to SQL. SELECT yr1, gdp.val1 AS gdp_val, fish_catch.val2 AS fish_catch_val FROM (SELECT year AS yr1, value AS val1 FROM data WHERE id_variable = 1 ) AS gdp, (SELECT year AS yr2, value AS val2 FROM data WHERE id_variable = 2 ) AS fish_catch WHERE gdp.yr1 = fish_catch.yr2 ORDER BY gdp.yr1 LIMIT 1; Here I've aliases (renamed) the columns and the sub-queries, but I'd probably just alias the sub-queries in real-life. You could write it as a JOIN if you prefer that style, or use the MIN() aggregate (although I'd guess that the ORDER BY/LIMIT might prove faster). So, I'd perhaps use: SELECT gdp.year, gdp.val AS gdp_val, fish_catch.val AS fish_catch_val FROM (SELECT year,value FROM data WHERE id_variable=1) AS gdp JOIN (SELECT year, value FROM data WHERE id_variable=2) AS fish_catch USING (year) ORDER BY gdp.year LIMIT 1; -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On Oct 2, 2007, at 9:29 , Stefan Schwarzer wrote: > How would I do that? I really have no clue... The key is to build it up in steps. select id_country, year, var_1, val_1, var_2, val_2 -- Second step: -- value for year for each country of var_1 from (select id_country, year, id_variable as var_1, "value" as val_1 from my_table) as val_1 -- value for year for each country for var_2 natural join (select id_country, year, id_variable as var_2, "value" as val_2 from my_table) as val_2 -- First step -- for each country, find the minimum common year (which the join will do) for the two -- variables you're interested in (var_1 and var_2). natural join (select id_country, var_1, var_2, min(year) as year from (select id_country, year, id_variable as var_1 from my_table) as var_1 natural join (select id_country, year, id_variable as var_2 from my_table) as var_2 group by id_country, var_1, var_2) as min_common_year where id_country = :id_country and var_1 = :var_1 and var_2 = :var_2; Check your explain analyze output: if the planner doesn't push up the :var_1, :var_2, and :id_country_id values up into subqueries, you might want to add them as where clauses. As an aside, I assume you've rewritten the table column names: if you haven't, as it's an SQL keyword, "value" is a particularly poor choice of column name. I'd probably rename "year" as well. Hope this helps. Michael Glaesemann grzm seespotcode net ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/ |
| |||
| Stefan Schwarzer wrote: >> SELECT year, value FROM ... > > I feel ashamed.... such a simple solution... gush.... Thanks for that! > > Unfortunately it doesn't stop there... > > If I want to find the "common smallest year" for two given variables > (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and 1980, > 1981,... 2003) for variable 2 (Fish Catch) ). It should come up with > 1980 for a given country, if there is a value for that year in both > variables. Otherwise 1981, etc... > > How would I do that? I really have no clue... > > (my table looks something like this: > > id_variable | year | value | id_country > --------------------------------------- > 1 | 2001 | 123 | 1 > 1 | 2002 | 125 | 1 > .... > > 2 | 1980 | 83 | 1 > 2 | 1981 | 89 | 1 > .... > > ) > > Thanks for any hints, As others have noted, the query *can* be written. But it appears to me that you are struggling against your table layout. Before struggling with ever more complicated queries, I'd consider restructuring your table(s). There are many possibilities depending on the current nature of your data, how you expect it to change and the queries you expect to run against it. For example: country_id data_year gdp fish_catch Then your query may be as simple as, say: select min(year) from your_table where country_id = xxx and gdp is not null and fish_catch is not null; or select year, gdp, fish_catch from your_table where country_id = xxx and gdp is not null and fish_catch is not null order by year desc, gdp desc, fish_catch desc limit 1; Alternately, you could have a gdp table and a fish_catch table which would be easily joined to give the same result. Cheers, Steve ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/ |
| |||
| On Tue, Oct 02, 2007 at 04:29:02PM +0200, Stefan Schwarzer wrote: > If I want to find the "common smallest year" for two given variables > (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and > 1980, 1981,... 2003) for variable 2 (Fish Catch) ). It should come up > with 1980 for a given country, if there is a value for that year in > both variables. Otherwise 1981, etc... > How would I do that? I really have no clue... > (my table looks something like this: > id_variable | year | value | id_country > --------------------------------------- > 1 | 2001 | 123 | 1 select min(year) from (select year from table where id_variable in (1,2) group by year having count(distinct id_variable) = 2) x; depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" http://www.depesz.com/ - blog dla ciebie (i moje CV) ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| On Oct 2, 2007, at 11:10 , Steve Crawford wrote: > As others have noted, the query *can* be written. But it appears to me > that you are struggling against your table layout. The current schema he has is commonly called EAV (entity-attribute- value) and is generally frowned upon. Now, in his particular case it may be justified if the "value" column values are actually all of the same type, such as currency amounts for each category. If this is the case, I suggest renaming the column to be more descriptive of what is actually stored: likewise the id_variable column. > Before struggling > with ever more complicated queries, I'd consider restructuring your > table(s). There are many possibilities depending on the current nature > of your data, how you expect it to change and the queries you > expect to > run against it. For example: > country_id > data_year > gdp > fish_catch This would be one way to do it. However, each time you add a new category you'd need to add a new column to the table: not very flexible. You can also have the same functionality by adding a new table for each category: > Alternately, you could have a gdp table and a fish_catch table which > would be easily joined to give the same result. Expanding on this: create table fish_catches (country text not null, data_year date not null, primary key (country, data_year), fish_catch numeric not null); create table gdp (country text not null reference countries data_year date not null, primary key (country, data_year), gdp numeric not null); This makes your queries quite simple: select country, data_year, fish_catch, gdp from fish_catches natural join gdp where country = :country order by data_year limit 1; or select country, data_year, fish_catch, gdp from fish_catches natural join gdp natural join (select country, min(data_year) as data_year from gdp natural join fish_catch group by country) min_data_year where country = :country; Splitting categories into separate tables also eliminates the necessity of worrying about NULL, which can lead to unexpected behavior if you aren't careful. Michael Glaesemann grzm seespotcode net ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| Thread Tools | |
| Display Modes | |
|
|