The following bug has been logged online:

Bug reference:      1528
Logged by:          Peter Wright
Email address:      pete@flooble.net
PostgreSQL version: 7.4.7, 8.0.1
Operating system:   Debian Linux (unstable)
Description:        Rows returned that should be excluded by WHERE clause
Details:

Hopefully this example SQL will paste correctly - I think this
demonstrates the problem much better than I could explain in words.
The bug is shown in the two SELECT queries with a WHERE clause.
Very bizarre.

The same bug crops up on 7.4.6, 7.4.7 and 8.0.1.

pete@serf [07/Mar 6:28:50] pts/10 !19 ~
$ createdb test1
CREATE DATABASE
pete@serf [07/Mar 6:28:59] pts/10 !20 ~
$ psql test1
Welcome to psql 7.4.7, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

test1=# create table t1 ( a smallint primary key, b smallint ) ;
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE
test1=# create table t2 ( a smallint primary key, b smallint ) ;
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
CREATE TABLE
test1=# insert into t1 values (1, 1);
INSERT 118413888 1
test1=# insert into t1 values (2, 2);
INSERT 118413889 1
test1=# insert into t2 values (1, 4);
INSERT 118413890 1
test1=# insert into t2 values (2, 8);
INSERT 118413891 1
test1=# select id, min(b) from ( select 1 as id, max(b) as b from t1 union select 2 as id, max(b) from t2 ) as q1 group by id ;
 id | min
----+-----
  1 |   2
  2 |   8
(2 rows)

test1=# create view qry1 as select id, min(b) from ( select 1 as id, max(b) as b from t1 union select 2 as id, max(b) from t2 ) as q1 group by id ;
CREATE VIEW
test1=# select * from qry1 where id = 1;
 id | min
----+-----
  1 |   2
  2 |
(2 rows)

test1=# select * from qry1 where id = 2;
 id | min
----+-----
  1 |
  2 |   8
(2 rows)

test1=# select * from qry1;
 id | min
----+-----
  1 |   2
  2 |   8
(2 rows)

test1=#

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
| |||
"Peter Wright" <pete@flooble.net> writes:
> Description: Rows returned that should be excluded by WHERE clause

Interesting point. The view and union don't seem to be the issue;
I think the problem can be expressed as

regression=# select 2 as id, max(b) from t2 having 2 = 1;
 id | max
----+-----
  2 |
(1 row)

Now, if this were a WHERE clause, I think the answer would be right:

regression=# select 2 as id, max(b) from t2 where 2 = 1;
 id | max
----+-----
  2 |
(1 row)

but since it's HAVING I think this is probably wrong. Looking at the
EXPLAIN output

regression=# explain select 2 as id, max(b) from t2 having 2 = 1;
                           QUERY PLAN
----------------------------------------------------------------
 Aggregate  (cost=3.68..3.68 rows=1 width=2)
   ->  Result  (cost=0.00..3.14 rows=214 width=2)
         One-Time Filter: false
         ->  Seq Scan on t2  (cost=0.00..3.14 rows=214 width=2)
(4 rows)

the issue is clearly that the known-false HAVING clause is pushed down
inside the aggregation, as though it were WHERE. The existing code
pushes down HAVING to WHERE if the clause contains no aggregates, but
evidently this is too simplistic. What are the correct conditions for
pushing down HAVING clauses to WHERE?

regards, tom lane
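Editorial illustration: the difference between the two plan shapes discussed in the message above can be modelled in a few lines of Python. This is only a sketch of the semantics, not PostgreSQL code; the function names (`constant_as_where`, `constant_as_having`) and the in-memory table are hypothetical, and the rows mirror `t2` from the bug report.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, int]  # (a, b), mirroring table t2 in the bug report


def max_b(rows: List[Row]) -> Optional[int]:
    """Model of SQL max(b): yields NULL (None) when there are no input rows."""
    return max((b for _, b in rows), default=None)


def constant_as_where(rows: List[Row], condition: bool) -> List[Tuple[int, Optional[int]]]:
    """Pushed-down plan: the constant condition filters rows BEFORE aggregation."""
    kept = [r for r in rows if condition]
    # An aggregate without GROUP BY still emits one row, even over no input.
    return [(2, max_b(kept))]


def constant_as_having(rows: List[Row], condition: bool) -> List[Tuple[int, Optional[int]]]:
    """HAVING semantics: aggregate first, then keep or drop the single group."""
    result = [(2, max_b(rows))]
    return result if condition else []


t2 = [(1, 4), (2, 8)]
two_equals_one = (2 == 1)  # the known-false condition from the example

print(constant_as_where(t2, two_equals_one))   # [(2, None)]
print(constant_as_having(t2, two_equals_one))  # []
```

The first call reproduces the one-row result with a NULL aggregate seen in the report; the second shows the empty result that evaluating the condition after aggregation would give.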
| |||
I wrote in reference to bug#1528:
> What the spec actually says, or at least implies, is that a HAVING
> clause is to be evaluated only once per group --- where the "group"
> is the whole table if there's no GROUP BY clause.

In fact, reading the spec more closely, it is clear that the presence
of HAVING turns the query into a grouped query even if there is no
GROUP BY. I quote SQL92 7.8 again:

    7.8 <having clause>

    Function

    Specify a grouped table derived by the elimination of groups from
            ^^^^^^^^^^^^^^^^^^^^^^^
    the result of the previously specified clause that do not meet the
    <search condition>.

    ...

    1) Let T be the result of the preceding <from clause>, <where
       clause>, or <group by clause>. If that clause is not a <group
       by clause>, then T consists of a single group and does not have
       a grouping column.

    2) The <search condition> is applied to each group of T. The result
       of the <having clause> is a grouped table of those groups of T
                              ^^^^^^^^^^^^^^^^^^
       for which the result of the <search condition> is true.

This is quite clear that the output of a HAVING clause is a "grouped
table" no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist
can use any ungrouped columns except within aggregate calls; that is,

    select col from tab having 2>1

is in fact illegal per SQL spec, because col isn't a grouping column
(there are no grouping columns in this query). What we are currently
doing with this construct is pretending that it means

    select col from tab where 2>1

but it does not mean that according to the spec. As I look into this,
I find that several warty special cases in the parser and planner arise
from our misunderstanding of this point, and could be eliminated if we
enforced the spec's interpretation. In particular this whole business
of "moving HAVING into WHERE" is wrong and should go away.

Comments? Can anyone confirm whether DB2 or other databases allow
ungrouped column references with HAVING?

regards, tom lane
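Editorial illustration: the two general rules quoted from SQL92 7.8 can be read as a small algorithm. The sketch below is a hypothetical Python model of that reading, not code from any database; `form_groups` and `apply_having` are invented names, and rows are plain dictionaries.

```python
from itertools import groupby
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]
Group = List[Row]


def form_groups(rows: List[Row], group_by: Optional[str]) -> List[Group]:
    """Rule 1: without GROUP BY, T consists of a single group (possibly empty)."""
    if group_by is None:
        return [list(rows)]
    # With GROUP BY there is one group per distinct key, so an empty table
    # produces no groups at all.
    ordered = sorted(rows, key=lambda r: r[group_by])
    return [list(g) for _, g in groupby(ordered, key=lambda r: r[group_by])]


def apply_having(groups: List[Group], condition: Callable[[Group], bool]) -> List[Group]:
    """Rule 2: the search condition is applied once to each group of T."""
    return [g for g in groups if condition(g)]


tab = [{"col": 1}, {"col": 2}]

# No GROUP BY: exactly one group goes in, so at most one group comes out.
print(len(apply_having(form_groups(tab, None), lambda g: 2 > 1)))   # 1
print(len(apply_having(form_groups([], None), lambda g: 2 > 1)))    # 1
print(len(apply_having(form_groups(tab, None), lambda g: 2 < 1)))   # 0

# GROUP BY col: one group per distinct value, none for an empty table.
print(len(apply_having(form_groups(tab, "col"), lambda g: True)))   # 2
print(len(apply_having(form_groups([], "col"), lambda g: True)))    # 0
```

In this model the condition only ever sees a whole group, which is one way to see why a bare column reference such as `col` has no single value to offer unless it is a grouping column or wrapped in an aggregate.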
| |||
On Wed, 2005-03-09 at 21:21 -0500, Tom Lane wrote:
> Comments? Can anyone confirm whether DB2 or other databases allow
> ungrouped column references with HAVING?

In Sybase:

1> select 2 as id, max(myfield) from mytable where 2=1
2> go
 id
 ----------- ----------
           2       NULL

(1 row affected)
1> select 2 as id, max(myfield) from mytable having 2=1
2> go
 id
 ----------- ----------

(0 rows affected)

-- 
Mark Shewmaker
mark@primefactor.com
| |||
I wrote:
> This is quite clear that the output of a HAVING clause is a "grouped
> table" no matter whether the query uses GROUP BY or aggregates or not.

> What that means is that neither the HAVING clause nor the targetlist
> can use any ungrouped columns except within aggregate calls; that is,
>     select col from tab having 2>1
> is in fact illegal per SQL spec, because col isn't a grouping column
> (there are no grouping columns in this query).

Actually, it's even more than that: a query with HAVING and no GROUP BY
should always return 1 row (if the HAVING succeeds) or 0 rows (if not).
If there are no aggregates, the entire from/where clause can be thrown
away, because it can have no impact on the result!

Would those of you with access to other DBMSes try this:

    create table tab (col integer);
    select 1 from tab having 1=0;
    select 1 from tab having 1=1;
    insert into tab values(1);
    insert into tab values(2);
    select 1 from tab having 1=0;
    select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
from the 4 selects --- that is, the contents of tab make no difference
at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely
copying our mistake...)

regards, tom lane
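Editorial illustration: the four-select test above can be simulated to compare the two readings of HAVING. The Python sketch below is a hypothetical model (the function names are invented, and a list of integers stands in for `tab`); it only restates the row counts claimed in the message and does not run against any database.

```python
from typing import Callable, List


def spec_select_1_having(tab: List[int], condition: bool) -> List[int]:
    """`select 1 from tab having <constant>` read per the SQL spec.

    The table forms exactly one group, so the result has one row if the
    condition holds and none otherwise; tab's contents are never consulted.
    """
    return [1] if condition else []


def having_as_where(tab: List[int], condition: bool) -> List[int]:
    """The older reading: HAVING treated as WHERE, one output row per table row."""
    return [1 for _ in tab if condition]


def run_test(select: Callable[[List[int], bool], List[int]]) -> List[int]:
    """Replay the script: two selects on an empty table, two after the inserts."""
    tab: List[int] = []
    counts = [len(select(tab, 1 == 0)), len(select(tab, 1 == 1))]
    tab += [1, 2]  # insert into tab values(1); insert into tab values(2);
    counts += [len(select(tab, 1 == 0)), len(select(tab, 1 == 1))]
    return counts


print(run_test(spec_select_1_having))  # [0, 1, 0, 1]
print(run_test(having_as_where))       # [0, 0, 0, 2]
```

The second line of output matches the 0, 0, 0, and 2 rows reported for MySQL in the message, which is what treating HAVING as WHERE would be expected to produce.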
| |||
Tom Lane wrote:
>
> Would those of you with access to other DBMSes try this:
>
> create table tab (col integer);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
> insert into tab values(1);
> insert into tab values(2);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
>
> I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows

MS SQL Server 2000 returns 0, 1, 0 and 1 rows correctly.

Cheers,
Gary.
| |||
Tom Lane wrote:
>
> Would those of you with access to other DBMSes try this:
>
> create table tab (col integer);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
> insert into tab values(1);
> insert into tab values(2);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
>
> I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows

Not that this means much, but I'll mention it for the sake of
completeness: SQLite 3.0.8 disallows all of the above SELECT statements:

sqlite> create table tab (col integer);
sqlite> select 1 from tab having 1=0;
SQL error: a GROUP BY clause is required before HAVING
sqlite> select 1 from tab having 1=1;
SQL error: a GROUP BY clause is required before HAVING
sqlite> insert into tab values(1);
sqlite> insert into tab values(2);
sqlite> select 1 from tab having 1=0;
SQL error: a GROUP BY clause is required before HAVING
sqlite> select 1 from tab having 1=1;
SQL error: a GROUP BY clause is required before HAVING

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
| |||
On Thu, 10 Mar 2005 12:44:50 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Would those of you with access to other DBMSes try this:
>
On informix 9.21.UC4

> create table tab (col integer);
> select 1 from tab having 1=0;
>
returns no rows

> select 1 from tab having 1=1;
>
returns no rows

> insert into tab values(1);
> insert into tab values(2);
> select 1 from tab having 1=0;
>
returns no rows

> select 1 from tab having 1=1;
>
returns 2 rows

regards,
Jaime Casanova
| |||
On Thu, Mar 10, 2005 at 12:44:50PM -0500, Tom Lane wrote:
>
> Would those of you with access to other DBMSes try this:
>
> create table tab (col integer);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
> insert into tab values(1);
> insert into tab values(2);
> select 1 from tab having 1=0;
> select 1 from tab having 1=1;
>
> I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows
> from the 4 selects --- that is, the contents of tab make no difference
> at all.

Sybase ASE version 12.5.2 returns 0, 0, 0, and 1 rows.

A plain "select 1 from tab" returns zero rows when tab is empty.

-- 
Mark Shewmaker
mark@primefactor.com
| ||||
"Peter Wright" <pete@flooble.net> writes:
> I think this demonstrates the problem much better than I could explain
> in words. The bug is shown in the two SELECT queries with a WHERE
> clause. Very bizarre.

I've applied a patch that corrects this problem in CVS HEAD, but since
it changes the behavior of HAVING in a nontrivial way, I'm inclined to
think that we should not backpatch it into existing release branches.

regards, tom lane