This is a discussion on BUG #2334: WHERE IN (SUBSELECT) fails when column is null within the pgsql Bugs forums, part of the PostgreSQL category; --> The following bug has been logged online: Bug reference: 2334 Logged by: Patrick Narkinsky Email address: patrick@narkinsky.com PostgreSQL version: ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| The following bug has been logged online: Bug reference: 2334 Logged by: Patrick Narkinsky Email address: patrick@narkinsky.com PostgreSQL version: 8.1.3 Operating system: Mac OS X Description: WHERE IN (SUBSELECT) fails when column is null Details: This may be expected behavior, but it certainly doesn't seem right to me, and it works as expected in sqlite. The database is as follows: BEGIN TRANSACTION; create table a ( id integer, text varchar(20) ); INSERT INTO a VALUES(0,'test'); INSERT INTO a VALUES(1,'test2'); create table b ( id integer, a_id integer); INSERT INTO b VALUES(0,NULL); INSERT INTO b VALUES(1,NULL); INSERT INTO b VALUES(2,NULL); COMMIT; The following query returns everything in a in sqlite, but returns nothing in postgresql: select * from a where a.id not in (select a_id from b); On postgresql, it works as expected when a_id has a non-null value. I'm not expert enough on SQL to say which is wrong, but it appears to me that the SQLite behavior makes a lot more sense. ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Fri, 17 Mar 2006, Patrick Narkinsky wrote: > This may be expected behavior, but it certainly doesn't seem right to me, > and it works as expected in sqlite. > > The database is as follows: > > BEGIN TRANSACTION; > create table a ( > id integer, > text varchar(20) > ); > INSERT INTO a VALUES(0,'test'); > INSERT INTO a VALUES(1,'test2'); > create table b ( > id integer, > a_id integer); > INSERT INTO b VALUES(0,NULL); > INSERT INTO b VALUES(1,NULL); > INSERT INTO b VALUES(2,NULL); > COMMIT; > > The following query returns everything in a in sqlite, but returns nothing > in postgresql: > > select * from a where a.id not in (select a_id from b); AFAICS, our behavior follows SQL. a NOT IN b is NOT(a IN b) IN is defined in terms of = ANY. a =ANY (b) is basically (by my reading of 8.8 anyway): True if a = bi for some bi in b False if b is empty or a <> bi for all bi in b Unknown otherwise Since a <> NULL returns unknown, the second one won't come up, so the whole expression won't ever be true after the negation. It might be false or it might be unknown. ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Stephan Szabo schrieb: > AFAICS, our behavior follows SQL. > > a NOT IN b is NOT(a IN b) > IN is defined in terms of = ANY. > a =ANY (b) is basically (by my reading of 8.8 anyway): > True if a = bi for some bi in b > False if b is empty or a <> bi for all bi in b > Unknown otherwise > Since a <> NULL returns unknown, the second one won't come up, so the > whole expression won't ever be true after the negation. It might be false > or it might be unknown. > Not having read 8.8, I encountered this today and found it odd as well. It would mean that the old popular optimization, back when "A IN B" was much slower, was not correct: select * from foo where a not in (select b from bar) used to be written as: select * from foo where not exists (select 1 from bar where a=b) These queries have different results now when b is NULL for some rows. It doesn't look right to me (but if the Standard requires it, what can we do...). Regards, Marinos ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| On Wed, 22 Mar 2006, Marinos Yannikos wrote: > Stephan Szabo schrieb: > > AFAICS, our behavior follows SQL. > > > > a NOT IN b is NOT(a IN b) > > IN is defined in terms of = ANY. > > a =ANY (b) is basically (by my reading of 8.8 anyway): > > True if a = bi for some bi in b > > False if b is empty or a <> bi for all bi in b > > Unknown otherwise > > Since a <> NULL returns unknown, the second one won't come up, so the > > whole expression won't ever be true after the negation. It might be false > > or it might be unknown. > > > > Not having read 8.8, I encountered this today and found it odd as well. > It would mean that the old popular optimization, back when "A IN B" was > much slower, was not correct: > > select * from foo where a not in (select b from bar) > > used to be written as: > > select * from foo where not exists (select 1 from bar where a=b) Yep, in->exists I believe is the same, but not in->not exists is different. Exists and subqueries should probably have been done differently in SQL, but alas. > These queries have different results now when b is NULL for some rows. > It doesn't look right to me (but if the Standard requires it, what can > we do...). It actually makes some sense if you think about null as an unknown value. If you ask is 1 in the set (1, 2, unknown), you can definately say yes. If you ask is 3 in the set (1, 2, unknown), you can't be sure, because that unknown might be 3. For any x that's of the correct type for the set, you'll never be able to say no due to that unknown. If you ask is 1 not in the set (1, 2, unknown) you can definately say no. If you ask is 3 not in the set (1, 2, unknown) you again can't be sure. For any x that's of the correct type for the set, you'll never be able to say yes due to that unknown. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |