Unix Technical Forum

Re: Index ignored with "is not distinct from", 8.2 beta2

This is a discussion on Re: Index ignored with "is not distinct from", 8.2 beta2 within the pgsql Hackers forums, part of the PostgreSQL category; --> One issue is that I'm not sure think you've got your sugar quite right. Have you tested with: (col ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 05:33 AM
Jim C. Nasby
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

One issue is that I'm not sure think you've got your sugar quite right.
Have you tested with:

(col IS NOT NULL AND 123 IS NOT NULL AND col = 123) OR
(col IS NULL and 123 IS NULL) ?

It's possible that the planner doesn't know about using an index for
DISTINCT; or it might just want an index that's defined WHERE col IS NOT
NULL.

On Wed, Nov 01, 2006 at 03:29:33PM -0500, JEAN-PIERRE PELLETIER wrote:
> I've reposted this from pgsql-performance where I got no response.
>
> ==========================================
>
> Hi,
>
> I wanted to use "exp1 is not distinct from exp2" which I tough was syntaxic
> sugar for
> exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null and
> exp2 is null
> but my index is ignored with "is not distinct from".
>
> Is this the expected behavior ?
>
> create temporary table t as select * from generate_series(1,1000000) t(col);
> create unique index i on t(col);
> analyze t;
>
> -- These queries don't use the index
> select count(*) from t where col is not distinct from 123;
> select count(*) from t where not col is distinct from 123;
>
> -- This query use the index
> select count(*) from t where col is not null and 123 is not null and col =
> 123 or col is null and 123 is null;
>
> explain analyze select count(*) from t where col is not distinct from 123;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual
> time=228.200..228.202 rows=1 loops=1)
> -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual
> time=0.042..228.133 rows=1 loops=1)
> Filter: (NOT (col IS DISTINCT FROM 123))
> Total runtime: 228.290 ms
> (4 rows)
> Time: 219.000 ms
>
> explain analyze select count(*) from t where not col is distinct from 123;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual
> time=235.950..235.952 rows=1 loops=1)
> -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual
> time=0.040..235.909 rows=1 loops=1)
> Filter: (NOT (col IS DISTINCT FROM 123))
> Total runtime: 236.065 ms
> (4 rows)
> Time: 250.000 ms
>
> explain analyze select count(*) from t where col is not null and 123 is not
> null and col = 123 or col is null and 123 is null;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------
> Aggregate (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268
> rows=1 loops=1)
> -> Index Scan using i on t (cost=0.00..8.13 rows=1 width=0) (actual
> time=0.237..0.241 rows=1 loops=1)
> Index Cond: (col = 123)
> Total runtime: 0.366 ms
> (4 rows)
> Time: 0.000 ms
>
> I am on Windows XP Service pack 2 with PostgreSQL 8.2 beta2
>
> Thanks,
> Jean-Pierre Pelletier
> e-djuster
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>


--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 05:33 AM
JEAN-PIERRE PELLETIER
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

This shows all three forms to be equivalent.

SELECT
exp1,
exp2,
exp1 IS NOT DISTINCT FROM exp2 AS isnotdistinct,
exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null and
exp2 is null AS JP,
(exp1 is not null and exp2 is not null and exp1 = exp2) or (exp1 is null
and exp2 is null) AS Jim
FROM
(SELECT 1 AS exp1, 1 AS exp2
UNION ALL SELECT 1, 2
UNION ALL SELECT 1,NULL
UNION ALL SELECT NULL,1
UNION ALL SELECT NULL,NULL) Q;

I understand that the planner doesn't use indexes for IS NOT DISTINCT FROM,
but it would
be good because "is not distinct from" is very useful when you have nulls
but don't want to use three value logic.

null = null => true
null = not null => false

I don't think it is that uncommon and even some SQL constructs such as
"select distinct" or "group by" compare null that way.

I'll wait before using IS NOT DISTINCT FROM and stick with the equivalent
longer forms which
use indexes.

8.2 is better than 8.1 which was not picking up indexes even with the
longer forms.

Jean-Pierre Pelletier

>From: "Jim C. Nasby" <jim@nasby.net>
>To: JEAN-PIERRE PELLETIER <pelletier_32@sympatico.ca>
>CC: pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2
>Date: Mon, 6 Nov 2006 16:02:40 -0600
>
>One issue is that I'm not sure think you've got your sugar quite right.
>Have you tested with:
>
>(col IS NOT NULL AND 123 IS NOT NULL AND col = 123) OR
> (col IS NULL and 123 IS NULL) ?
>
>It's possible that the planner doesn't know about using an index for
>DISTINCT; or it might just want an index that's defined WHERE col IS NOT
>NULL.
>
>On Wed, Nov 01, 2006 at 03:29:33PM -0500, JEAN-PIERRE PELLETIER wrote:
> > I've reposted this from pgsql-performance where I got no response.
> >
> > ==========================================
> >
> > Hi,
> >
> > I wanted to use "exp1 is not distinct from exp2" which I tough was

>syntaxic
> > sugar for
> > exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null

>and
> > exp2 is null
> > but my index is ignored with "is not distinct from".
> >
> > Is this the expected behavior ?
> >
> > create temporary table t as select * from generate_series(1,1000000)

>t(col);
> > create unique index i on t(col);
> > analyze t;
> >
> > -- These queries don't use the index
> > select count(*) from t where col is not distinct from 123;
> > select count(*) from t where not col is distinct from 123;
> >
> > -- This query use the index
> > select count(*) from t where col is not null and 123 is not null and col

>=
> > 123 or col is null and 123 is null;
> >
> > explain analyze select count(*) from t where col is not distinct from

>123;
> > QUERY PLAN
> >

>------------------------------------------------------------------------------------------------------------
> > Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual
> > time=228.200..228.202 rows=1 loops=1)
> > -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual
> > time=0.042..228.133 rows=1 loops=1)
> > Filter: (NOT (col IS DISTINCT FROM 123))
> > Total runtime: 228.290 ms
> > (4 rows)
> > Time: 219.000 ms
> >
> > explain analyze select count(*) from t where not col is distinct from

>123;
> > QUERY PLAN
> >

>------------------------------------------------------------------------------------------------------------
> > Aggregate (cost=19154.79..19154.80 rows=1 width=0) (actual
> > time=235.950..235.952 rows=1 loops=1)
> > -> Seq Scan on t (cost=0.00..17904.90 rows=499956 width=0) (actual
> > time=0.040..235.909 rows=1 loops=1)
> > Filter: (NOT (col IS DISTINCT FROM 123))
> > Total runtime: 236.065 ms
> > (4 rows)
> > Time: 250.000 ms
> >
> > explain analyze select count(*) from t where col is not null and 123 is

>not
> > null and col = 123 or col is null and 123 is null;
> > QUERY PLAN
> >

>-----------------------------------------------------------------------------------------------------------
> > Aggregate (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268
> > rows=1 loops=1)
> > -> Index Scan using i on t (cost=0.00..8.13 rows=1 width=0) (actual
> > time=0.237..0.241 rows=1 loops=1)
> > Index Cond: (col = 123)
> > Total runtime: 0.366 ms
> > (4 rows)
> > Time: 0.000 ms
> >
> > I am on Windows XP Service pack 2 with PostgreSQL 8.2 beta2
> >
> > Thanks,
> > Jean-Pierre Pelletier
> > e-djuster
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >

>
>--
>Jim Nasby jim@nasby.net
>EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 05:33 AM
Martijn van Oosterhout
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

On Mon, Nov 06, 2006 at 09:10:40PM -0500, JEAN-PIERRE PELLETIER wrote:
> I understand that the planner doesn't use indexes for IS NOT DISTINCT FROM,
> but it would
> be good because "is not distinct from" is very useful when you have nulls
> but don't want to use three value logic.


The main issue is that currently indexes cannot be used to find NULLs
in a table. Patches have been created that cover most index types, but
it's not part of the main distribution.

Partial indexes can be a solution to the "x IS NULL" clauses.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFUFn+IB7bNG8LQkwRAjNfAJ4hqxATOGkap0gcOTsKlC r3durTMwCfSuOu
N3v9S4e/8AYjAQYpgCTnuAs=
=ecl2
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 05:33 AM
Jim C. Nasby
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

On Tue, Nov 07, 2006 at 11:03:42AM +0100, Martijn van Oosterhout wrote:
> On Mon, Nov 06, 2006 at 09:10:40PM -0500, JEAN-PIERRE PELLETIER wrote:
> > I understand that the planner doesn't use indexes for IS NOT DISTINCT FROM,
> > but it would
> > be good because "is not distinct from" is very useful when you have nulls
> > but don't want to use three value logic.

>
> The main issue is that currently indexes cannot be used to find NULLs
> in a table. Patches have been created that cover most index types, but
> it's not part of the main distribution.


I assume you're referring to
http://archives.postgresql.org/pgsql...9/msg00083.php ?

I'm curious as to the status of that patch... presumably it never made
it into the queue, but I'm not sure why since you seemed to address
Tom's concerns (other than not indexing IS NOT NULL, which I'm not sure
is all that useful...)
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-12-2008, 05:33 AM
Martijn van Oosterhout
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

On Tue, Nov 07, 2006 at 02:12:29PM -0600, Jim C. Nasby wrote:
> I assume you're referring to
> http://archives.postgresql.org/pgsql...9/msg00083.php ?
>
> I'm curious as to the status of that patch... presumably it never made
> it into the queue, but I'm not sure why since you seemed to address
> Tom's concerns (other than not indexing IS NOT NULL, which I'm not sure
> is all that useful...)


There's been work on it. Theodor cleaned it up for HEAD and looked at
adding GiST support. I beleive he's waiting for 8.2 to release.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFUQtdIB7bNG8LQkwRAvgrAJ4teqqHpoNdmsPHo2g1kg 3b5e9ffwCeNqCD
T9MohvHdXKckAUBySEtudWI=
=2deM
-----END PGP SIGNATURE-----

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-12-2008, 05:33 AM
JEAN-PIERRE PELLETIER
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

I can see that adding null to indexes would allow all cases of "is not
distinct from"
to use them.

The lack of null in indexes would explain why a condition such as "col is
not distinct from null"
would not pick up an index.

But my example was: "col is not distinct from 123"
and the equivalent longer form has no problem picking up the index

I could restate the problem as: Why can't the planner handle
"col is not distinct from 123" as well as
"col is not null and 123 is not null and col = 123 or col is null and 123 is
null"

Jean-Pierre Pelletier

>From: Martijn van Oosterhout <kleptog@svana.org>
>Reply-To: Martijn van Oosterhout <kleptog@svana.org>
>To: JEAN-PIERRE PELLETIER <pelletier_32@sympatico.ca>
>CC: jim@nasby.net, pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Index ignored with "is not distinct from", 8.2 beta2
>Date: Tue, 7 Nov 2006 11:03:42 +0100
>
>On Mon, Nov 06, 2006 at 09:10:40PM -0500, JEAN-PIERRE PELLETIER wrote:
> > I understand that the planner doesn't use indexes for IS NOT DISTINCT

>FROM,
> > but it would
> > be good because "is not distinct from" is very useful when you have

>nulls
> > but don't want to use three value logic.

>
>The main issue is that currently indexes cannot be used to find NULLs
>in a table. Patches have been created that cover most index types, but
>it's not part of the main distribution.
>
>Partial indexes can be a solution to the "x IS NULL" clauses.
>
>Hope this helps,
>--
>Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to

>litigate.



><< signature.asc >>




---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-12-2008, 05:33 AM
Teodor Sigaev
 
Posts: n/a
Default Re: Index ignored with "is not distinct from", 8.2 beta2

> There's been work on it. Theodor cleaned it up for HEAD and looked at
> adding GiST support. I beleive he's waiting for 8.2 to release.


Yep, I have bundle of patches and I'm waiting for 8.2 branch split out of HEAD.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:08 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com