Unix Technical Forum

constraints in query plans

This is a discussion on constraints in query plans within the pgsql Hackers forums, part of the PostgreSQL category; --> I set up the following experiment: CREATE DOMAIN m_or_p AS "char" CHECK (VALUE = 'm' OR VALUE = 'p'); ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Hackers

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-12-2008, 05:20 AM
Jeremy Drake
 
Posts: n/a
Default constraints in query plans

I set up the following experiment:

CREATE DOMAIN m_or_p AS "char" CHECK (VALUE = 'm' OR VALUE = 'p');

CREATE TABLE test_domain (
fkey integer not null,
k integer not null,
x1 integer not null,
x2 integer,
mp m_or_p not null
);

CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm';
CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p';

then added about 375000 rows, half where mp = 'm' and half where mp = 'p'

Now, I do analyze verbose test_domain
jeremyd=# ANALYZE verbose test_domain;
INFO: analyzing "public.test_domain"
INFO: "test_domain": scanned 2379 of 2379 pages, containing 375226 live
rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows
ANALYZE


Now, take a look at this
jeremyd=# explain select * from test_domain where k = 1255;
QUERY PLAN
-----------------------------------------------------------------
Seq Scan on test_domain (cost=0.00..7069.32 rows=127 width=17)
Filter: (k = 1255)
(2 rows)

I turn constraint_exclusion on and I still get the same plan. I tried
adding the domain's constraint to the table as well

ALTER TABLE test_domain ADD CHECK (mp = 'm' OR mp = 'p');

and I still get the same plan. It seems the constraint is not
incorporated into the plan, since I get a different plan if I include the
constraint in the WHERE clause explicitly:

jeremyd=# explain select * from test_domain where k = 1255 AND (mp = 'm' OR mp = 'p');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_domain (cost=9.97..423.26 rows=95 width=17)
Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char")))
-> BitmapOr (cost=9.97..9.97 rows=127 width=0)
-> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..4.98 rows=60 width=0)
Index Cond: (k = 1255)
-> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..4.98 rows=67 width=0)
Index Cond: (k = 1255)
(7 rows)




And the explain analyze for each:

jeremyd=# explain analyze select * from test_domain where k = 1255 AND (mp = 'm' OR mp = 'p');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_domain (cost=9.97..423.26 rows=95 width=17) (actual time=0.325..2.397 rows=261 loops=1)
Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char")))
-> BitmapOr (cost=9.97..9.97 rows=127 width=0) (actual time=0.269..0.269 rows=0 loops=1)
-> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..4.98 rows=60 width=0) (actual time=0.150..0.150 rows=129 loops=1)
Index Cond: (k = 1255)
-> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..4.98 rows=67 width=0) (actual time=0.101..0.101 rows=132 loops=1)
Index Cond: (k = 1255)
Total runtime: 3.238 ms
(8 rows)

jeremyd=# explain analyze select * from test_domain where k = 1255;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on test_domain (cost=0.00..7069.32 rows=127 width=17) (actual time=0.427..125.057 rows=261 loops=1)
Filter: (k = 1255)
Total runtime: 125.878 ms
(3 rows)

ISTM that with the constraint_exclusion flag on, it should see from the
constraints that all values but 'm' or 'p' are excluded for the column mp,
and thus the two queries I gave are exactly equivalent. I noticed that
the docs said it looked at table constraints, so I added the check to the
table constraint as well, but it made no difference. I'm not sure if this
is a bug or a limitation of the planner, but it seems that these two
queries are equivalent. I wonder how it would work out with boolean
instead of the "char" column, it should definitely know that there are
only 2 possible values for a boolean not null column, true or false.

DROP INDEX test_domain_k_x1_x2_p;
DROP INDEX test_domain_k_x1_x2_m;
ALTER TABLE test_domain DROP CONSTRAINT test_domain_mp_check;
ALTER TABLE test_domain ALTER COLUMN mp TYPE boolean USING (CASE WHEN mp = 'm' THEN false ELSE true END);
CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp;
CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE NOT mp;

jeremyd=# ANALYZE verbose test_domain;
INFO: analyzing "public.test_domain"
INFO: "test_domain": scanned 2379 of 2379 pages, containing 375226 live
rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows
ANALYZE

jeremyd=# explain analyze select * from test_domain where k = 1255;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on test_domain (cost=0.00..7069.32 rows=131 width=17) (actual time=0.317..103.822 rows=261 loops=1)
Filter: (k = 1255)
Total runtime: 104.631 ms
(3 rows)

jeremyd=# explain analyze select * from test_domain where k = 1255 AND (mp OR NOT mp);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_domain (cost=10.01..434.58 rows=98 width=17) (actual time=0.162..1.132 rows=261 loops=1)
Recheck Cond: (((k = 1255) AND mp) OR ((k = 1255) AND (NOT mp)))
-> BitmapOr (cost=10.01..10.01 rows=131 width=0) (actual time=0.129..0.129 rows=0 loops=1)
-> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..5.01 rows=66 width=0) (actual time=0.077..0.077 rows=132 loops=1)
Index Cond: (k = 1255)
-> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..5.01 rows=65 width=0) (actual time=0.042..0.042 rows=129 loops=1)
Index Cond: (k = 1255)
Total runtime: 1.947 ms
(8 rows)


Now this one looks even more strange, that adding a no-op like 'AND (mp OR
NOT mp)' to the query gives a completely different plan. I'm thinking I
should have named the column to_be

--
Truth is the most valuable thing we have -- so let us economize it.
-- Mark Twain

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-12-2008, 05:20 AM
Tom Lane
 
Posts: n/a
Default Re: constraints in query plans

Jeremy Drake <pgsql@jdrake.com> writes:
> CREATE TABLE test_domain (
> fkey integer not null,
> k integer not null,
> x1 integer not null,
> x2 integer,
> mp m_or_p not null
> );


> CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm';
> CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p';


Perhaps you need a non-partial index.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-12-2008, 05:20 AM
Jeremy Drake
 
Posts: n/a
Default Re: constraints in query plans

On Sun, 15 Oct 2006, Tom Lane wrote:

> Jeremy Drake <pgsql@jdrake.com> writes:
> > CREATE TABLE test_domain (
> > fkey integer not null,
> > k integer not null,
> > x1 integer not null,
> > x2 integer,
> > mp m_or_p not null
> > );

>
> > CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm';
> > CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p';

>
> Perhaps you need a non-partial index.


I just tried that,
CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp);

and dropped the others. That actually works properly.
jeremyd=# explain analyze select * from test_domain where k = 1255 and mp;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_domain (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1)
Recheck Cond: (k = 1255)
Filter: mp
-> Bitmap Index Scan on test_domain_k_x1_x2_mp (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1)
Index Cond: ((k = 1255) AND (mp = true))
Total runtime: 1.137 ms
(6 rows)

I thought I had to refer to all of the columns in order for this to work,
that I could not skip some in the middle, but it seems to work.


--
If you can survive death, you can probably survive anything.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-12-2008, 05:21 AM
Jeff Davis
 
Posts: n/a
Default Re: constraints in query plans

On Sun, 2006-10-15 at 20:36 -0700, Jeremy Drake wrote:
> I just tried that,
> CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp);
>
> and dropped the others. That actually works properly.
> jeremyd=# explain analyze select * from test_domain where k = 1255 and mp;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on test_domain (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1)
> Recheck Cond: (k = 1255)
> Filter: mp
> -> Bitmap Index Scan on test_domain_k_x1_x2_mp (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1)
> Index Cond: ((k = 1255) AND (mp = true))
> Total runtime: 1.137 ms
> (6 rows)
>
> I thought I had to refer to all of the columns in order for this to work,
> that I could not skip some in the middle, but it seems to work.
>


As long as "k=1255" is selective enough, the index is useful. That's
because k is the first item in the index key.

Regards,
Jeff Davis


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:48 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com