This is a discussion on constraints in query plans within the pgsql Hackers forums, part of the PostgreSQL category; --> I set up the following experiment: CREATE DOMAIN m_or_p AS "char" CHECK (VALUE = 'm' OR VALUE = 'p'); ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I set up the following experiment: CREATE DOMAIN m_or_p AS "char" CHECK (VALUE = 'm' OR VALUE = 'p'); CREATE TABLE test_domain ( fkey integer not null, k integer not null, x1 integer not null, x2 integer, mp m_or_p not null ); CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm'; CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p'; then added about 375000 rows, half where mp = 'm' and half where mp = 'p' Now, I do analyze verbose test_domain jeremyd=# ANALYZE verbose test_domain; INFO: analyzing "public.test_domain" INFO: "test_domain": scanned 2379 of 2379 pages, containing 375226 live rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows ANALYZE Now, take a look at this jeremyd=# explain select * from test_domain where k = 1255; QUERY PLAN ----------------------------------------------------------------- Seq Scan on test_domain (cost=0.00..7069.32 rows=127 width=17) Filter: (k = 1255) (2 rows) I turn constraint_exclusion on and I still get the same plan. I tried adding the domain's constraint to the table as well ALTER TABLE test_domain ADD CHECK (mp = 'm' OR mp = 'p'); and I still get the same plan. It seems the constraint is not incorporated into the plan, since I get a different plan if I include the constraint in the WHERE clause explicitly: jeremyd=# explain select * from test_domain where k = 1255 AND (mp = 'm' OR mp = 'p'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on test_domain (cost=9.97..423.26 rows=95 width=17) Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char"))) -> BitmapOr (cost=9.97..9.97 rows=127 width=0) -> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..4.98 rows=60 width=0) Index Cond: (k = 1255) -> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..4.98 rows=67 width=0) Index Cond: (k = 1255) (7 rows) And the explain analyze for each: jeremyd=# explain analyze select * from test_domain where k = 1255 AND (mp = 'm' OR mp = 'p'); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on test_domain (cost=9.97..423.26 rows=95 width=17) (actual time=0.325..2.397 rows=261 loops=1) Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char"))) -> BitmapOr (cost=9.97..9.97 rows=127 width=0) (actual time=0.269..0.269 rows=0 loops=1) -> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..4.98 rows=60 width=0) (actual time=0.150..0.150 rows=129 loops=1) Index Cond: (k = 1255) -> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..4.98 rows=67 width=0) (actual time=0.101..0.101 rows=132 loops=1) Index Cond: (k = 1255) Total runtime: 3.238 ms (8 rows) jeremyd=# explain analyze select * from test_domain where k = 1255; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Seq Scan on test_domain (cost=0.00..7069.32 rows=127 width=17) (actual time=0.427..125.057 rows=261 loops=1) Filter: (k = 1255) Total runtime: 125.878 ms (3 rows) ISTM that with the constraint_exclusion flag on, it should see from the constraints that all values but 'm' or 'p' are excluded for the column mp, and thus the two queries I gave are exactly equivalent. I noticed that the docs said it looked at table constraints, so I added the check to the table constraint as well, but it made no difference. I'm not sure if this is a bug or a limitation of the planner, but it seems that these two queries are equivalent. I wonder how it would work out with boolean instead of the "char" column, it should definitely know that there are only 2 possible values for a boolean not null column, true or false. DROP INDEX test_domain_k_x1_x2_p; DROP INDEX test_domain_k_x1_x2_m; ALTER TABLE test_domain DROP CONSTRAINT test_domain_mp_check; ALTER TABLE test_domain ALTER COLUMN mp TYPE boolean USING (CASE WHEN mp = 'm' THEN false ELSE true END); CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp; CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE NOT mp; jeremyd=# ANALYZE verbose test_domain; INFO: analyzing "public.test_domain" INFO: "test_domain": scanned 2379 of 2379 pages, containing 375226 live rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows ANALYZE jeremyd=# explain analyze select * from test_domain where k = 1255; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Seq Scan on test_domain (cost=0.00..7069.32 rows=131 width=17) (actual time=0.317..103.822 rows=261 loops=1) Filter: (k = 1255) Total runtime: 104.631 ms (3 rows) jeremyd=# explain analyze select * from test_domain where k = 1255 AND (mp OR NOT mp); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on test_domain (cost=10.01..434.58 rows=98 width=17) (actual time=0.162..1.132 rows=261 loops=1) Recheck Cond: (((k = 1255) AND mp) OR ((k = 1255) AND (NOT mp))) -> BitmapOr (cost=10.01..10.01 rows=131 width=0) (actual time=0.129..0.129 rows=0 loops=1) -> Bitmap Index Scan on test_domain_k_x1_x2_p (cost=0.00..5.01 rows=66 width=0) (actual time=0.077..0.077 rows=132 loops=1) Index Cond: (k = 1255) -> Bitmap Index Scan on test_domain_k_x1_x2_m (cost=0.00..5.01 rows=65 width=0) (actual time=0.042..0.042 rows=129 loops=1) Index Cond: (k = 1255) Total runtime: 1.947 ms (8 rows) Now this one looks even more strange, that adding a no-op like 'AND (mp OR NOT mp)' to the query gives a completely different plan. I'm thinking I should have named the column to_be -- Truth is the most valuable thing we have -- so let us economize it. -- Mark Twain ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Jeremy Drake <pgsql@jdrake.com> writes: > CREATE TABLE test_domain ( > fkey integer not null, > k integer not null, > x1 integer not null, > x2 integer, > mp m_or_p not null > ); > CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm'; > CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p'; Perhaps you need a non-partial index. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| On Sun, 15 Oct 2006, Tom Lane wrote: > Jeremy Drake <pgsql@jdrake.com> writes: > > CREATE TABLE test_domain ( > > fkey integer not null, > > k integer not null, > > x1 integer not null, > > x2 integer, > > mp m_or_p not null > > ); > > > CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm'; > > CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p'; > > Perhaps you need a non-partial index. I just tried that, CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp); and dropped the others. That actually works properly. jeremyd=# explain analyze select * from test_domain where k = 1255 and mp; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on test_domain (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1) Recheck Cond: (k = 1255) Filter: mp -> Bitmap Index Scan on test_domain_k_x1_x2_mp (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1) Index Cond: ((k = 1255) AND (mp = true)) Total runtime: 1.137 ms (6 rows) I thought I had to refer to all of the columns in order for this to work, that I could not skip some in the middle, but it seems to work. -- If you can survive death, you can probably survive anything. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| ||||
| On Sun, 2006-10-15 at 20:36 -0700, Jeremy Drake wrote: > I just tried that, > CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp); > > and dropped the others. That actually works properly. > jeremyd=# explain analyze select * from test_domain where k = 1255 and mp; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on test_domain (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1) > Recheck Cond: (k = 1255) > Filter: mp > -> Bitmap Index Scan on test_domain_k_x1_x2_mp (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1) > Index Cond: ((k = 1255) AND (mp = true)) > Total runtime: 1.137 ms > (6 rows) > > I thought I had to refer to all of the columns in order for this to work, > that I could not skip some in the middle, but it seems to work. > As long as "k=1255" is selective enough, the index is useful. That's because k is the first item in the index key. Regards, Jeff Davis ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |