This is a discussion on Why this Query Plan is different within the pgsql Admins forums, part of the PostgreSQL category; --> Dear All, I have two PostgreSQL Database Servers running on Gentoo which versions are PostgreSQL 7.4.5 (Box A) and ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Dear All, I have two PostgreSQL Database Servers running on Gentoo which versions are PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B) Due to a recent query posted by a developer I was interested to know which factors are affecting these two servers to behave in a different manner to the same query. Please shed light on this. Box A (PostgreSQL 7.4.5) - EXPLAIN SELECT count(airport_code) FROM properties WHERE airport_code::character varying IN (SELECT airport_code::character varying FROM airport_codes); QUERY PLAN ---------------------------------------------------------------------------- Aggregate (cost=3605245.81..3605245.81 rows=1 width=4) -> Seq Scan on properties (cost=0.00..3605195.64 rows=20068 width=4) Filter: (subplan) SubPlan -> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654 width=7) Box B (PostgreSQL 7.3.9) - This take much to process production=# EXPLAIN SELECT count(airport_code) FROM properties WHERE airport_code::character varying IN (SELECT airport_code::character varying FROM airport_codes); QUERY PLAN ---------------------------------------------------------------------------- Aggregate (cost=9421.05..9421.05 rows=1 width=13) -> Hash IN Join (cost=228.67..9320.65 rows=40158 width=13) Hash Cond: ((("outer".airport_code)::character varying)::text = (("inner".airport_code)::character varying)::text) -> Seq Scan on properties (cost=0.00..2137.58 rows=40158 width=13) -> Hash (cost=179.54..179.54 rows=7654 width=7) -> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654 width=7) Thilina Gunasekara Database Administrator ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| "Thilina Gunasekara" <tgunasekara@openworld.org> writes: > I have two PostgreSQL Database Servers running on Gentoo which versions are > PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B) > Due to a recent query posted by a developer I was interested to know which > factors are affecting these two servers to behave in a different manner to > the same query. Please shed light on this. 7.4 knows a great deal more about optimizing IN queries than 7.3 does. I think you have misidentified the two query plans, because "Hash IN Join" is not a plan type that 7.3 has at all. What is not clear to me is which version is really slower. Could we see EXPLAIN ANALYZE output (correctly labeled ;-)) not just EXPLAIN? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| Hi Tom, Begging your pardon for mislabeling the query plan output, please find the correctly labeled output. **** This is Box A which runs on 7.4.5 and delivers very quick result. production=# SELECT version(); version ---------------------------------------------------------------------------- ----------------------------------------------------- PostgreSQL 7.4.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6) (1 row) production=# EXPLAIN ANALYSE SELECT count(airport_code) FROM properties WHERE airport_code::character varying IN (SELECT airport_code::character varying FROM airport_codes); QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------ Aggregate (cost=7494.95..7494.95 rows=1 width=4) (actual time=285.821..285.823 rows=1 loops=1) -> Hash IN Join (cost=228.67..7393.46 rows=40593 width=4) (actual time=24.994..279.818 rows=5833 loops=1) Hash Cond: (("outer".airport_code)::text = (("inner".airport_code)::character varying)::text) -> Seq Scan on properties (cost=0.00..2156.93 rows=40593 width=4) (actual time=0.035..83.152 rows=40593 loops=1) -> Hash (cost=179.54..179.54 rows=7654 width=7) (actual time=24.491..24.491 rows=0 loops=1) -> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654 width=7) (actual time=0.046..12.481 rows=7654 loops=1) Total runtime: 286.234 ms (7 rows) ****** And this is the Box B which runs on 7.3.9 and takes hours to deliver the result. Actually I didn't execute EXPLAIN ANALYSE on Box B, because when I tried, it went for more than 20-30 Minutes. production=> SELECT version(); version ---------------------------------------------------------------------------- ----------------------------------------------------------------- PostgreSQL 7.3.9 on i686-pc-linux-gnu, compiled by GCC i686-pc-linux-gnu-gcc (GCC) 3.3.5 (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1) (1 row) production=> EXPLAIN SELECT count(airport_code) FROM properties WHERE airport_code::character varying IN (SELECT airport_code::character varying FROM airport_codes); QUERY PLAN ---------------------------------------------------------------------------- ---- Aggregate (cost=3605245.81..3605245.81 rows=1 width=4) -> Seq Scan on properties (cost=0.00..3605195.64 rows=20068 width=4) Filter: (subplan) SubPlan -> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654 width=7) (5 rows) Regards, Thilina -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto Sent: 24 March 2005 11:59 To: tgunasekara@openworld.org Cc: PGSQL Admin Subject: Re: [ADMIN] Why this Query Plan is different "Thilina Gunasekara" <tgunasekara@openworld.org> writes: > I have two PostgreSQL Database Servers running on Gentoo which versions are > PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B) > Due to a recent query posted by a developer I was interested to know which > factors are affecting these two servers to behave in a different manner to > the same query. Please shed light on this. 7.4 knows a great deal more about optimizing IN queries than 7.3 does. I think you have misidentified the two query plans, because "Hash IN Join" is not a plan type that 7.3 has at all. What is not clear to me is which version is really slower. Could we see EXPLAIN ANALYZE output (correctly labeled ;-)) not just EXPLAIN? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| "Thilina Gunasekara" <tgunasekara@openworld.org> writes: > **** This is Box A which runs on 7.4.5 and delivers very quick result. > ... > ****** And this is the Box B which runs on 7.3.9 and takes hours to deliver > the result. Well, that's what I'd expect. There's a reason why we put a great deal of effort into fixing IN in 7.4 ;-) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |