Unix Technical Forum

Why this Query Plan is different

This is a discussion on Why this Query Plan is different within the pgsql Admins forums, part of the PostgreSQL category; --> Dear All, I have two PostgreSQL Database Servers running on Gentoo which versions are PostgreSQL 7.4.5 (Box A) and ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > pgsql Admins

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-10-2008, 01:35 AM
Thilina Gunasekara
 
Posts: n/a
Default Why this Query Plan is different

Dear All,

I have two PostgreSQL Database Servers running on Gentoo which versions are
PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B)

Due to a recent query posted by a developer I was interested to know which
factors are affecting these two servers to behave in a different manner to
the same query. Please shed light on this.

Box A (PostgreSQL 7.4.5) -

EXPLAIN SELECT count(airport_code) FROM properties WHERE
airport_code::character varying IN (SELECT airport_code::character varying
FROM airport_codes);
QUERY PLAN
----------------------------------------------------------------------------
Aggregate (cost=3605245.81..3605245.81 rows=1 width=4)
-> Seq Scan on properties (cost=0.00..3605195.64 rows=20068 width=4)
Filter: (subplan)
SubPlan
-> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654
width=7)

Box B (PostgreSQL 7.3.9) - This take much to process
production=# EXPLAIN SELECT count(airport_code) FROM properties WHERE
airport_code::character varying IN (SELECT airport_code::character varying
FROM airport_codes);
QUERY PLAN
----------------------------------------------------------------------------
Aggregate (cost=9421.05..9421.05 rows=1 width=13)
-> Hash IN Join (cost=228.67..9320.65 rows=40158 width=13)
Hash Cond: ((("outer".airport_code)::character varying)::text =
(("inner".airport_code)::character varying)::text)
-> Seq Scan on properties (cost=0.00..2137.58 rows=40158
width=13)
-> Hash (cost=179.54..179.54 rows=7654 width=7)
-> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654
width=7)

Thilina Gunasekara

Database Administrator


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-10-2008, 01:35 AM
Tom Lane
 
Posts: n/a
Default Re: Why this Query Plan is different

"Thilina Gunasekara" <tgunasekara@openworld.org> writes:
> I have two PostgreSQL Database Servers running on Gentoo which versions are
> PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B)
> Due to a recent query posted by a developer I was interested to know which
> factors are affecting these two servers to behave in a different manner to
> the same query. Please shed light on this.


7.4 knows a great deal more about optimizing IN queries than 7.3 does.
I think you have misidentified the two query plans, because "Hash IN
Join" is not a plan type that 7.3 has at all. What is not clear to me
is which version is really slower. Could we see EXPLAIN ANALYZE output
(correctly labeled ;-)) not just EXPLAIN?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-10-2008, 01:35 AM
Thilina Gunasekara
 
Posts: n/a
Default Re: Why this Query Plan is different

Hi Tom,

Begging your pardon for mislabeling the query plan output, please find the
correctly labeled output.

**** This is Box A which runs on 7.4.5 and delivers very quick result.

production=# SELECT version();
version

----------------------------------------------------------------------------
-----------------------------------------------------
PostgreSQL 7.4.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.4
20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
(1 row)

production=# EXPLAIN ANALYSE SELECT count(airport_code) FROM properties
WHERE airport_code::character varying IN (SELECT airport_code::character
varying FROM airport_codes);
QUERY PLAN

----------------------------------------------------------------------------
------------------------------------------------------
Aggregate (cost=7494.95..7494.95 rows=1 width=4) (actual
time=285.821..285.823 rows=1 loops=1)
-> Hash IN Join (cost=228.67..7393.46 rows=40593 width=4) (actual
time=24.994..279.818 rows=5833 loops=1)
Hash Cond: (("outer".airport_code)::text =
(("inner".airport_code)::character varying)::text)
-> Seq Scan on properties (cost=0.00..2156.93 rows=40593 width=4)
(actual time=0.035..83.152 rows=40593 loops=1)
-> Hash (cost=179.54..179.54 rows=7654 width=7) (actual
time=24.491..24.491 rows=0 loops=1)
-> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654
width=7) (actual time=0.046..12.481 rows=7654 loops=1)
Total runtime: 286.234 ms
(7 rows)


****** And this is the Box B which runs on 7.3.9 and takes hours to deliver
the result. Actually I didn't execute EXPLAIN ANALYSE on Box B, because when
I tried, it went for more than 20-30 Minutes.

production=> SELECT version();
version

----------------------------------------------------------------------------
-----------------------------------------------------------------
PostgreSQL 7.3.9 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 3.3.5 (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3,
pie-8.7.7.1)
(1 row)

production=> EXPLAIN SELECT count(airport_code) FROM properties WHERE
airport_code::character varying IN (SELECT airport_code::character varying
FROM airport_codes);
QUERY PLAN
----------------------------------------------------------------------------
----
Aggregate (cost=3605245.81..3605245.81 rows=1 width=4)
-> Seq Scan on properties (cost=0.00..3605195.64 rows=20068 width=4)
Filter: (subplan)
SubPlan
-> Seq Scan on airport_codes (cost=0.00..179.54 rows=7654
width=7)
(5 rows)

Regards,

Thilina

-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailtogsql-admin-owner@postgresql.org] On Behalf Of Tom Lane
Sent: 24 March 2005 11:59
To: tgunasekara@openworld.org
Cc: PGSQL Admin
Subject: Re: [ADMIN] Why this Query Plan is different

"Thilina Gunasekara" <tgunasekara@openworld.org> writes:
> I have two PostgreSQL Database Servers running on Gentoo which versions

are
> PostgreSQL 7.4.5 (Box A) and PostgreSQL 7.3.9 (Box B)
> Due to a recent query posted by a developer I was interested to know which
> factors are affecting these two servers to behave in a different manner to
> the same query. Please shed light on this.


7.4 knows a great deal more about optimizing IN queries than 7.3 does.
I think you have misidentified the two query plans, because "Hash IN
Join" is not a plan type that 7.3 has at all. What is not clear to me
is which version is really slower. Could we see EXPLAIN ANALYZE output
(correctly labeled ;-)) not just EXPLAIN?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-10-2008, 01:35 AM
Tom Lane
 
Posts: n/a
Default Re: Why this Query Plan is different

"Thilina Gunasekara" <tgunasekara@openworld.org> writes:
> **** This is Box A which runs on 7.4.5 and delivers very quick result.
> ...
> ****** And this is the Box B which runs on 7.3.9 and takes hours to deliver
> the result.


Well, that's what I'd expect. There's a reason why we put a great deal
of effort into fixing IN in 7.4 ;-)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 03:04 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com