Unix Technical Forum

SEO

vBulletin Search Engine Optimization


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-18-2008, 11:09 AM
Gaetano Mendola
 
Posts: n/a
Default bad performances using hashjoin

Hi all,
I'm stuck in a select that use the hash join where should not:
6 seconds vs 0.3 ms !!

If you need other info in order to improve the planner,
let me know.

Regards
Gaetano Mendola




empdb=# explain analyze SELECT id_sat_request
empdb-# FROM sat_request sr,
empdb-# v_sc_packages vs
empdb-# WHERE ----- JOIN ----
empdb-# sr.id_package = vs.id_package AND
empdb-# ---------------
empdb-# id_user = 29416 AND
empdb-# id_url = 329268 AND
empdb-# vs.estimated_start > now() AND
empdb-# id_sat_request_status = sp_lookup_id('sat_request_status', 'Scheduled');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=272.95..276.61 rows=1 width=4) (actual time=6323.107..6323.107 rows=0 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Subquery Scan vs (cost=269.91..272.10 rows=292 width=4) (actual time=6316.534..6317.398 rows=407 loops=1)
-> Sort (cost=269.91..270.64 rows=292 width=263) (actual time=6316.526..6316.789 rows=407 loops=1)
Sort Key: vs.estimated_start
-> Hash Join (cost=227.58..257.95 rows=292 width=263) (actual time=6302.480..6313.982 rows=407 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Subquery Scan vpk (cost=141.82..150.04 rows=1097 width=218) (actual time=6106.020..6113.038 rows=1104 loops=1)
-> Sort (cost=141.82..144.56 rows=1097 width=162) (actual time=6106.006..6106.745 rows=1104 loops=1)
Sort Key: p.id_publisher, p.name
-> Hash Left Join (cost=15.54..86.42 rows=1097 width=162) (actual time=2.978..6087.608 rows=1104 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Seq Scan on packages p (cost=0.00..53.48 rows=1097 width=146) (actual time=0.011..7.978 rows=1104 loops=1)
-> Hash (cost=13.69..13.69 rows=738 width=20) (actual time=2.061..2.061 rows=0 loops=1)
-> Seq Scan on package_security ps (cost=0.00..13.69 rows=738 width=20) (actual time=0.027..1.289 rows=747 loops=1)
-> Hash (cost=85.63..85.63 rows=54 width=49) (actual time=196.022..196.022 rows=0 loops=1)
-> Merge Join (cost=82.83..85.63 rows=54 width=49) (actual time=192.898..195.565 rows=407 loops=1)
Merge Cond: ("outer".id_program = "inner".id_program)
-> Subquery Scan vs (cost=72.27..73.97 rows=226 width=16) (actual time=6.867..7.872 rows=407 loops=1)
-> Sort (cost=72.27..72.84 rows=226 width=20) (actual time=6.851..7.114 rows=407 loops=1)
Sort Key: sequences.id_program, sequences.internal_position
-> Seq Scan on sequences (cost=0.00..63.44 rows=226 width=20) (actual time=0.295..3.370 rows=407 loops=1)
Filter: ((estimated_start IS NOT NULL) AND (date_trunc('seconds'::text, estimated_start) > now()))
-> Sort (cost=10.56..10.68 rows=47 width=37) (actual time=186.013..186.296 rows=439 loops=1)
Sort Key: vpr.id_program
-> Subquery Scan vpr (cost=8.90..9.25 rows=47 width=37) (actual time=185.812..185.928 rows=48 loops=1)
-> Sort (cost=8.90..9.02 rows=47 width=61) (actual time=185.806..185.839 rows=48 loops=1)
Sort Key: programs.id_publisher, programs.id_program
-> Seq Scan on programs (cost=0.00..7.60 rows=47 width=61) (actual time=9.592..185.634 rows=48 loops=1)
Filter: (id_program <> 0)
-> Hash (cost=3.04..3.04 rows=1 width=8) (actual time=4.862..4.862 rows=0 loops=1)
-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=4.851..4.851 rows=0 loops=1)
Index Cond: (id_url = 329268)
Filter: ((id_user = 29416) AND (id_sat_request_status = 1))
Total runtime: 6324.435 ms
(35 rows)

empdb=# set enable_hashjoin = false;
SET
empdb=# explain analyze SELECT id_sat_request
empdb-# FROM sat_request sr,
empdb-# v_sc_packages vs
empdb-# WHERE ----- JOIN ----
empdb-# sr.id_package = vs.id_package AND
empdb-# ---------------
empdb-# id_user = 29416 AND
empdb-# id_url = 329268 AND
empdb-# vs.estimated_start > now() AND
empdb-# id_sat_request_status = sp_lookup_id('sat_request_status', 'Scheduled');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=393.41..400.83 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=1)
Join Filter: ("outer".id_package = "inner".id_package)
-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=0.078..0.078 rows=0 loops=1)
Index Cond: (id_url = 329268)
Filter: ((id_user = 29416) AND (id_sat_request_status = 1))
-> Subquery Scan vs (cost=393.41..395.60 rows=292 width=4) (never executed)
-> Sort (cost=393.41..394.14 rows=292 width=263) (never executed)
Sort Key: vs.estimated_start
-> Merge Join (cost=372.76..381.46 rows=292 width=263) (never executed)
Merge Cond: ("outer".id_package = "inner".id_package)
-> Sort (cost=87.19..87.32 rows=54 width=49) (never executed)
Sort Key: vs.id_package
-> Merge Join (cost=82.83..85.63 rows=54 width=49) (never executed)
Merge Cond: ("outer".id_program = "inner".id_program)
-> Subquery Scan vs (cost=72.27..73.97 rows=226 width=16) (never executed)
-> Sort (cost=72.27..72.84 rows=226 width=20) (never executed)
Sort Key: sequences.id_program, sequences.internal_position
-> Seq Scan on sequences (cost=0.00..63.44 rows=226 width=20) (never executed)
Filter: ((estimated_start IS NOT NULL) AND (date_trunc('seconds'::text, estimated_start) > now()))
-> Sort (cost=10.56..10.68 rows=47 width=37) (never executed)
Sort Key: vpr.id_program
-> Subquery Scan vpr (cost=8.90..9.25 rows=47 width=37) (never executed)
-> Sort (cost=8.90..9.02 rows=47 width=61) (never executed)
Sort Key: programs.id_publisher, programs.id_program
-> Seq Scan on programs (cost=0.00..7.60 rows=47 width=61) (never executed)
Filter: (id_program <> 0)
-> Sort (cost=285.57..288.31 rows=1097 width=218) (never executed)
Sort Key: vpk.id_package
-> Subquery Scan vpk (cost=221.95..230.17 rows=1097 width=218) (never executed)
-> Sort (cost=221.95..224.69 rows=1097 width=162) (never executed)
Sort Key: p.id_publisher, p.name
-> Merge Right Join (cost=108.88..166.55 rows=1097 width=162) (never executed)
Merge Cond: ("outer".id_package = "inner".id_package)
-> Index Scan using package_security_id_package_key on package_security ps (cost=0.00..38.50 rows=738 width=20) (never executed)
-> Sort (cost=108.88..111.62 rows=1097 width=146) (never executed)
Sort Key: p.id_package
-> Seq Scan on packages p (cost=0.00..53.48 rows=1097 width=146) (never executed)
Total runtime: 0.302 ms
(38 rows)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-18-2008, 11:09 AM
Tom Lane
 
Posts: n/a
Default Re: bad performances using hashjoin

Gaetano Mendola <mendola@bigfoot.com> writes:
> If you need other info in order to improve the planner,


.... like, say, the PG version you are using, or the definitions of the
views involved? It's difficult to say much of anything about this.

However: the reason the second plan wins is because there are zero rows
fetched from sat_request, and so the bulk of the plan is never executed
at all. I doubt the second plan would win if there were any matching
sat_request rows. If this is the case you actually need to optimize,
probably the thing to do is to get rid of the ORDER BY clauses you
evidently have in your views, so that there's some chance of building
a fast-start plan.

BTW, I believe in 8.0 the first plan would be about as fast as the
second, because we added some code to hash join to fall out without
scanning the left input if the right input is empty.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-18-2008, 11:09 AM
Gaetano Mendola
 
Posts: n/a
Default Re: bad performances using hashjoin

Tom Lane wrote:
> Gaetano Mendola <mendola@bigfoot.com> writes:
>
>>If you need other info in order to improve the planner,

>
>
> ... like, say, the PG version you are using, or the definitions of the
> views involved? It's difficult to say much of anything about this.


That is true, sorry I forgot it :-(
The engine is a 7.4.5 and these are the views definitions:

sat_request is just a table

CREATE OR REPLACE VIEW v_sc_packages AS
SELECT *
FROM
v_programs vpr,
v_packages vpk,
v_sequences vs
WHERE
------------ JOIN -------------
vpr.id_program = vs.id_program AND
vpk.id_package = vs.id_package AND
-------------------------------
vs.estimated_start IS NOT NULL
ORDER BY vs.estimated_start;


CREATE OR REPLACE VIEW v_programs AS
SELECT *
FROM programs
WHERE id_program<>0
ORDER BY id_publisher, id_program
;



CREATE OR REPLACE VIEW v_packages AS
SELECT *
FROM packages p LEFT OUTER JOIN package_security ps USING (id_package)
ORDER BY p.id_publisher, p.name
;

CREATE OR REPLACE VIEW v_sequences AS
SELECT id_package AS id_package,
id_program AS id_program,
internal_position AS internal_position,
estimated_start AS estimated_start
FROM sequences
ORDER BY id_program, internal_position
;


> However: the reason the second plan wins is because there are zero rows
> fetched from sat_request, and so the bulk of the plan is never executed
> at all. I doubt the second plan would win if there were any matching
> sat_request rows. If this is the case you actually need to optimize,
> probably the thing to do is to get rid of the ORDER BY clauses you
> evidently have in your views, so that there's some chance of building
> a fast-start plan.


Removed all order by from that views, this is the comparison between the two
(orderdered and not ordered):

empdb=# explain analyze SELECT id_sat_request
empdb-# FROM sat_request sr,
empdb-# v_sc_packages vs
empdb-# WHERE ----- JOIN ----
empdb-# sr.id_package = vs.id_package AND
empdb-# ---------------
empdb-# id_user = 29416 AND
empdb-# id_url = 424364 AND
empdb-# vs.estimated_start > now() AND
empdb-# id_sat_request_status = sp_lookup_id('sat_request_status', 'Scheduled');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=280.98..284.74 rows=1 width=4) (actual time=895.344..895.344 rows=0 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Subquery Scan vs (cost=277.94..280.19 rows=301 width=4) (actual time=893.191..894.396 rows=569 loops=1)
-> Sort (cost=277.94..278.69 rows=301 width=263) (actual time=893.184..893.546 rows=569 loops=1)
Sort Key: vs.estimated_start
-> Hash Join (cost=232.61..265.54 rows=301 width=263) (actual time=868.980..889.643 rows=569 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Subquery Scan vpk (cost=150.29..159.26 rows=1196 width=218) (actual time=822.281..834.063 rows=1203 loops=1)
-> Sort (cost=150.29..153.28 rows=1196 width=159) (actual time=822.266..823.190 rows=1203 loops=1)
Sort Key: p.id_publisher, p.name
-> Hash Left Join (cost=16.14..89.16 rows=1196 width=159) (actual time=3.504..809.262 rows=1203 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Seq Scan on packages p (cost=0.00..53.98 rows=1196 width=143) (actual time=0.018..13.869 rows=1203 loops=1)
-> Hash (cost=14.09..14.09 rows=818 width=20) (actual time=2.395..2.395 rows=0 loops=1)
-> Seq Scan on package_security ps (cost=0.00..14.09 rows=818 width=20) (actual time=0.020..1.520 rows=845 loops=1)
-> Hash (cost=82.19..82.19 rows=51 width=49) (actual time=46.402..46.402 rows=0 loops=1)
-> Merge Join (cost=79.54..82.19 rows=51 width=49) (actual time=39.435..45.376 rows=569 loops=1)
Merge Cond: ("outer".id_program = "inner".id_program)
-> Subquery Scan vs (cost=70.98..72.59 rows=214 width=16) (actual time=16.834..19.102 rows=569 loops=1)
-> Sort (cost=70.98..71.52 rows=214 width=20) (actual time=16.824..17.338 rows=569 loops=1)
Sort Key: sequences.id_program, sequences.internal_position
-> Seq Scan on sequences (cost=0.00..62.70 rows=214 width=20) (actual time=0.638..11.969 rows=569 loops=1)
Filter: ((estimated_start IS NOT NULL) AND (date_trunc('seconds'::text, estimated_start) > now()))
-> Sort (cost=8.56..8.68 rows=47 width=37) (actual time=22.580..23.123 rows=605 loops=1)
Sort Key: vpr.id_program
-> Subquery Scan vpr (cost=6.90..7.25 rows=47 width=37) (actual time=22.294..22.464 rows=48 loops=1)
-> Sort (cost=6.90..7.02 rows=47 width=61) (actual time=22.287..22.332 rows=48 loops=1)
Sort Key: programs.id_publisher, programs.id_program
-> Seq Scan on programs (cost=0.00..5.60 rows=47 width=61) (actual time=4.356..22.068 rows=48 loops=1)
Filter: (id_program <> 0)
-> Hash (cost=3.04..3.04 rows=1 width=8) (actual time=0.033..0.033 rows=0 loops=1)
-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=0.031..0.031 rows=0 loops=1)
Index Cond: (id_url = 424364)
Filter: ((id_user = 29416) AND (id_sat_request_status = 1))
Total runtime: 897.044 ms
(35 rows)



empdb=# explain analyze SELECT id_sat_request
empdb-# FROM sat_request sr,
empdb-# v_sc_packages_new vs
empdb-# WHERE ----- JOIN ----
empdb-# sr.id_package = vs.id_package AND
empdb-# ---------------
empdb-# id_user = 29416 AND
empdb-# id_url = 424364 AND
empdb-# vs.estimated_start > now() AND
empdb-# id_sat_request_status = sp_lookup_id('sat_request_status', 'Scheduled');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=19.18..96.87 rows=1 width=4) (actual time=15.576..15.576 rows=0 loops=1)
-> Nested Loop (cost=19.18..93.04 rows=1 width=8) (actual time=15.569..15.569 rows=0 loops=1)
-> Hash Join (cost=19.18..89.21 rows=1 width=12) (actual time=15.566..15.566 rows=0 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Hash Left Join (cost=16.14..80.19 rows=1196 width=4) (actual time=7.291..13.620 rows=1203 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Seq Scan on packages p (cost=0.00..53.98 rows=1196 width=4) (actual time=0.028..2.694 rows=1203 loops=1)
-> Hash (cost=14.09..14.09 rows=818 width=4) (actual time=6.707..6.707 rows=0 loops=1)
-> Seq Scan on package_security ps (cost=0.00..14.09 rows=818 width=4) (actual time=0.026..4.620 rows=845 loops=1)
-> Hash (cost=3.04..3.04 rows=1 width=8) (actual time=0.061..0.061 rows=0 loops=1)
-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=0.056..0.056 rows=0 loops=1)
Index Cond: (id_url = 424364)
Filter: ((id_user = 29416) AND (id_sat_request_status = 1))
-> Index Scan using idx_sequences_id_package on sequences (cost=0.00..3.82 rows=1 width=8) (never executed)
Index Cond: ("outer".id_package = sequences.id_package)
Filter: ((estimated_start IS NOT NULL) AND (date_trunc('seconds'::text, estimated_start) > now()))
-> Index Scan using programs_pkey on programs (cost=0.00..3.83 rows=1 width=4) (never executed)
Index Cond: (programs.id_program = "outer".id_program)
Filter: (id_program <> 0)
Total runtime: 16.279 ms
(20 rows)




The second one of course is faster, this is the second select with hashjoin disabled:

empdb=# set enable_hashjoin = false;
SET
empdb=# explain analyze SELECT id_sat_request
empdb-# FROM sat_request sr,
empdb-# v_sc_packages_new vs
empdb-# WHERE ----- JOIN ----
empdb-# sr.id_package = vs.id_package AND
empdb-# ---------------
empdb-# id_user = 29416 AND
empdb-# id_url = 424364 AND
empdb-# vs.estimated_start > now() AND
empdb-# id_sat_request_status = sp_lookup_id('sat_request_status', 'Scheduled');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=10.62..175.83 rows=1 width=4) (actual time=0.280..0.280 rows=0 loops=1)
Merge Cond: ("outer".id_package = "inner".id_package)
-> Merge Left Join (cost=0.00..162.21 rows=1196 width=4) (actual time=0.188..0.188 rows=1 loops=1)
Merge Cond: ("outer".id_package = "inner".id_package)
-> Index Scan using packages_pkey on packages p (cost=0.00..115.51 rows=1196 width=4) (actual time=0.085..0.085 rows=1 loops=1)
-> Index Scan using package_security_id_package_key on package_security ps (cost=0.00..39.06 rows=818 width=4) (actual time=0.080..0.080 rows=1 loops=1)
-> Sort (cost=10.62..10.62 rows=1 width=12) (actual time=0.087..0.087 rows=0 loops=1)
Sort Key: sr.id_package
-> Nested Loop (cost=0.00..10.61 rows=1 width=12) (actual time=0.069..0.069 rows=0 loops=1)
-> Nested Loop (cost=0.00..6.77 rows=1 width=16) (actual time=0.067..0.067 rows=0 loops=1)
-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=0.065..0.065 rows=0 loops=1)
Index Cond: (id_url = 424364)
Filter: ((id_user = 29416) AND (id_sat_request_status = 1))
-> Index Scan using idx_sequences_id_package on sequences (cost=0.00..3.72 rows=1 width=8) (never executed)
Index Cond: ("outer".id_package = sequences.id_package)
Filter: ((estimated_start IS NOT NULL) AND (date_trunc('seconds'::text, estimated_start) > now()))
-> Index Scan using programs_pkey on programs (cost=0.00..3.83 rows=1 width=4) (never executed)
Index Cond: (programs.id_program = "outer".id_program)
Filter: (id_program <> 0)
Total runtime: 0.604 ms
(20 rows)

I see the problem is still here:
Hash Left Join (cost=16.14..80.19 rows=1196 width=4) (actual time=7.291..13.620 rows=1203 loops=1)

BTW, at this time the executions time seen are lower because right now is not a peak hour

> BTW, I believe in 8.0 the first plan would be about as fast as the
> second, because we added some code to hash join to fall out without
> scanning the left input if the right input is empty.


I'll take it a try if you are really interested in the results.

Regards
Gaetano Mendola










Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-18-2008, 11:09 AM
Klint Gore
 
Posts: n/a
Default Re: bad performances using hashjoin

On Sun, 20 Feb 2005 13:46:10 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> sat_request rows. If this is the case you actually need to optimize,
> probably the thing to do is to get rid of the ORDER BY clauses you
> evidently have in your views, so that there's some chance of building
> a fast-start plan.


Is having an order by in a view legal? In sybase ASA, mssql it throws a
syntax error if there's an order by.

If so, does it do 2 sorts when you sort by something else?

i.e. if view is
create view v1 as select * from table order by 1;
and the statment
select * from v1 order by 2;
is run does it sort by 1 then resort by 2?

klint.

+---------------------------------------+-----------------+
: Klint Gore : "Non rhyming :
: EMail : kg@kgb.une.edu.au : slang - the :
: Snail : A.B.R.I. : possibilities :
: Mail University of New England : are useless" :
: Armidale NSW 2351 Australia : L.J.J. :
: Fax : +61 2 6772 5376 : :
+---------------------------------------+-----------------+

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-18-2008, 11:09 AM
Tom Lane
 
Posts: n/a
Default Re: bad performances using hashjoin

Klint Gore <kg@kgb.une.edu.au> writes:
> Is having an order by in a view legal?


Not according to the SQL spec, but PG has allowed it for several releases.
(The same goes for ORDER BY in a sub-select, which is actually pretty
much the same thing ...)

> If so, does it do 2 sorts when you sort by something else?


Yup. It's something you'd only want for the topmost view in a stack,
IMHO. A sort at a lower level is likely to be wasted effort.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-18-2008, 11:09 AM
David Brown
 
Posts: n/a
Default Re: bad performances using hashjoin

Tom Lane wrote:

>However: the reason the second plan wins is because there are zero rows
>fetched from sat_request, and so the bulk of the plan is never executed
>at all. I doubt the second plan would win if there were any matching
>sat_request rows.
>

That's what I thought at first, but if you look more closely, that's
having very little impact on either the cost or actual time:

-> Index Scan using idx_id_url_sat_request on sat_request sr (cost=0.00..3.04 rows=1 width=8) (actual time=0.031..0.031 rows=0 loops=1)


The real problem appears to be here:

-> Hash Left Join (cost=16.14..89.16 rows=1196 width=159) (actual time=3.504..809.262 rows=1203 loops=1)


As Gaetano points out in his follow-up post, the problem still exists
after he removed the sorts:

-> Hash Left Join (cost=16.14..80.19 rows=1196 width=4) (actual time=7.291..13.620 rows=1203 loops=1)


The planner is not breaking up the outer join in his v_packages view:
SELECT *
FROM packages p LEFT OUTER JOIN package_security ps USING (id_package)

It's not being selective at all with packages, despite id_package being
the link to sat_request.

If this is too complex for the planner, could he re-arrange his outer
join so that's it's processed later? If he puts it in his actual query,
for instance, will the planner flatten it out anyway?

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-18-2008, 11:09 AM
Tom Lane
 
Posts: n/a
Default Re: bad performances using hashjoin

David Brown <time@bigpond.net.au> writes:
> The planner is not breaking up the outer join in his v_packages view:


The planner doesn't make any attempt to rearrange join order of outer
joins. There are some cases where such a rearrangement is OK, but there
are other cases where it isn't, and we don't currently have the logic
needed to tell which is which.

In the particular case at hand here, 8.0's hack to suppress evaluating
the outer side of a hash join after finding the inner side is empty
would eliminate the complaint.

In the original message, it did seem that the packages-to-
package_security join is taking a lot longer than one would expect:

-> Hash Left Join (cost=15.54..86.42 rows=1097 width=162) (actual time=2.978..6087.608 rows=1104 loops=1)
Hash Cond: ("outer".id_package = "inner".id_package)
-> Seq Scan on packages p (cost=0.00..53.48 rows=1097 width=146) (actual time=0.011..7.978 rows=1104 loops=1)
-> Hash (cost=13.69..13.69 rows=738 width=20) (actual time=2.061..2.061 rows=0 loops=1)
-> Seq Scan on package_security ps (cost=0.00..13.69 rows=738 width=20) (actual time=0.027..1.289 rows=747 loops=1)

but this behavior isn't reproduced in the later message, so I wonder if
it wasn't an artifact of something else taking a chunk of time.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-18-2008, 11:09 AM
Gaetano Mendola
 
Posts: n/a
Default Re: bad performances using hashjoin

Tom Lane wrote:

> but this behavior isn't reproduced in the later message, so I wonder if
> it wasn't an artifact of something else taking a chunk of time.


I think is due the fact that first queries were performed in peakhours.


Regards
Gaetano Mendola


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-18-2008, 11:09 AM
David Brown
 
Posts: n/a
Default Re: bad performances using hashjoin

Gaetano Mendola wrote:

>I think is due the fact that first queries were performed in peakhours.
>
>

A memory intensive operation takes 7.5 times longer during heavy loads.
Doesn't this suggest that swapping of working memory is occurring?


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-18-2008, 11:09 AM
Mischa
 
Posts: n/a
Default Re: bad performances using hashjoin

Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> Klint Gore <kg@kgb.une.edu.au> writes:
> > Is having an order by in a view legal?

>
> Not according to the SQL spec, but PG has allowed it for several releases.
> (The same goes for ORDER BY in a sub-select, which is actually pretty
> much the same thing ...)


Umm... if you implement LIMIT in a subselect, it becomes highly meaningful (nad
useful.

Is this a case of one nonstandard feature being the thin edge of the wedge
for another?
--
"Dreams come true, not free."


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:21 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
UnixAdminTalk.com

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 <