vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I must be missing something important, because I am just not seeing why this query is slower on a 4 processor 8 gig machine running redhat AS4. The SQL: explain analyze SELECT a.clientnum, a.associateid, a.associatenum, a.lastname, a.firstname, jt.value AS jobtitle, l.name AS "location", l.locationid AS mainlocationid, l.divisionid, l.regionid, l.districtid, (a.lastname::text || ', '::text) || a.firstname::text AS assocname, a.isactive, a.isdeleted FROM tblassociate a left JOIN tbljobtitle jt ON a.jobtitleid = jt.id AND jt.clientnum::text = a.clientnum::text AND 1 = jt.presentationid JOIN tbllocation l ON a.locationid = l.locationid AND l.clientnum::text = a.clientnum::text where a.clientnum = 'SAKS'; Machine 1 my desktop: "Merge Join (cost=74970.51..75975.46 rows=8244 width=113) (actual time=5141.000..6363.000 rows=160593 loops=1)" " Merge Cond: ("outer".locationid = "inner".locationid)" " -> Sort (cost=656.22..657.11 rows=354 width=49) (actual time=16.000..16.000 rows=441 loops=1)" " Sort Key: l.locationid" " -> Index Scan using ix_location on tbllocation l (cost=0.00..641.23 rows=354 width=49) (actual time=0.000..0.000 rows=441 loops=1)" " Index Cond: ('SAKS'::text = (clientnum)::text)" " -> Sort (cost=74314.29..74791.06 rows=190710 width=75) (actual time=5125.000..5316.000 rows=160594 loops=1)" " Sort Key: a.locationid" " -> Merge Right Join (cost=0.00..52366.50 rows=190710 width=75) (actual time=16.000..1973.000 rows=177041 loops=1)" " Merge Cond: ((("outer".clientnum)::text = ("inner".clientnum)::text) AND ("outer".id = "inner".jobtitleid))" " -> Index Scan using ix_tbljobtitle_id on tbljobtitle jt (cost=0.00..244.75 rows=6622 width=37) (actual time=0.000..16.000 rows=5690 loops=1)" " Filter: (1 = presentationid)" " -> Index Scan using ix_tblassoc_jobtitleid on tblassociate a (cost=0.00..50523.83 rows=190710 width=53) (actual time=0.000..643.000 rows=177041 loops=1)" " Index Cond: ((clientnum)::text = 'SAKS'::text)" "Total runtime: 6719.000 ms" Test Linux machine: "Merge Join (cost=48126.04..49173.57 rows=15409 width=113) (actual time=11832.165..12678.025 rows=160593 loops=1)" " Merge Cond: ("outer".locationid = "inner".locationid)" " -> Sort (cost=807.64..808.75 rows=443 width=49) (actual time=2.418..2.692 rows=441 loops=1)" " Sort Key: l.locationid" " -> Index Scan using ix_location on tbllocation l (cost=0.00..788.17 rows=443 width=49) (actual time=0.036..1.677 rows=441 loops=1)" " Index Cond: ('SAKS'::text = (clientnum)::text)" " -> Sort (cost=47318.40..47758.44 rows=176015 width=75) (actual time=11829.660..12002.746 rows=160594 loops=1)" " Sort Key: a.locationid" " -> Merge Right Join (cost=24825.80..27512.71 rows=176015 width=75) (actual time=8743.848..9750.775 rows=177041 loops=1)" " Merge Cond: ((("outer".clientnum)::text = "inner"."?column10?") AND ("outer".id = "inner".jobtitleid))" " -> Index Scan using ix_tbljobtitle_id on tbljobtitle jt (cost=0.00..239.76 rows=6604 width=37) (actual time=0.016..11.323 rows=5690 loops=1)" " Filter: (1 = presentationid)" " -> Sort (cost=24825.80..25265.84 rows=176015 width=53) (actual time=8729.320..8945.292 rows=177041 loops=1)" " Sort Key: (a.clientnum)::text, a.jobtitleid" " -> Index Scan using ix_associate_clientnum on tblassociate a (cost=0.00..9490.20 rows=176015 width=53) (actual time=0.036..1071.867 rows=177041 loops=1)" " Index Cond: ((clientnum)::text = 'SAKS'::text)" "Total runtime: 12802.019 ms" I tried to remove the left outer thinking it would speed it up, and it used a seq search on tblassoc and ran 2 times slower. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 jfradkin@wazagua.com www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology & advanced tools. C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| Joel Fradkin wrote: > I must be missing something important, because I am just not seeing why this > query is slower on a 4 processor 8 gig machine running redhat AS4. Well, the 4 processors aren't going to help with a single query. However, assuming the configurations for both machines are comparable, you shouldn't be seeing a doubling in query-time. I have, however, spotted something very strange towards the bottom of each explain: > Machine 1 my desktop: > " -> Merge Right Join (cost=0.00..52366.50 rows=190710 width=75) > (actual time=16.000..1973.000 rows=177041 loops=1)" > " Merge Cond: ((("outer".clientnum)::text = > ("inner".clientnum)::text) AND ("outer".id = "inner".jobtitleid))" > Test Linux machine: > " -> Merge Right Join (cost=24825.80..27512.71 rows=176015 > width=75) (actual time=8743.848..9750.775 rows=177041 loops=1)" > " Merge Cond: ((("outer".clientnum)::text = > "inner"."?column10?") AND ("outer".id = "inner".jobtitleid))" In the first, we match outer.clientnum to inner.clientnum, in the second it's "?column10?" - are you sure the query was identical in each case. I'm guessing the unidentified column in query 2 is the reason for the sort a couple of lines below it, which seems to take up a large chunk of time. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| are you sure the query was identical in each case. I just ran a second time same results ensuring that the query is the same. Not sure why it is doing a column10 thing. Any ideas what to look for? Both data bases are a restore from the same backup file. One is running redhat the other XP, I believe both are the same version of postgres except for the different platform (8.0.1 I am pretty sure). I just spent the morning with Dell hoping for some explanation from them. They said I had to have the database on the same type of OS and hardware for them to think the issue was hardware. They are escalating to the software group. I did a default Redhat install so it very well may be an issue with my lack of knowledge on Linux. He did mention by default the Perc4 do cache, so I may need to visit the data center to set the percs to not cache. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| Richard Huxton <dev@archonet.com> writes: > In the first, we match outer.clientnum to inner.clientnum, in the second > it's "?column10?" - are you sure the query was identical in each case. > I'm guessing the unidentified column in query 2 is the reason for the > sort a couple of lines below it, which seems to take up a large chunk of > time. The "?column10?" is because EXPLAIN isn't excessively bright about reporting references to outputs of lower plan nodes. (Gotta fix that sometime.) The real point here is that the planner thought that a scan plus sort would be faster than scanning an index that exactly matched the sort order the Merge Join needed ... and it was wrong :-( So this is just the usual sort of question of "are your stats up to date, maybe you need to increase stats targets, or else play with random_page_cost, etc" ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings |
| |||
| I have done a vacuum and a vacuum analyze. I can try again for kicks, but it is not in production so no new records are added and vacuum analyze is ran after any mods to the indexes. I am still pursuing Dell on why the monster box is so much slower then the desktop as well. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 jfradkin@wazagua.com www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology & advanced tools. C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Thursday, April 14, 2005 1:47 AM To: Richard Huxton Cc: Joel Fradkin; PostgreSQL Perform Subject: Re: [PERFORM] speed of querry? Richard Huxton <dev@archonet.com> writes: > In the first, we match outer.clientnum to inner.clientnum, in the second > it's "?column10?" - are you sure the query was identical in each case. > I'm guessing the unidentified column in query 2 is the reason for the > sort a couple of lines below it, which seems to take up a large chunk of > time. The "?column10?" is because EXPLAIN isn't excessively bright about reporting references to outputs of lower plan nodes. (Gotta fix that sometime.) The real point here is that the planner thought that a scan plus sort would be faster than scanning an index that exactly matched the sort order the Merge Join needed ... and it was wrong :-( So this is just the usual sort of question of "are your stats up to date, maybe you need to increase stats targets, or else play with random_page_cost, etc" ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| On 4/14/05, Joel Fradkin <jfradkin@wazagua.com> wrote: > I have done a vacuum and a vacuum analyze. > I can try again for kicks, but it is not in production so no new records are > added and vacuum analyze is ran after any mods to the indexes. > > I am still pursuing Dell on why the monster box is so much slower then the > desktop as well. First thing: Do something like: ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50; make it for each column used, make it even higher than 50 for many-values columns. THEN make VACUUM ANALYZE; Then do a query couple of times (EXPLAIN ANALYZE also SET enable_seqscan = off; and rerun the query -- if it was significantly faster, you will want to do: SET enable_seqscan = on; and tweak: SET random_page_cost = 2.1; ....and play with values. When you reach the random_page_cost which suits your data, you will want to put it into postgresql.conf I am sorry if it is already known to you. approach to tuning PostgreSQL but it is worth doing. Especially the statistics part. Regards, Dawid ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| Josh from commandprompt.com had me alter the config to have default_statistics_target = 250 Is this somehow related to what your asking me to do? I did do an analyze, but have only ran the viw a few times. Joel Fradkin -----Original Message----- From: Dawid Kuroczko [mailto:qnex42@gmail.com] Sent: Thursday, April 14, 2005 9:21 AM To: Joel Fradkin Cc: PostgreSQL Perform Subject: Re: [PERFORM] speed of querry? On 4/14/05, Joel Fradkin <jfradkin@wazagua.com> wrote: > I have done a vacuum and a vacuum analyze. > I can try again for kicks, but it is not in production so no new records are > added and vacuum analyze is ran after any mods to the indexes. > > I am still pursuing Dell on why the monster box is so much slower then the > desktop as well. First thing: Do something like: ALTER TABLE tbljobtitle ALTER COLUMN clientnum SET STATISTICS 50; make it for each column used, make it even higher than 50 for many-values columns. THEN make VACUUM ANALYZE; Then do a query couple of times (EXPLAIN ANALYZE also SET enable_seqscan = off; and rerun the query -- if it was significantly faster, you will want to do: SET enable_seqscan = on; and tweak: SET random_page_cost = 2.1; ....and play with values. When you reach the random_page_cost which suits your data, you will want to put it into postgresql.conf I am sorry if it is already known to you. simplistic approach to tuning PostgreSQL but it is worth doing. Especially the statistics part. Regards, Dawid ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
| |||
| On 4/14/05, Joel Fradkin <jfradkin@wazagua.com> wrote: > Josh from commandprompt.com had me alter the config to have > default_statistics_target = 250 > > Is this somehow related to what your asking me to do? > I did do an analyze, but have only ran the viw a few times. well, he did suggest the right thing. However this parameter applies to newly created tables, so either recreate the tables or do the ALTER TABLE I've sent eariler. Basically it tells postgres how many values should it keep for statistics per column. The config default_statistics_target is the default (= used when creating table) and ALTER... is a way to change it later. The more statistics PostgreSQL has means it can better predict how much data will be returned -- and this directly leads to a choice how to handle the data (order in which tables should be read, whether to use index or not, which algorithm use for join, etc.). The more statistics, the better PostgreSQL is able to predict. The more statistics, the slower planner is able to do the analysis. So you have to find a value which will be as much as is needed to accurately predict the results but not more! PostgreSQL's default of 10 is a bit conservative, hence the suggestions to increase it. found that in their cases it is beneficial to reduce the value, even as much as below 2. ] Hope this clairifies things a bit. Regards, Dawid ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| Dawid Kuroczko <qnex42@gmail.com> writes: > Basically it tells postgres how many values should it keep for > statistics per column. The config default_statistics_target > is the default (= used when creating table) and ALTER... is > a way to change it later. Not quite. default_statistics_target is the value used by ANALYZE for any column that hasn't had an explicit ALTER SET STATISTICS done on it. So you can change default_statistics_target and that will affect existing tables. (It used to work the way you are saying, but that was a few releases back...) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| ||||
| I did as described to alter table and did not see any difference in speed. I am trying to undo the symbolic link to the data array and set it up on raid 5 disks in the machine just to test if there is an issue with the config of the raid 10 array or a problem with the controller. I am kinda lame at Linux so not sure I have got it yet still testing. Still kind puzzled why it chose tow different option, but one is running windows version of postgres, so maybe that has something to do with it. The data bases and configs (as far as page cost) are the same. Joel Fradkin Wazagua, Inc. 2520 Trailmate Dr Sarasota, Florida 34243 Tel. 941-753-7111 ext 305 jfradkin@wazagua.com www.wazagua.com Powered by Wazagua Providing you with the latest Web-based technology & advanced tools. C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Thursday, April 14, 2005 11:21 AM To: Dawid Kuroczko Cc: Joel Fradkin; PERFORM Subject: Re: [PERFORM] speed of querry? Dawid Kuroczko <qnex42@gmail.com> writes: > Basically it tells postgres how many values should it keep for > statistics per column. The config default_statistics_target > is the default (= used when creating table) and ALTER... is > a way to change it later. Not quite. default_statistics_target is the value used by ANALYZE for any column that hasn't had an explicit ALTER SET STATISTICS done on it. So you can change default_statistics_target and that will affect existing tables. (It used to work the way you are saying, but that was a few releases back...) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |