vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I have two tables t1 and t2 containing 10K tuples each. I tried to do a non equi-join on these tables using the following query. select /*+ ORDERED USE_MERGE (t2 t1)*/count(*) from test2 t2, test1 t1 where t2.a > t1.a; A simple implementaion of Sort merge based non equi join would have sorted both relations and iterated through the outer relation returning all the matching tuples. However when I checked the execution plan made by Oracle. this is wot it shows Rows Row Source Operation ------- --------------------------------------------------- 1 SORT AGGREGATE (cr=76 pr=0 pw=0 time=26809424 us) 49995000 MERGE JOIN (cr=76 pr=0 pw=0 time=150014986 us) 10000 SORT JOIN (cr=38 pr=0 pw=0 time=26064 us) 10000 TABLE ACCESS FULL TEST2 (cr=38 pr=0 pw=0 time=10086 us) 49995000 SORT JOIN (cr=38 pr=0 pw=0 time=50031081 us) 10000 TABLE ACCESS FULL TEST1 (cr=38 pr=0 pw=0 time=10055 us) Why does it sort the inner relation multiple times? Is there a way to avoid it? Any help would be greatly appreciated!! best regards, Derek. |
| |||
| dereksmi@gmail.com wrote: > Hi, > > I have two tables t1 and t2 containing 10K tuples each. I tried to do a > non equi-join on these tables using the following query. > > select /*+ ORDERED USE_MERGE (t2 t1)*/count(*) from test2 t2, test1 t1 > where t2.a > t1.a; > > A simple implementaion of Sort merge based non equi join would have > sorted both relations and iterated through the outer relation returning > all the matching tuples. However when I checked the execution plan made > by Oracle. this is wot it shows > > > Rows Row Source Operation > ------- --------------------------------------------------- > 1 SORT AGGREGATE (cr=76 pr=0 pw=0 time=26809424 us) > 49995000 MERGE JOIN (cr=76 pr=0 pw=0 time=150014986 us) > 10000 SORT JOIN (cr=38 pr=0 pw=0 time=26064 us) > 10000 TABLE ACCESS FULL TEST2 (cr=38 pr=0 pw=0 time=10086 us) > 49995000 SORT JOIN (cr=38 pr=0 pw=0 time=50031081 us) > 10000 TABLE ACCESS FULL TEST1 (cr=38 pr=0 pw=0 time=10055 us) > > > Why does it sort the inner relation multiple times? Is there a way to > avoid it? Any help would be greatly appreciated!! > > best regards, > Derek. Why do you post the same message multiple times? Such behaviour does not bode well for a useful response. Post your question ONCE, and in properl English, and wait; should someone be able to provide enlightenment, they will. I've created and loaded two tables, t1 and t2, with sequential data. The query plan I generate for your query is: -------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 26 | 146 | | 1 | SORT AGGREGATE | | 1 | 26 | | | 2 | MERGE JOIN | | 5640K| 139M| 146 | | 3 | SORT JOIN | | 10001 | 126K| 69 | | 4 | TABLE ACCESS FULL| T1 | 10001 | 126K| 2 | |* 5 | SORT JOIN | | 11280 | 143K| 77 | | 6 | TABLE ACCESS FULL| T2 | 11280 | 143K| 2 | -------------------------------------------------------------------- Additinoal information I obtained: Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T2"."A">"T1"."A") filter("T2"."A">"T1"."A") Modifying your query to remove the hint and generating another plan produces: -------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | 1 | SORT AGGREGATE | | | | | | 2 | NESTED LOOPS | | | | | | 3 | TABLE ACCESS FULL | T2 | | | | |* 4 | TABLE ACCESS FULL | T1 | | | | -------------------------------------------------------------------- Your hint, it appears, is causing you grief, as the un-hinted query performs as you'd expect. Notice, too, no costing information is present in this plan. dbms_xplan.display reveals this is a rule-based optimisation. Generating statistics on these tables and running your query again: -------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 6 | 98 | | 1 | SORT AGGREGATE | | 1 | 6 | | | 2 | MERGE JOIN | | 5000K| 28M| 98 | | 3 | SORT JOIN | | 10000 | 30000 | 49 | | 4 | TABLE ACCESS FULL| T1 | 10000 | 30000 | 9 | |* 5 | SORT JOIN | | 10000 | 30000 | 49 | | 6 | TABLE ACCESS FULL| T2 | 10000 | 30000 | 9 | -------------------------------------------------------------------- No real difference except the cost. Generating another plan for the un-hinted query (this time using the CBO instead of the RBO): -------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 6 | 98 | | 1 | SORT AGGREGATE | | 1 | 6 | | | 2 | MERGE JOIN | | 5000K| 28M| 98 | | 3 | SORT JOIN | | 10000 | 30000 | 49 | | 4 | TABLE ACCESS FULL| T1 | 10000 | 30000 | 9 | |* 5 | SORT JOIN | | 10000 | 30000 | 49 | | 6 | TABLE ACCESS FULL| T2 | 10000 | 30000 | 9 | -------------------------------------------------------------------- Now we know it's the CBO creating this plan, as both the hinted and un-hinted queries produce the same execution path. All of this was run on Oracle 9.2.0.6, on Solaris 8. I am not aware of any CBO anomalies in 9.2.0.6, and I haven't found time nor space to install 10g. And, I'm hoping Jonathan Lewis can shed some light on these plans. I do not know the Oracle internals that well. David Fitzjarrell |
| |||
| Hi David, Sincere apologies for duplicate posting. Thanks a lot for helping me out!! However, there is another aspect of the execution trace which I could not understand. I have the trace output of the same query as earlier pasted below. When I checked the "time" values in the "Row source operation" of the execution plan, I find that the time value of the merge phase is much higher than the value in the sort aggregate. Does that mean that the "merge time" is the projected time had the merge operation been materialised instead of just being pipelined to be aggregated? Or am I missing some thing? Hope to hear from you soon. best regards, Nagender. select /*+ ORDERED USE_MERGE (f1 f2)*/count(*) from test2 f2, test1 f1 where f1.a = f2.a call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.04 0.05 0 32 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.04 0.05 0 32 0 1 Misses in library cache during parse: 0 Optimizer mode: ALL_ROWS Parsing user id: 58 Rows Row Source Operation ------- --------------------------------------------------- 1 SORT AGGREGATE (cr=32 pr=0 pw=0 time=51752 us) 36000 MERGE JOIN (cr=32 pr=0 pw=0 time=159758 us) 3000 SORT JOIN (cr=16 pr=0 pw=0 time=15583 us) 3000 TABLE ACCESS FULL TEST2 (cr=16 pr=0 pw=0 time=3179 us) 36000 SORT JOIN (cr=16 pr=0 pw=0 time=43848 us) 3000 TABLE ACCESS FULL TEST1 (cr=16 pr=0 pw=0 time=3068 us) ************************************************** ****************************** |
| |||
| Carlos wrote: > >> Post your question ONCE, and in properl English... > > I think 'properl English' is not so 'proper'. > > Cheers. > > Carlos. My typo is far better than trying to use 'wot' as a word, which was the intent of the admonishment. Apparently you have nothing better to do than nitpick responses. I find NOTHING of any value to the original poster in your post. Or, did I miss something??? David Fitzjarrell |
| |||
| derek wrote: > Hi David, > Sincere apologies for duplicate posting. Thanks a lot for helping me > out!! However, there is another aspect of the execution trace which I > could not understand. I have the trace output of the same query as > earlier pasted below. When I checked the "time" values in the "Row > source operation" of the execution plan, I find that the time value of > the merge phase is much higher than the value in the sort aggregate. > Does that mean that the "merge time" is the projected time had the > merge operation been materialised instead of just being pipelined to be > aggregated? Or am I missing some thing? Hope to hear from you soon. > > best regards, > Nagender. > > > > > select /*+ ORDERED USE_MERGE (f1 f2)*/count(*) > from > test2 f2, test1 f1 where f1.a = f2.a > > > call count cpu elapsed disk query current > rows > ------- ------ -------- ---------- ---------- ---------- ---------- > ---------- > Parse 1 0.00 0.00 0 0 0 > 0 > Execute 1 0.00 0.00 0 0 0 > 0 > Fetch 2 0.04 0.05 0 32 0 > 1 > ------- ------ -------- ---------- ---------- ---------- ---------- > ---------- > total 4 0.04 0.05 0 32 0 > 1 > > Misses in library cache during parse: 0 > Optimizer mode: ALL_ROWS > Parsing user id: 58 > > Rows Row Source Operation > ------- --------------------------------------------------- > 1 SORT AGGREGATE (cr=32 pr=0 pw=0 time=51752 us) > 36000 MERGE JOIN (cr=32 pr=0 pw=0 time=159758 us) > 3000 SORT JOIN (cr=16 pr=0 pw=0 time=15583 us) > 3000 TABLE ACCESS FULL TEST2 (cr=16 pr=0 pw=0 time=3179 us) > 36000 SORT JOIN (cr=16 pr=0 pw=0 time=43848 us) > 3000 TABLE ACCESS FULL TEST1 (cr=16 pr=0 pw=0 time=3068 us) > > ************************************************** ****************************** This is tkprof output, generated from a session trace, where the query was actually executed and a result set returned, so no 'projected' values are present. All times reported, afaik, are actual elapsed times. Note, however, the time in the explain plan portion is in microseconds, and the times reported by tkprof in the summary section are in 1/100ths of a second. The MERGE JOIN consumed 159758 microseconds; this converts to roughly 160 milliseconds or 0.16 seconds (roundng up). The CPU time for this same query was only 0.04 seconds, so it would appear you were waiting for a resource for the merge join. I can't say for certain as I don't have the trace file to examine, but I wonder what value you've set for your sort_area_size. And I did notice this is not the same query you originally posted, as you're now using an equi-join. Jonathan Lewis and Tom Kyte cover reading and interpreting this output in their texts. Also, since you're concerned with what the optimiser is actually doing, you might want to set event 10053 at level 1 for a dump of the optimiser calculations/operations for this same query. Between the two trace files you should be able to interpret what Oracle is doing with your query, and, hopefully, why it's choosing the operations it's reporting. David Fitzjarrell |
| ||||
| The system hasn't sorted multiple times. Your assumption about how the mechanism ought to work is correct, and does describe what happens. You can check this by looking at you session stats, or doing an autotrace. You will find that you have done 2 sorts, and sorted 20,000 rows. For your example, Oracle sorts both inputs, and writes the second output to 'TEMP'. Then for each row in the first output it will find a start and end point in the second output and join. The 49995000 beside the second sort line: > 49995000 SORT JOIN (cr=38 pr=0 pw=0 time=50031081 us) is the number of rows supplied as a result of that sorted row source - but Oracle has only sorted it once, than kept 're-supplying' the required results. Ideally there ought to be a line which is the parent of the sort operation that is an 'ordered selection output' line (or something similar) so that you could see that the sort generated 10,000 rows, and then it's output was scanned in order many times. -- Regards Jonathan Lewis http://www.jlcomp.demon.co.uk/faq/ind_faq.html The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/seminar.html Public Appearances - schedule updated April 5th 2005 <dereksmi@gmail.com> wrote in message news:1117586398.259896.157200@g14g2000cwa.googlegr oups.com... > > > Hi, > > I have two tables t1 and t2 containing 10K tuples each. I tried to do a > non equi-join on these tables using the following query. > > select /*+ ORDERED USE_MERGE (t2 t1)*/count(*) from test2 t2, test1 t1 > where t2.a > t1.a; > > A simple implementaion of Sort merge based non equi join would have > sorted both relations and iterated through the outer relation returning > all the matching tuples. However when I checked the execution plan made > by Oracle. this is wot it shows > > > Rows Row Source Operation > ------- --------------------------------------------------- > 1 SORT AGGREGATE (cr=76 pr=0 pw=0 time=26809424 us) > 49995000 MERGE JOIN (cr=76 pr=0 pw=0 time=150014986 us) > 10000 SORT JOIN (cr=38 pr=0 pw=0 time=26064 us) > 10000 TABLE ACCESS FULL TEST2 (cr=38 pr=0 pw=0 time=10086 us) > 49995000 SORT JOIN (cr=38 pr=0 pw=0 time=50031081 us) > 10000 TABLE ACCESS FULL TEST1 (cr=38 pr=0 pw=0 time=10055 us) > > > Why does it sort the inner relation multiple times? Is there a way to > avoid it? Any help would be greatly appreciated!! > > best regards, > Derek. > |