This is a discussion on Avoiding Table Scan on Update with correlated subselect within the DB2 forums, part of the Database Server Software category; --> Query: update table1 t1 set end_time = ( select end_time from table2 t2 where t2.key1 = t1.key1 and t2.key2 ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Query: update table1 t1 set end_time = ( select end_time from table2 t2 where t2.key1 = t1.key1 and t2.key2 = t1.key2 ) where exists ( select 1 from table2 t2 where t2.key1 = t1.key1 and t2.key2 = t2.key2 ) T1 and T2 share the same primary key. T2 is a volatile table that has zero or hundreds of rows in it. T1 is large table with millions of rows. DB2 explain facility shows table scan on large table. I need this to be fast to avoid concurrency/locking issues. I'm using the full primary key in the first correlated subselect, and in the "exists" clause, and am confused why the optimizer would choose to do a table scan. Is there a better way to write this to avoid table scans? Platform is 7.2 Fixpack 11. Thanks, Mike |
| ||||
| Greatest concurrency occurs with minimal locking. Locking is minimized when only a single row at a time is locked. Since you are updating only "a few hundred rows" and T2 is "a volatile table" I'd assume that you will be deleting the contents of T2 when finished and that T2 can't be updated during your processing. You don't state where you are interested on minimizing locking - T1 or T2. A stored procedure with the following logic will do this. 1. Construct a parameterized statement to update T1. 2. Construct a cursor to read T2 - WITH HOLD clause will be needed. 3. Lock T2. 4. For each row read from T2 5. Update T1 6. Commit T1 5. Delete rows from T2. The updates will be done using the primary key which should lock a single row of T1. Commits will release the locks immediately after the update, minimizing T1's locked time. The lock on T2 will prevent updates during your use of the table. Taking a commit after each update of T1 is not the best performer but does maximize concurrency on T1. A compromise of locking and performance is to maintain a rows updated counter and commit after every n updates. Phil Sherman Mike L. Bell wrote: > Query: > > update table1 t1 > set end_time = ( > select end_time > from table2 t2 > where t2.key1 = t1.key1 > and t2.key2 = t1.key2 > ) > where exists > ( > select 1 > from table2 t2 > where t2.key1 = t1.key1 > and t2.key2 = t2.key2 > ) > > > T1 and T2 share the same primary key. T2 is a volatile table that has > zero or hundreds of rows in it. T1 is large table with millions of > rows. > > DB2 explain facility shows table scan on large table. I need this to > be fast to avoid concurrency/locking issues. I'm using the full > primary key in the first correlated subselect, and in the "exists" > clause, and am confused why the optimizer would choose to do a table > scan. > > Is there a better way to write this to avoid table scans? > > Platform is 7.2 Fixpack 11. > > Thanks, > Mike |