This is a discussion on Better index access = worse performance?? within the DB2 forums, part of the Database Server Software category; --> Helpful folks, Most of my previous experience with DB2 was on s390 mainframe systems and the optimizer on this ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Helpful folks, Most of my previous experience with DB2 was on s390 mainframe systems and the optimizer on this platform always seemed very predictable and consistent. Since moving to a WinNT/UDB 7.2 environment, the choices the optimizer makes often seem flaky. But this last example really floored me. I was hoping someone could explain why I get worse response time when the optimizer uses two indexes, than when it uses one. Some context: Orders table: 5 million+ rows This table contains CLIENTID - integer column identifying the the client placing the order SEARCHCODE - two-character column defining a flavor of order CLIENTID has it's own dedicated index, ORDERS02, and SEARCHCODE is the high-order column of a three-column index, ORDERS13.(And yes, there are 13 indexes on this %#$@!! table). The cardinality of SEARCHCODE is extremely low, with roughly 10 distinct values spread across 5 mil rows. The cardinality of CLIENTID is relatively high (and very high in relation to SEARCHCODE), with thousands of distinct values. When I EXPLAIN this SQL statement: Select Distinct DATA2 from ORDERS where SEARCHCODE = '04' and CLIENTID = 277475 With UR I'm told that both indexes are used to honor the query, and the timeron resource number is very low. There are actually 7 rows that match the criteria in the WHERE clause, and two are returned after the DISTINCT is employed. However, clock response time is over 3 minutes, which is unacceptable. When I modify the query to avoid using the index on SEARCHCODE, using a technique described by Bonnie Baker, the EXPLAIN does indeed show only the CLIENTID index being used, and response time is under 10 seconds: Select Distinct DATA2 from ORDERS where SEARCHCODE = '04' || '' and CLIENTID = 277475 With UR Why should this be? If the optimizer can determine that two indexes would be useful, why can't it use them efficiently? Surely it has all of the distribution statistics available to see the cardinality of the SEARCHCODE index? And yes, runstats was run 4 days previously, and then again just before retesting, using this command: RUNSTATS ON TABLE DB2ADMIN.ORDERS WITH DISTRIBUTION AND DETAILED INDEXES ALL SHRLEVEL CHANGE I've included the output of both explains below. Can anyone shed some light on this? Would a different index arrangement improve this situation such that the optimizer would not make such poor choices? Any feedback would be appreciated. Sean Package Name = DB2ADMIN.DYNEXPLN Prep Date = 2004/02/06 Prep Time = 10:59:53 Bind Timestamp = 2004-02-06-11.09.04.781000 Isolation Level = Cursor Stability Blocking = Block Unambiguous Cursors Query Optimization Class = 5 Partition Parallel = No Intra-Partition Parallel = No Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" -------------------- SECTION --------------------------------------- Section = 1 SQL Statement: Select Distinct DATA2 from ORDERS where SEARCHCODE = '04' and CLIENTID = 277475 With UR Estimated Cost = 254 Estimated Cardinality = 3 ( 5) Index ANDing | Optimizer Estimate of Set Size: 3 | Index ANDing Bitmap Build | | Optimizer Estimate of Set Size: 3 ( 6) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 | | | #Columns = 1 | | | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 | | | | Index Columns: | | | | | 1: CLIENTID (Ascending) | | | | #Key Columns = 1 | | | | | Start Key: Inclusive Value | | | | | | 1: 277475 | | | | | Stop Key: Inclusive Value | | | | | | 1: 277475 | | | | Index-Only Access | | | | Index Prefetch: None | | | Isolation Level: Uncommitted Read | | | Lock Intents | | | | Table: Intent None | | | | Row : None | Index ANDing Bitmap Probe | | Optimizer Estimate of Set Size: 3 ( 7) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 | | | #Columns = 1 | | | Index Scan: Name = DB2ADMIN.ORDERS13 ID = 14 | | | | Index Columns: | | | | | 1: SEARCHCODE (Ascending) | | | | | 2: STATUS (Ascending) | | | | | 3: TNUMBER (Ascending) | | | | #Key Columns = 1 | | | | | Start Key: Inclusive Value | | | | | | 1: '04' | | | | | Stop Key: Inclusive Value | | | | | | 1: '04' | | | | Index-Only Access | | | | Index Prefetch: Eligible 59 | | | Isolation Level: Uncommitted Read | | | Lock Intents | | | | Table: Intent None | | | | Row : None ( 4) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 | #Columns = 3 | Fetch Direct | Isolation Level: Uncommitted Read | Lock Intents | | Table: Intent None | | Row : None | Residual Predicate(s) | | #Predicates = 2 ( 3) Insert Into Sorted Temp Table ID = t1 | #Columns = 1 | #Sort Key Columns = 1 | | Key 1: DATA2 (Ascending) | Sortheap Allocation Parameters: | | #Rows = 4 | | Row Width = 16 | Piped | Duplicate Elimination ( 2) Access Temp Table ID = t1 | #Columns = 1 | Relation Scan | | Prefetch: Eligible ( 2) | Return Data to Application | | #Columns = 1 ( 1) Return Data Completion End of section Optimizer Plan: RETURN ( 1) | TBSCAN ( 2) | SORT ( 3) | FETCH ( 4) /--/ \ IXAND Table: ( 5) DB2ADMIN /--/ \--\ ORDERS IXSCAN IXSCAN ( 6) ( 7) / \ / \ Index: Table: Index: Table: DB2ADMIN DB2ADMIN DB2ADMIN DB2ADMIN ORDERS02 ORDERS ORDERS13 ORDERS DB2 Universal Database Version 7.2, 5622-044 (c) Copyright IBM Corp. 1991, 2001 Licensed Material - Program Property of IBM IBM DATABASE 2 SQL Explain Tool ******************** PACKAGE *************************************** Package Name = DB2ADMIN.DYNEXPLN Prep Date = 2004/02/06 Prep Time = 11:04:05 Bind Timestamp = 2004-02-06-11.13.16.859000 Isolation Level = Cursor Stability Blocking = Block Unambiguous Cursors Query Optimization Class = 5 Partition Parallel = No Intra-Partition Parallel = No Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" -------------------- SECTION --------------------------------------- Section = 1 SQL Statement: Select Distinct DATA2 from ORDERS where SEARCHCODE = '04' || '' and CLIENTID = 277475 With UR Estimated Cost = 274 Estimated Cardinality = 6 ( 7) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 | #Columns = 1 | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 | | Index Columns: | | | 1: CLIENTID (Ascending) | | #Key Columns = 1 | | | Start Key: Inclusive Value | | | | 1: 277475 | | | Stop Key: Inclusive Value | | | | 1: 277475 | | Index-Only Access | | Index Prefetch: None ( 7) | | | Insert Into Sorted Temp Table ID = t1 | | | | #Columns = 1 | | | | #Sort Key Columns = 1 | | | | | Key 1: (Ascending) | | | | Sortheap Allocation Parameters: | | | | | #Rows = 410 | | | | | Row Width = 12 | | | | Piped | Isolation Level: Uncommitted Read | Lock Intents | | Table: Intent None | | Row : None ( 6) Sorted Temp Table Completion ID = t1 ( 5) List Prefetch RID Preparation ( 3) Insert Into Sorted Temp Table ID = t2 | #Columns = 1 | #Sort Key Columns = 1 | | Key 1: DATA2 (Ascending) | Sortheap Allocation Parameters: | | #Rows = 6 | | Row Width = 16 | Piped | Duplicate Elimination ( 2) Access Temp Table ID = t2 | #Columns = 1 | Relation Scan | | Prefetch: Eligible ( 2) | Return Data to Application | | #Columns = 1 ( 1) Return Data Completion End of section Optimizer Plan: RETURN ( 1) | TBSCAN ( 2) | SORT ( 3) | RIDSCN ( 5) | SORT ( 6) | IXSCAN ( 7) / \ Index: Table: DB2ADMIN DB2ADMIN ORDERS02 ORDERS |
| |||
| "Sean C." <dba_fla@yahoo.com> wrote in message news:2f9c2d92.0402061427.6b094825@posting.google.c om... > Helpful folks, > > Most of my previous experience with DB2 was on s390 mainframe systems > and the optimizer on this platform always seemed very predictable and > consistent. Since moving to a WinNT/UDB 7.2 environment, the choices > the optimizer makes often seem flaky. But this last example really > floored me. I was hoping someone could explain why I get worse > response time when the optimizer uses two indexes, than when it uses > one. Some context: > > Orders table: 5 million+ rows > This table contains > CLIENTID - integer column identifying the the client placing the > order > SEARCHCODE - two-character column defining a flavor of order > CLIENTID has it's own dedicated index, ORDERS02, and SEARCHCODE is the > high-order column of a three-column index, ORDERS13.(And yes, there > are 13 indexes on this %#$@!! table). > The cardinality of SEARCHCODE is extremely low, with roughly 10 > distinct values spread across 5 mil rows. > The cardinality of CLIENTID is relatively high (and very high in > relation to SEARCHCODE), with thousands of distinct values. > > When I EXPLAIN this SQL statement: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' and CLIENTID = 277475 > With UR > > I'm told that both indexes are used to honor the query, and the > timeron resource number is very low. There are actually 7 rows that > match the criteria in the WHERE clause, and two are returned after the > DISTINCT is employed. However, clock response time is over 3 minutes, > which is unacceptable. > > > When I modify the query to avoid using the index on SEARCHCODE, using > a technique described by Bonnie Baker, the EXPLAIN does indeed show > only the CLIENTID index being used, and response time is under 10 > seconds: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' || '' and CLIENTID = 277475 > With UR > > > Why should this be? If the optimizer can determine that two indexes > would be useful, why can't it use them efficiently? Surely it has all > of the distribution statistics available to see the cardinality of the > SEARCHCODE index? And yes, runstats was run 4 days previously, and > then again just before retesting, using this command: > RUNSTATS ON TABLE DB2ADMIN.ORDERS WITH DISTRIBUTION AND DETAILED > INDEXES ALL SHRLEVEL CHANGE > > I've included the output of both explains below. > Can anyone shed some light on this? Would a different index > arrangement improve this situation such that the optimizer would not > make such poor choices? > > Any feedback would be appreciated. > > Sean > When you execute the runstats utility it would be preferable to use the following options: RUNSTATS ON TABLE table_name WITH DISTRIBUTION ON ALL COLUMNS AND DETAILED INDEXES ALL This will tell DB2 about the low cardinality of the SEARCHCODE. For some insane reason (I think IBM is sometimes suicidal), none of the examples in the Command Reference shows using all the above options. It should be the default. This will tell DB2 about the low cardinality of the SEARCHCODE. Another thing you could try is change the query optimization level. The default is usually set to 5, but sometimes 7 does better. |
| |||
| To remove the index anding, you have to lower the opt level... You may see sort problems (sheapthres, sortheap) if it's trying index anding on 500,000 rows/searchcode. (5M/10) There is obviously something 'wrong' with the stats. SEARCHCODE doesn't have a good filter factor. We know but the optimizer doesn't seem to know. gui explain should provide more info about the perceived cost of each operation. In explain no 2, there is a sort allocation of 410 rows. (for an equality predicate) 5M/400 +- = 12500 Do you really have +- 400 records per CLIENTID? Do you have roughly 12500 CLIENTID's? (possible from what you said) When something weird happens, see the sysstat.* tables and try to find what obvious things the optimizer may not see. Mark A did the research on how to fix the sysstat data so i won't redo it. PM |
| |||
| In article <uF0Vb.42044$9U5.1579668@news20.bellglobal.com>, ("PM \(pm3iinc-nospam\)" <PM(pm3iinc-nospam)@sympatico.ca>) says... > To remove the index anding, you have to lower the opt level... > > > You may see sort problems (sheapthres, sortheap) if it's trying index anding > on 500,000 rows/searchcode. (5M/10) > There is obviously something 'wrong' with the stats. > SEARCHCODE doesn't have a good filter factor. We know but the optimizer > doesn't seem to know. > gui explain should provide more info about the perceived cost of each > operation. > > In explain no 2, there is a sort allocation of 410 rows. (for an equality > predicate) > 5M/400 +- = 12500 Do you really have +- 400 records per CLIENTID? > Do you have roughly 12500 CLIENTID's? (possible from what you said) > > When something weird happens, see the sysstat.* tables and try to find > what obvious things the optimizer may not see. > > > Mark A did the research on how to fix the sysstat data so i won't redo it. > > PM > > > Another trick that changes the search path is to add the 'optimize for x rows' option. If the queries are executed from a client it might help to specify the number of rows which fits in one requester block (DBM CFG --> RQRIOBLK), although I'm not sure if it still makes any difference. Most likely it depends on the network speed. |
| |||
| dba_fla@yahoo.com (Sean C.) wrote in message news:<2f9c2d92.0402061427.6b094825@posting.google. com>... > I've included the output of both explains below. > Can anyone shed some light on this? Would a different index > arrangement improve this situation such that the optimizer would not > make such poor choices? db2exfmt of the problem query would give much more detail, including cardinality estimates at each step, which might help us see better what DB2 thinks it's doing. DG |
| |||
| "AK" <ak_tiredofspam@yahoo.com> wrote in message news:46e627da.0402070629.76367a6d@posting.google.c om... > replacing an index on (CLIENTID) with > an index on (CLIENTID, SEARCHCODE) might help, and yes, this is an > expensive solution > > P.S. Oracle's bitmap indexes might be very useful in this situation DB2 will sometimes use bitmaps (automatically) when multiple indexes are used. In this situation, better statistics are the answer. |
| |||
| Very good observation ... I don't know the exact cardinality of CLIENTID. This is one of the most importants thing the optimizer will determine to use INDEX (ONLY) SCAN or use INDEX ANDing for this case. If I were you, I would - Apply the latest FixPak for UDB Version 7.2. - Runstats again (full collect, not updatable collect) - Capture the access plan and compare them. (Please don't change dbm/db cfg, optimization level ...) - Run both queries ... - If it is still the same problem, please CONTACT IBM Support ... "Sean C." <dba_fla@yahoo.com> wrote in message news:2f9c2d92.0402061427.6b094825@posting.google.c om... > Helpful folks, > > Most of my previous experience with DB2 was on s390 mainframe systems > and the optimizer on this platform always seemed very predictable and > consistent. Since moving to a WinNT/UDB 7.2 environment, the choices > the optimizer makes often seem flaky. But this last example really > floored me. I was hoping someone could explain why I get worse > response time when the optimizer uses two indexes, than when it uses > one. Some context:. > > Orders table: 5 million+ rows > This table contains > CLIENTID - integer column identifying the the client placing the > order > SEARCHCODE - two-character column defining a flavor of order > CLIENTID has it's own dedicated index, ORDERS02, and SEARCHCODE is the > high-order column of a three-column index, ORDERS13.(And yes, there > are 13 indexes on this %#$@!! table). > The cardinality of SEARCHCODE is extremely low, with roughly 10 > distinct values spread across 5 mil rows. > The cardinality of CLIENTID is relatively high (and very high in > relation to SEARCHCODE), with thousands of distinct values. > > When I EXPLAIN this SQL statement: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' and CLIENTID = 277475 > With UR > > I'm told that both indexes are used to honor the query, and the > timeron resource number is very low. There are actually 7 rows that > match the criteria in the WHERE clause, and two are returned after the > DISTINCT is employed. However, clock response time is over 3 minutes, > which is unacceptable. > > > When I modify the query to avoid using the index on SEARCHCODE, using > a technique described by Bonnie Baker, the EXPLAIN does indeed show > only the CLIENTID index being used, and response time is under 10 > seconds: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' || '' and CLIENTID = 277475 > With UR > > > Why should this be? If the optimizer can determine that two indexes > would be useful, why can't it use them efficiently? Surely it has all > of the distribution statistics available to see the cardinality of the > SEARCHCODE index? And yes, runstats was run 4 days previously, and > then again just before retesting, using this command: > RUNSTATS ON TABLE DB2ADMIN.ORDERS WITH DISTRIBUTION AND DETAILED > INDEXES ALL SHRLEVEL CHANGE > > I've included the output of both explains below. > Can anyone shed some light on this? Would a different index > arrangement improve this situation such that the optimizer would not > make such poor choices? > > Any feedback would be appreciated. > > Sean > > > > > > Package Name = DB2ADMIN.DYNEXPLN > Prep Date = 2004/02/06 > Prep Time = 10:59:53 > > Bind Timestamp = 2004-02-06-11.09.04.781000 > > Isolation Level = Cursor Stability > Blocking = Block Unambiguous Cursors > Query Optimization Class = 5 > > Partition Parallel = No > Intra-Partition Parallel = No > > Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" > > -------------------- SECTION --------------------------------------- > Section = 1 > > > SQL Statement: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' and CLIENTID = 277475 > With UR > > > Estimated Cost = 254 > Estimated Cardinality = 3 > > ( 5) Index ANDing > | Optimizer Estimate of Set Size: 3 > | Index ANDing Bitmap Build > | | Optimizer Estimate of Set Size: 3 > ( 6) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > | | | #Columns = 1 > | | | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 > | | | | Index Columns: > | | | | | 1: CLIENTID (Ascending) > | | | | #Key Columns = 1 > | | | | | Start Key: Inclusive Value > | | | | | | 1: 277475 > | | | | | Stop Key: Inclusive Value > | | | | | | 1: 277475 > | | | | Index-Only Access > | | | | Index Prefetch: None > | | | Isolation Level: Uncommitted Read > | | | Lock Intents > | | | | Table: Intent None > | | | | Row : None > | Index ANDing Bitmap Probe > | | Optimizer Estimate of Set Size: 3 > ( 7) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > | | | #Columns = 1 > | | | Index Scan: Name = DB2ADMIN.ORDERS13 ID = 14 > | | | | Index Columns: > | | | | | 1: SEARCHCODE (Ascending) > | | | | | 2: STATUS (Ascending) > | | | | | 3: TNUMBER (Ascending) > | | | | #Key Columns = 1 > | | | | | Start Key: Inclusive Value > | | | | | | 1: '04' > | | | | | Stop Key: Inclusive Value > | | | | | | 1: '04' > | | | | Index-Only Access > | | | | Index Prefetch: Eligible 59 > | | | Isolation Level: Uncommitted Read > | | | Lock Intents > | | | | Table: Intent None > | | | | Row : None > ( 4) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > | #Columns = 3 > | Fetch Direct > | Isolation Level: Uncommitted Read > | Lock Intents > | | Table: Intent None > | | Row : None > | Residual Predicate(s) > | | #Predicates = 2 > ( 3) Insert Into Sorted Temp Table ID = t1 > | #Columns = 1 > | #Sort Key Columns = 1 > | | Key 1: DATA2 (Ascending) > | Sortheap Allocation Parameters: > | | #Rows = 4 > | | Row Width = 16 > | Piped > | Duplicate Elimination > ( 2) Access Temp Table ID = t1 > | #Columns = 1 > | Relation Scan > | | Prefetch: Eligible > ( 2) | Return Data to Application > | | #Columns = 1 > ( 1) Return Data Completion > > End of section > > > Optimizer Plan: > > RETURN > ( 1) > | > TBSCAN > ( 2) > | > SORT > ( 3) > | > FETCH > ( 4) > /--/ \ > IXAND Table: > ( 5) DB2ADMIN > /--/ \--\ ORDERS > IXSCAN IXSCAN > ( 6) ( 7) > / \ / \ > Index: Table: Index: Table: > DB2ADMIN DB2ADMIN DB2ADMIN DB2ADMIN > ORDERS02 ORDERS ORDERS13 ORDERS > > > > > DB2 Universal Database Version 7.2, 5622-044 (c) Copyright IBM Corp. > 1991, 2001 > Licensed Material - Program Property of IBM > IBM DATABASE 2 SQL Explain Tool > > ******************** PACKAGE *************************************** > > Package Name = DB2ADMIN.DYNEXPLN > Prep Date = 2004/02/06 > Prep Time = 11:04:05 > > Bind Timestamp = 2004-02-06-11.13.16.859000 > > Isolation Level = Cursor Stability > Blocking = Block Unambiguous Cursors > Query Optimization Class = 5 > > Partition Parallel = No > Intra-Partition Parallel = No > > Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" > > -------------------- SECTION --------------------------------------- > Section = 1 > > > SQL Statement: > > Select Distinct DATA2 > from ORDERS > where SEARCHCODE = '04' || '' and CLIENTID = 277475 > With UR > > > Estimated Cost = 274 > Estimated Cardinality = 6 > > ( 7) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > | #Columns = 1 > | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 > | | Index Columns: > | | | 1: CLIENTID (Ascending) > | | #Key Columns = 1 > | | | Start Key: Inclusive Value > | | | | 1: 277475 > | | | Stop Key: Inclusive Value > | | | | 1: 277475 > | | Index-Only Access > | | Index Prefetch: None > ( 7) | | | Insert Into Sorted Temp Table ID = t1 > | | | | #Columns = 1 > | | | | #Sort Key Columns = 1 > | | | | | Key 1: (Ascending) > | | | | Sortheap Allocation Parameters: > | | | | | #Rows = 410 > | | | | | Row Width = 12 > | | | | Piped > | Isolation Level: Uncommitted Read > | Lock Intents > | | Table: Intent None > | | Row : None > ( 6) Sorted Temp Table Completion ID = t1 > ( 5) List Prefetch RID Preparation > ( 3) Insert Into Sorted Temp Table ID = t2 > | #Columns = 1 > | #Sort Key Columns = 1 > | | Key 1: DATA2 (Ascending) > | Sortheap Allocation Parameters: > | | #Rows = 6 > | | Row Width = 16 > | Piped > | Duplicate Elimination > ( 2) Access Temp Table ID = t2 > | #Columns = 1 > | Relation Scan > | | Prefetch: Eligible > ( 2) | Return Data to Application > | | #Columns = 1 > ( 1) Return Data Completion > > End of section > > > Optimizer Plan: > > RETURN > ( 1) > | > TBSCAN > ( 2) > | > SORT > ( 3) > | > RIDSCN > ( 5) > | > SORT > ( 6) > | > IXSCAN > ( 7) > / \ > Index: Table: > DB2ADMIN DB2ADMIN > ORDERS02 ORDERS |
| |||
| Just think of another point: Since you said - > > The cardinality of CLIENTID is relatively high (and very high in > > relation to SEARCHCODE), with thousands of distinct values. The card(CLIENTID) is RELATIVELY HIGH ... and another case is that it is possible "count(*) where CLIENTID = 277475" far less than "COUNT(*) where clientid = #####" It is impossible for db2 optimizer that it will choose use one index (only) scan for searching clientid=277475 and choose index anding for searching clientid=###### - The same thing for DB2 optimizer on OS/390. "Fan Ruo Xin" <fanruox@sbcglobal.net> wrote in message news:UiaVb.1821$PY.1013@newssvr26.news.prodigy.com ... > Very good observation ... > > I don't know the exact cardinality of CLIENTID. This is one of the most > importants thing the optimizer will determine to use INDEX (ONLY) SCAN or > use INDEX ANDing for this case. > > If I were you, I would > - Apply the latest FixPak for UDB Version 7.2. > - Runstats again (full collect, not updatable collect) > - Capture the access plan and compare them. (Please don't change dbm/db cfg, > optimization level ...) > - Run both queries ... > - If it is still the same problem, please CONTACT IBM Support ... > > > "Sean C." <dba_fla@yahoo.com> wrote in message > news:2f9c2d92.0402061427.6b094825@posting.google.c om... > > Helpful folks, > > > > Most of my previous experience with DB2 was on s390 mainframe systems > > and the optimizer on this platform always seemed very predictable and > > consistent. Since moving to a WinNT/UDB 7.2 environment, the choices > > the optimizer makes often seem flaky. But this last example really > > floored me. I was hoping someone could explain why I get worse > > response time when the optimizer uses two indexes, than when it uses > > one. Some context:. > > > > Orders table: 5 million+ rows > > This table contains > > CLIENTID - integer column identifying the the client placing the > > order > > SEARCHCODE - two-character column defining a flavor of order > > CLIENTID has it's own dedicated index, ORDERS02, and SEARCHCODE is the > > high-order column of a three-column index, ORDERS13.(And yes, there > > are 13 indexes on this %#$@!! table). > > The cardinality of SEARCHCODE is extremely low, with roughly 10 > > distinct values spread across 5 mil rows. > > The cardinality of CLIENTID is relatively high (and very high in > > relation to SEARCHCODE), with thousands of distinct values. > > > > When I EXPLAIN this SQL statement: > > > > Select Distinct DATA2 > > from ORDERS > > where SEARCHCODE = '04' and CLIENTID = 277475 > > With UR > > > > I'm told that both indexes are used to honor the query, and the > > timeron resource number is very low. There are actually 7 rows that > > match the criteria in the WHERE clause, and two are returned after the > > DISTINCT is employed. However, clock response time is over 3 minutes, > > which is unacceptable. > > > > > > When I modify the query to avoid using the index on SEARCHCODE, using > > a technique described by Bonnie Baker, the EXPLAIN does indeed show > > only the CLIENTID index being used, and response time is under 10 > > seconds: > > > > Select Distinct DATA2 > > from ORDERS > > where SEARCHCODE = '04' || '' and CLIENTID = 277475 > > With UR > > > > > > Why should this be? If the optimizer can determine that two indexes > > would be useful, why can't it use them efficiently? Surely it has all > > of the distribution statistics available to see the cardinality of the > > SEARCHCODE index? And yes, runstats was run 4 days previously, and > > then again just before retesting, using this command: > > RUNSTATS ON TABLE DB2ADMIN.ORDERS WITH DISTRIBUTION AND DETAILED > > INDEXES ALL SHRLEVEL CHANGE > > > > I've included the output of both explains below. > > Can anyone shed some light on this? Would a different index > > arrangement improve this situation such that the optimizer would not > > make such poor choices? > > > > Any feedback would be appreciated. > > > > Sean > > > > > > > > > > > > Package Name = DB2ADMIN.DYNEXPLN > > Prep Date = 2004/02/06 > > Prep Time = 10:59:53 > > > > Bind Timestamp = 2004-02-06-11.09.04.781000 > > > > Isolation Level = Cursor Stability > > Blocking = Block Unambiguous Cursors > > Query Optimization Class = 5 > > > > Partition Parallel = No > > Intra-Partition Parallel = No > > > > Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" > > > > -------------------- SECTION --------------------------------------- > > Section = 1 > > > > > > SQL Statement: > > > > Select Distinct DATA2 > > from ORDERS > > where SEARCHCODE = '04' and CLIENTID = 277475 > > With UR > > > > > > Estimated Cost = 254 > > Estimated Cardinality = 3 > > > > ( 5) Index ANDing > > | Optimizer Estimate of Set Size: 3 > > | Index ANDing Bitmap Build > > | | Optimizer Estimate of Set Size: 3 > > ( 6) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > > | | | #Columns = 1 > > | | | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 > > | | | | Index Columns: > > | | | | | 1: CLIENTID (Ascending) > > | | | | #Key Columns = 1 > > | | | | | Start Key: Inclusive Value > > | | | | | | 1: 277475 > > | | | | | Stop Key: Inclusive Value > > | | | | | | 1: 277475 > > | | | | Index-Only Access > > | | | | Index Prefetch: None > > | | | Isolation Level: Uncommitted Read > > | | | Lock Intents > > | | | | Table: Intent None > > | | | | Row : None > > | Index ANDing Bitmap Probe > > | | Optimizer Estimate of Set Size: 3 > > ( 7) | | Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > > | | | #Columns = 1 > > | | | Index Scan: Name = DB2ADMIN.ORDERS13 ID = 14 > > | | | | Index Columns: > > | | | | | 1: SEARCHCODE (Ascending) > > | | | | | 2: STATUS (Ascending) > > | | | | | 3: TNUMBER (Ascending) > > | | | | #Key Columns = 1 > > | | | | | Start Key: Inclusive Value > > | | | | | | 1: '04' > > | | | | | Stop Key: Inclusive Value > > | | | | | | 1: '04' > > | | | | Index-Only Access > > | | | | Index Prefetch: Eligible 59 > > | | | Isolation Level: Uncommitted Read > > | | | Lock Intents > > | | | | Table: Intent None > > | | | | Row : None > > ( 4) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > > | #Columns = 3 > > | Fetch Direct > > | Isolation Level: Uncommitted Read > > | Lock Intents > > | | Table: Intent None > > | | Row : None > > | Residual Predicate(s) > > | | #Predicates = 2 > > ( 3) Insert Into Sorted Temp Table ID = t1 > > | #Columns = 1 > > | #Sort Key Columns = 1 > > | | Key 1: DATA2 (Ascending) > > | Sortheap Allocation Parameters: > > | | #Rows = 4 > > | | Row Width = 16 > > | Piped > > | Duplicate Elimination > > ( 2) Access Temp Table ID = t1 > > | #Columns = 1 > > | Relation Scan > > | | Prefetch: Eligible > > ( 2) | Return Data to Application > > | | #Columns = 1 > > ( 1) Return Data Completion > > > > End of section > > > > > > Optimizer Plan: > > > > RETURN > > ( 1) > > | > > TBSCAN > > ( 2) > > | > > SORT > > ( 3) > > | > > FETCH > > ( 4) > > /--/ \ > > IXAND Table: > > ( 5) DB2ADMIN > > /--/ \--\ ORDERS > > IXSCAN IXSCAN > > ( 6) ( 7) > > / \ / \ > > Index: Table: Index: Table: > > DB2ADMIN DB2ADMIN DB2ADMIN DB2ADMIN > > ORDERS02 ORDERS ORDERS13 ORDERS > > > > > > > > > > DB2 Universal Database Version 7.2, 5622-044 (c) Copyright IBM Corp. > > 1991, 2001 > > Licensed Material - Program Property of IBM > > IBM DATABASE 2 SQL Explain Tool > > > > ******************** PACKAGE *************************************** > > > > Package Name = DB2ADMIN.DYNEXPLN > > Prep Date = 2004/02/06 > > Prep Time = 11:04:05 > > > > Bind Timestamp = 2004-02-06-11.13.16.859000 > > > > Isolation Level = Cursor Stability > > Blocking = Block Unambiguous Cursors > > Query Optimization Class = 5 > > > > Partition Parallel = No > > Intra-Partition Parallel = No > > > > Function Path = "SYSIBM", "SYSFUN", "DB2ADMIN" > > > > -------------------- SECTION --------------------------------------- > > Section = 1 > > > > > > SQL Statement: > > > > Select Distinct DATA2 > > from ORDERS > > where SEARCHCODE = '04' || '' and CLIENTID = 277475 > > With UR > > > > > > Estimated Cost = 274 > > Estimated Cardinality = 6 > > > > ( 7) Access Table Name = DB2ADMIN.ORDERS ID = 6,4 > > | #Columns = 1 > > | Index Scan: Name = DB2ADMIN.ORDERS02 ID = 3 > > | | Index Columns: > > | | | 1: CLIENTID (Ascending) > > | | #Key Columns = 1 > > | | | Start Key: Inclusive Value > > | | | | 1: 277475 > > | | | Stop Key: Inclusive Value > > | | | | 1: 277475 > > | | Index-Only Access > > | | Index Prefetch: None > > ( 7) | | | Insert Into Sorted Temp Table ID = t1 > > | | | | #Columns = 1 > > | | | | #Sort Key Columns = 1 > > | | | | | Key 1: (Ascending) > > | | | | Sortheap Allocation Parameters: > > | | | | | #Rows = 410 > > | | | | | Row Width = 12 > > | | | | Piped > > | Isolation Level: Uncommitted Read > > | Lock Intents > > | | Table: Intent None > > | | Row : None > > ( 6) Sorted Temp Table Completion ID = t1 > > ( 5) List Prefetch RID Preparation > > ( 3) Insert Into Sorted Temp Table ID = t2 > > | #Columns = 1 > > | #Sort Key Columns = 1 > > | | Key 1: DATA2 (Ascending) > > | Sortheap Allocation Parameters: > > | | #Rows = 6 > > | | Row Width = 16 > > | Piped > > | Duplicate Elimination > > ( 2) Access Temp Table ID = t2 > > | #Columns = 1 > > | Relation Scan > > | | Prefetch: Eligible > > ( 2) | Return Data to Application > > | | #Columns = 1 > > ( 1) Return Data Completion > > > > End of section > > > > > > Optimizer Plan: > > > > RETURN > > ( 1) > > | > > TBSCAN > > ( 2) > > | > > SORT > > ( 3) > > | > > RIDSCN > > ( 5) > > | > > SORT > > ( 6) > > | > > IXSCAN > > ( 7) > > / \ > > Index: Table: > > DB2ADMIN DB2ADMIN > > ORDERS02 ORDERS > > |
| ||||
| > > > > P.S. Oracle's bitmap indexes might be very useful in this situation > > DB2 will sometimes use bitmaps (automatically) when multiple indexes are > used. In this situation, better statistics are the answer. there seems to be a big performance difference between using a pre-existing Oracle's bitmap index and DB2 creating a similar structure on the fly nad discarding it after the query has completed |