This is a discussion on Optimizer assitance needed within the Oracle Database forums, part of the Database Server Software category; --> We have a table of ~ 1.25 billion rows which is very heavily inserted into (grows at over 40 ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| We have a table of ~ 1.25 billion rows which is very heavily inserted into (grows at over 40 million rows a week), never (yet) deleted from and is regularly queried. (I know we should look into partitioning this one probably .. but that's for a later discussion). In the process of re-analyzed the table some things went 'bad'. (deleted old stats 1st) exec dbms_stats.gather_table_stats(ownname =>'IBS', tabname =>'SALES_DETAIL', method_opt =>'FOR COLUMNS TRANSACTION_TYPE SIZE 9', cascade =>TRUE, degree => 10, estimate_percent => 1) (Subsequently I started using .01 pct which is still 125K rows) I did this in test first (where ethe table is ~1 billion rows). I checked my most common queries and batch processes for slowdowns and didn't find any. I then ran the *same* gather statement in production and found no problems. Later the same day I got numerous complaint of 'slowness' and found that a certain query (below) from a remote system was using a full scan in production. I checked in test and it was using the PK index. Our temp space filled and things started getting really slow. To my knowledge the data is not skewed on any of the columns except those I specified. Prod and test are within 20% of each other in size. The test system is weaker on horsepower (8 cpu vs 32 I think) and resources - but it strikes me as odd it would, given the same statistic gathering call produce the correct plan whereas PROD did not. (FYI the full scan takes many minutes as compared to the index range in TEST). Location_id is a range of about 8000 values - roughly evenly distributed. Sales_date is also evenly distributed over the past 8 months and transaction type is skewed over 9 values (the bulk of the rows having 3 of the values) I hinted the query in prod and it ran very fast. I then re-gathered in PROD with 'ALL INDEXED COLUMNS' (or obviously 'ALL COLUMNS SIZE AUTO') and the problem went away in prod (correct plan now). The table: (PK is made up of the 6 NOT NULL columns - yes I wish they had made it a sequence... but oh well) Name Null? Type --------------------------------------------------------- LOCATION_ID NOT NULL NUMBER(5) PROCESS_DATE NOT NULL DATE SALES_DATE NOT NULL DATE TRANSACTION_TYPE NOT NULL VARCHAR2(10) SKU NOT NULL VARCHAR2(10) PRICE_POINT NOT NULL NUMBER(12,2) UNIT_CNT NUMBER(12) EXT_COST_AMT NUMBER(14,4) EXT_RETAIL_AMT NUMBER(12,2) CREATE_SOURCE VARCHAR2(30) CREATE_DATE DATE LAST_UPDATE_SOURCE VARCHAR2(30) LAST_UPDATE_DATE DATE The Query: SELECT various columns FROM sales_detail sd WHERE sd.location_id = 749 AND sd.sales_date = '26-JUN-2004' AND sd.transaction_type = 'S' GROUP BY location_id, sd.transaction_type --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 51 | 20387 | | 1 | SORT GROUP BY NOSORT | | 1 | 51 | 20387 | | 2 | TABLE ACCESS BY INDEX ROWID| SALES_DETAIL | 99050 | 4933K| 20387 | |* 3 | INDEX RANGE SCAN | PK_SALES_DETAIL | 39620 | | 19780 | --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("SD"."LOCATION_ID"=749 AND "SD"."SALES_DATE"='2004-06-26' AND "SD"."TRANSACTION_TYPE"='S') filter("SD"."SALES_DATE"='2004-06-26' AND "SD"."TRANSACTION_TYPE"='S') --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | TQ |IN-OUT| PQ Distrib | --------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 99050 | 4933K| 21468 | | | | | 1 | SORT GROUP BY NOSORT| | 99050 | 4933K| 21468 | | | | |* 2 | TABLE ACCESS FULL | SALES_DETAIL | 99050 | 4933K| 21468 | 83,00 | P->S | QC (RAND) | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("SD"."LOCATION_ID"=749 AND "SD"."SALES_DATE"='2004-06-26' AND "SD"."TRANSACTION_TYPE"='S') I don't know what else to include.. my chief question is what can I check to determine why PROD chose the plan it did vs TEST? The init.ora params are the same (except regarding #processors) Test is a clone of PROD only about a month old (stale). |
| |||
| On 3 Aug 2004 14:14:50 -0700, industrialstr@yahoo.com (Masterhit) wrote: >my chief question is what can I >check to determine why PROD chose the plan it did vs TEST? The output of the trace file generated by setting event 10053, level 1. This will dump CBO's reasoning. Outlines are, however probably a better way to go. -- Sybrand Bakker, Senior Oracle DBA |
| |||
| industrialstr@yahoo.com (Masterhit) wrote in message news:<f3b449ab.0408031314.761308bb@posting.google. com>... > In the process of re-analyzed the table some things went 'bad'. > > (deleted old stats 1st) > exec dbms_stats.gather_table_stats(ownname =>'IBS', tabname > =>'SALES_DETAIL', method_opt =>'FOR COLUMNS TRANSACTION_TYPE SIZE > 9', cascade =>TRUE, degree => 10, estimate_percent => 1) (Subsequently > I started using .01 pct which is still 125K rows) This doesn't help you now, but dbms_stats has built in CYA functionality with the opportunity to save existing stats to a table. > I don't know what else to include.. my chief question is what can I > check to determine why PROD chose the plan it did vs TEST? The > init.ora params are the same (except regarding #processors) Test is a > clone of PROD only about a month old (stale). It looks to me as though your prod system has chosen parallel query. I'm assuming that that is what you mean by you comment about #processors? is Parallel Query available on the test system? Oh and yes I'd have though a transaction history table of 1bn rows would be an excellent candidate for partitioning. Niall Litchfield Oracle DBA |
| |||
| Thanks for the responses. Yes TEST does have parallel processing turned on (actually all the init.ora etc. are identical with the exception of the # of processors). I should know better than to do anything in PROD without backing up the current stats- but it was my belief (error) that if it worked "this way" in TEST (which is soooo similar to PROD) then it will do the same thing in PROD. I don't usually go on 'should' but did this time and got bitten. The tracing is a good idea for next time.. it didn't occur to me (I am primarily a developer). Thanks again .. I will 'fiddle' with this in test some more. |
| ||||
| Thought I replied already.. unles there is a lag in post time... Thanks for the replies. We do have parallel features turned on in TEST and PROD. In fact the systems are identical in init params except for #CPUs. Setting the trace even would be a great idea next time..as would backing up my stats. I don't usually go on 'should' or 'probably' but this time I did and it bit me.. I expected the same plan from TEST in PROD given nearly identical stats/params/sizes.. but maybe the hardware or something did me in. |