This is a discussion on Query question within the DB2 forums, part of the Database Server Software category; --> Hi gurus, I have a problem to convert MS SQL Server application to DB2. I have a view which ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi gurus, I have a problem to convert MS SQL Server application to DB2. I have a view which combines 7 tables ( table sizes 60millions rows, 3 mill, 1 mill, other small ones) I use this view to populate OLAP cube and in SQL Server it takes abut 1 hour.. In DB2 it takes forever. Execution plan looks ugly - DB2 sorts !!!!! 50 mil table by field which is key field for 3 mill rows table. Indexes are in place, I updated statistics. When I added OPTIMAZE FOR 1 ROW DB2 change plan to use indexes, but execution takes about 6 hours. Please advice. Thanks, Tim. |
| |||
| Take the OPTIMIZE FOR 1 ROW off the query and feed it into the DB2 Design Advisor for more advice. Given the amount of scanning involved, you will probably want to increase tablespace prefetch size to pull in more extents from disk per prefetch request. Enabling intra-partition parallelism may help if the server is SMP. Block-based bufferpools are good at improving performance on sequential scans. When posting questions of this nature, it helps to provide details about your environment, including: - DB2 version and FixPak level - Hardware information - any differences between your MS SQL Server and DB2 installations Fred |
| |||
| Thank you, Fred. Nothing helps in my situation.Obviosly this is DB2 Optimizer mistake. Microsot SQL Server handles my situations much better. Thanks, T. <fred.sobotka@gmail.com> wrote in message news:1117085596.831862.183310@f14g2000cwb.googlegr oups.com... > Take the OPTIMIZE FOR 1 ROW off the query and feed it into the DB2 > Design Advisor for more advice. > > Given the amount of scanning involved, you will probably want to > increase tablespace prefetch size to pull in more extents from disk per > prefetch request. Enabling intra-partition parallelism may help if the > server is SMP. Block-based bufferpools are good at improving > performance on sequential scans. > > When posting questions of this nature, it helps to provide details > about your environment, including: > - DB2 version and FixPak level > - Hardware information > - any differences between your MS SQL Server and DB2 installations > > Fred > |
| |||
| Timur wrote: > Thank you, Fred. > Nothing helps in my situation.Obviosly this is DB2 Optimizer mistake. > Microsot SQL Server handles my situations much better. If you don't believe that anything can help, then why to you post a question? You are quick to pass judgement, yet you have given the group no information to help you. If you post the db2exmt output then we maybe able to help. Cheers Serge -- Serge Rielau DB2 SQL Compiler Development IBM Toronto Lab |
| |||
| Hi Serge, Let me explain my situation. I am working on OLAP project and my datamart is located on Microsoft SQL Server. I have moved this datamart to DB2 (UNIX) and Iwas impressed about performance for ETL which loads data to DB2. But my second step was to LOAD my OLAP database which is for DB2 means: SELECT a,b,c,d,.....FROM View - 60 millions rows, about 15 GB of data , takes forever! ( this process takes about 1hour 20 min with MS SQL Server.) View = SELECT A (55millions row table) JOIN B (3 millions row tabel) ON A.B_PK_CLUSTER = B.B_PK_CLUSTER JOIN C (1 million rrow table) ON A.C_PK_CLUSTER = C.C_PK_CLUSTER LEFT JOIN D (small) ON PK_CLUSTER....... Plan shows that DB2 TABLSCAN table A --> SORT it by B_PK_CLUSTER field and HASH it to TABLESCANED table B!! even when I add FETCH FIRST 1 ROW ONLY - takes forever!! I changed my View like this: SELECT a,b,c,d,.....FROM tableA ( just fact table) and I loaded my OLAP in 15 min !! ( 60 millions rows, about 15 GB of data ) - Fantastic. What does it mean? I think it means I have no problem with OLAP, network,drivers, etc. So I tryed this ( only 2 tables) : SELECT a,b,c, ...... FROM A (55millions row table) JOIN B (3 millions row tabel) ON A.B_PK_CLUSTER = B.B_PK_CLUSTER Plan is still ugly - 2 TABLESCANS - very slow. I think you may create this situation very easy. IF I add WHERE clause to narrow table A to let say 1 million rows - plan is good, DB2 accept indexes, works fast. What does it mean? I think it means I do have indexes and they are good. After RANSTATS on just B_PK_CLUSTER field on table A . My next SELECT selects ONLY fields from table A and ONLY PK from table B.: SELECT A.f1, A.f2, A.B_PK_CLUSTER , B.B_PK_CLUSTER FROM A (55millions row table) JOIN B (3 millions row tabel) ON A.B_PK_CLUSTER = B.B_PK_CLUSTER Plan is perfect, DB2 uses indexes , very fast. But If I add to SELECT above some another fields from table B - plan again becomes TABLESCAN type, very slow. I tryed use Functions instead of JOINs -- it improves situation but still very slow - about 6 hour to load my cube. I have to know what to expect when I will need to load 100 million row, 200, 300. Limits, thresholds, buffer sizes, pools, memory, CPUs, etc...$$$$? How to hint to DB2 to use this particular index or that type of Join? Please advice. Thanks, Timur. "Serge Rielau" <srielau@ca.ibm.com> wrote in message news:3fudctF9nfrbU1@individual.net... > Timur wrote: > > Thank you, Fred. > > Nothing helps in my situation.Obviosly this is DB2 Optimizer mistake. > > Microsot SQL Server handles my situations much better. > If you don't believe that anything can help, then why to you post a > question? > You are quick to pass judgement, yet you have given the group no > information to help you. > > If you post the db2exmt output then we maybe able to help. > > Cheers > Serge > -- > Serge Rielau > DB2 SQL Compiler Development > IBM Toronto Lab |
| ||||
| Timur wrote: > Hi Serge, > Let me explain my situation. > I am working on OLAP project and my datamart is located on Microsoft SQL > Server. > I have moved this datamart to DB2 (UNIX) and Iwas impressed about > performance for ETL which loads data to DB2. > But my second step was to LOAD my OLAP database which is for DB2 means: > > SELECT a,b,c,d,.....FROM View > > - 60 millions rows, about 15 GB of data , takes forever! > ( this process takes about 1hour 20 min with MS SQL Server.) > > View = SELECT A (55millions row table) JOIN B (3 millions row tabel) ON > A.B_PK_CLUSTER = B.B_PK_CLUSTER > JOIN C (1 million rrow table) ON A.C_PK_CLUSTER = > C.C_PK_CLUSTER > LEFT JOIN D (small) ON PK_CLUSTER....... > > Plan shows that DB2 TABLSCAN table A --> SORT it by B_PK_CLUSTER field and > HASH it to TABLESCANED table B!! > even when I add FETCH FIRST 1 ROW ONLY - takes forever!! > > I changed my View like this: > SELECT a,b,c,d,.....FROM tableA ( just fact table) > and I loaded my OLAP in 15 min !! ( 60 millions rows, about 15 GB of > data ) - Fantastic. > What does it mean? I think it means I have no problem with OLAP, > network,drivers, etc. > > So I tryed this ( only 2 tables) : > SELECT a,b,c, ...... > FROM A (55millions row table) JOIN B (3 millions row tabel) ON > A.B_PK_CLUSTER = B.B_PK_CLUSTER > > Plan is still ugly - 2 TABLESCANS - very slow. I think you may create this > situation very easy. > > IF I add WHERE clause to narrow table A to let say 1 million rows - plan is > good, DB2 accept indexes, works fast. > What does it mean? I think it means I do have indexes and they are good. > > After RANSTATS on just B_PK_CLUSTER field on table A . > My next SELECT selects ONLY fields from table A and ONLY PK from table B.: > > SELECT A.f1, A.f2, A.B_PK_CLUSTER , B.B_PK_CLUSTER > FROM A (55millions row table) JOIN B (3 millions row tabel) ON > A.B_PK_CLUSTER = B.B_PK_CLUSTER > > Plan is perfect, DB2 uses indexes , very fast. > > But If I add to SELECT above some another fields from table B - plan again > becomes TABLESCAN type, very slow. > > I tryed use Functions instead of JOINs -- it improves situation but still > very slow - about 6 hour to load my cube. > I have to know what to expect when I will need to load 100 million row, 200, > 300. > Limits, thresholds, buffer sizes, pools, memory, CPUs, etc...$$$$? > How to hint to DB2 to use this particular index or that type of Join? > Please advice. > Thanks, > Timur. How is the filtering of these joins? Keep in mind that DB2 has to go get the row from the datapage to get those extra columns. Are you aware of the INCLUDE clause on unique indices? To use it with a primary key CREATE UNIQUE INDEX .... ON T(pk) INCLUDE (....) Then: ALTER TABLE .. ADD PRIMARY KEY... "An existing index <blah> has been reused" If you can't use INCLUDE columns you can play a game: First do the JOINS without pulling any of the secondary columns up. Pull only the primary keys. Then, when all the filtering is done (an dyou're down to 1M rows, join back to the tables to pull up the extra, non indexed columns. Possibly it's sufficient to do this trick only for the 55M table. Again, if you would post the plan it would be easy to see the filtering and WHY DB2 chooses the plan it does. Cheers Serge -- Serge Rielau DB2 SQL Compiler Development IBM Toronto Lab |