vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hello, Db2 Linux LUW FP 15. Consider table A (ID varchar, EXECUTION_DATE date). a) I want to first retrieve all IDs that have not executed during the last 90 days: select distinct ID from table_B where NOT EXISTS (select 1 from table_A where EXECUTION_DATE between current date - 90 days and current date AND table_A.ID = table_B.ID) b) Now I need the last execution for all IDs that have NOT executed during the last 90 days: select TABLE_A.ID, MAX(EXECUTION_DATE) from TABLE_A, ( select distinct ID from table_B where NOT EXISTS (select 1 from table_A where EXECUTION_DATE between current date - 90 days and current date AND table_A.ID = table_B.ID) ) as TABLE_C where TABLE_A.ID = TABLE_C.ID and EXECUTION_DATE < current date - 90 days group by TABLE_A.ID That was easy. The question is ... is there any way to do the same thing using a better approach, perhaps SINGLE table scan on table_A ? This table is really big. TIA, Michel. |
| |||
| On Nov 30, 6:33 pm, Michel Esber <mic...@us.automatos.com> wrote: > Hello, > > Db2 Linux LUW FP 15. > > Consider table A (ID varchar, EXECUTION_DATE date). > > a) I want to first retrieve all IDs that have not executed during the > last 90 days: > > select distinct ID from table_B where NOT EXISTS > (select 1 > from table_A > where EXECUTION_DATE between current date - 90 days and current > date AND > table_A.ID = table_B.ID) > > b) Now I need the last execution for all IDs that have NOT executed > during the last 90 days: > > select TABLE_A.ID, MAX(EXECUTION_DATE) from TABLE_A, > > ( > select distinct ID from table_B where NOT EXISTS > (select 1 > from table_A > where EXECUTION_DATE between current date - 90 days and current > date AND > table_A.ID = table_B.ID) > ) as TABLE_C > > where TABLE_A.ID = TABLE_C.ID and > EXECUTION_DATE < current date - 90 days > > group by TABLE_A.ID > > That was easy. The question is ... is there any way to do the same > thing using a better approach, perhaps SINGLE table scan on table_A ? > This table is really big. > > TIA, Michel. What indexes do you have on table_A and table_B, and what access plan do you currently have? /Lennart |
| |||
| On Nov 30, 10:59 pm, Lennart <Erik.Lennart.Jons...@gmail.com> wrote: > On Nov 30, 6:33 pm, Michel Esber <mic...@us.automatos.com> wrote: > > > > > Hello, > > > Db2 Linux LUW FP 15. > > > Consider table A (ID varchar, EXECUTION_DATE date). > > > a) I want to first retrieve all IDs that have not executed during the > > last 90 days: > > > select distinct ID from table_B where NOT EXISTS > > (select 1 > > from table_A > > where EXECUTION_DATE between current date - 90 days and current > > date AND > > table_A.ID = table_B.ID) > > > b) Now I need the last execution for all IDs that have NOT executed > > during the last 90 days: > > > select TABLE_A.ID, MAX(EXECUTION_DATE) from TABLE_A, > > > ( > > select distinct ID from table_B where NOT EXISTS > > (select 1 > > from table_A > > where EXECUTION_DATE between current date - 90 days and current > > date AND > > table_A.ID = table_B.ID) > > ) as TABLE_C > > > where TABLE_A.ID = TABLE_C.ID and > > EXECUTION_DATE < current date - 90 days > > > group by TABLE_A.ID > > > That was easy. The question is ... is there any way to do the same > > thing using a better approach, perhaps SINGLE table scan on table_A ? > > This table is really big. > > > TIA, Michel. > > What indexes do you have on table_A and table_B, and what access plan > do you currently have? > In addition, what are the number of rows for A and B? /Lennart |
| |||
| Maybe I'm missing something obvious....: SELECT ID, MAX(EXECUTION_DATE) FROM TABLEA GROUP BY ID HAVING MAX(EXECUTION_DATE) < CURRENT DATE - 90 DAYS Cheers Serge -- Serge Rielau DB2 Solutions Development IBM Toronto Lab |
| |||
| On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm.com> wrote: > Maybe I'm missing something obvious....: > > SELECT ID, MAX(EXECUTION_DATE) FROM TABLEA GROUP BY ID > HAVING MAX(EXECUTION_DATE) < CURRENT DATE - 90 DAYS > > Cheers > Serge > -- > Serge Rielau > DB2 Solutions Development > IBM Toronto Lab Hi Serge, In fact, this query does return the last execution prior to current date - 90 days, but it does not consider that IDs should not execute between current date - 90 days and current date The above query returns IDs that have and have not executed between current date - 90 days and current date. The query I have posted works fine. I am looking for a way to tune it ... Is there any way to produce the same result set with a single table scan ? TIA, Michel. |
| |||
| On 30 nov, 20:00, Lennart <Erik.Lennart.Jons...@gmail.com> wrote: > On Nov 30, 10:59 pm, Lennart <Erik.Lennart.Jons...@gmail.com> wrote: > > > > > > > On Nov 30, 6:33 pm, Michel Esber <mic...@us.automatos.com> wrote: > > > > Hello, > > > > Db2 Linux LUW FP 15. > > > > Consider table A (ID varchar, EXECUTION_DATE date). > > > > a) I want to first retrieve all IDs that have not executed during the > > > last 90 days: > > > > select distinct ID from table_B where NOT EXISTS > > > (select 1 > > > from table_A > > > where EXECUTION_DATE between current date - 90 days and current > > > date AND > > > table_A.ID = table_B.ID) > > > > b) Now I need the last execution for all IDs that have NOT executed > > > during the last 90 days: > > > > select TABLE_A.ID, MAX(EXECUTION_DATE) from TABLE_A, > > > > ( > > > select distinct ID from table_B where NOT EXISTS > > > (select 1 > > > from table_A > > > where EXECUTION_DATE between current date - 90 days and current > > > date AND > > > table_A.ID = table_B.ID) > > > ) as TABLE_C > > > > where TABLE_A.ID = TABLE_C.ID and > > > EXECUTION_DATE < current date - 90 days > > > > group by TABLE_A.ID > > > > That was easy. The question is ... is there any way to do the same > > > thing using a better approach, perhaps SINGLE table scan on table_A ? > > > This table is really big. > > > > TIA, Michel. > > > What indexes do you have on table_A and table_B, and what access plan > > do you currently have? > > In addition, what are the number of rows for A and B? > > /Lennart Original Statement: ------------------ select R.MACHINE_ID, R.SW_ID, MAX(MONITOR_DATE) from ASSET.T_METERING R inner join ( select distinct I.MACHINE_ID, I.SW_ID from ASSET.TBL_ASSET_SW I where NOT EXISTS (select 1 from ASSET.T_METERING M where MONITOR_DATE between current date - 90 days and current date and M. MACHINE_ID = I.MACHINE_ID and M.SW_ID = I.SW_ID fetch first 1 rows only) ) as C on (C.MACHINE_ID = R.MACHINE_ID and C.SW_ID = R.SW_ID) where MONITOR_DATE < current date - 90 days group by R.MACHINE_ID, R.SW_ID Access Plan: ----------- Total Cost: 3507.52 Query Degree: 1 Rows RETURN ( 1) Cost I/O | 1.60851e-05 GRPBY ( 2) 3507.52 325.697 | 1.60851e-05 TBSCAN ( 3) 3507.52 325.697 | 0.00401062 SORT ( 4) 3507.51 325.697 | 0.00401062 NLJOIN ( 5) 3507.51 325.697 /---+---\ 249.338 0.0172951 TBSCAN IXSCAN ( 6) ( 13) 2514.15 25.0315 196.182 1 | | 249.338 18785 SORT INDEX: ASSET ( 7) MTRIDX03 2514.12 196.182 | 249.338 TBSCAN ( 8) 2513.97 196.182 | 249.338 SORT ( 9) 2513.94 196.182 | 249.338 HSJOIN ( 10) 2513.78 196.182 /------+------\ 3500.53 4325 IXSCAN IXSCAN ( 11) ( 12) 1034.09 1478.12 137.182 59 | | 18785 4325 INDEX: ASSET INDEX: SYSIBM MTRIDX03 SQL061207070618270 Indexes: Table = T_METERING, index = ASSET.MTRIDX03 Columns = +SW_ID +MONITOR_DATE+MACHINE_ID Table = TBL_ASSET_SW, index = SYSIBM.SQL061207070618270 Columns = +MACHINE_ID+SW_ID T_METERING has 20M rows, TBL_ASSET_SW has 5M rows. Thanks, Michel. |
| |||
| On Nov 30, 9:33 am, Michel Esber <mic...@us.automatos.com> wrote: > Hello, > > Db2 Linux LUW FP 15. > > Consider table A (ID varchar, EXECUTION_DATE date). > > a) I want to first retrieve all IDs that have not executed during the > last 90 days: > > select distinct ID from table_B where NOT EXISTS > (select 1 > from table_A > where EXECUTION_DATE between current date - 90 days and current > date AND > table_A.ID = table_B.ID) > > b) Now I need the last execution for all IDs that have NOT executed > during the last 90 days: > > select TABLE_A.ID, MAX(EXECUTION_DATE) from TABLE_A, > > ( > select distinct ID from table_B where NOT EXISTS > (select 1 > from table_A > where EXECUTION_DATE between current date - 90 days and current > date AND > table_A.ID = table_B.ID) > ) as TABLE_C > > where TABLE_A.ID = TABLE_C.ID and > EXECUTION_DATE < current date - 90 days > > group by TABLE_A.ID > > That was easy. The question is ... is there any way to do the same > thing using a better approach, perhaps SINGLE table scan on table_A ? > This table is really big. I read the rest of the thread. I'll see what I can do with this. First, let's restate the problem: you want the set of all IDs that have not executed in the last 90 days, and you want the date of their last execution. -- id, and the last execution date of all ids that join to the inner query SELECT a.id, max(execution_date) FROM table_a a JOIN ( -- the set of all ids that have not executed in the last 90 days SELECT id FROM table_a GROUP BY id HAVING MAX(execution_date) < CURRENT DATE - 90 DAYS ) lt90 ON a.id = lt90.id In your last post on this, it looks like you're interested in SW_ID and MACHINE_ID, and there is an index that contains those columns plus the date you're interested in, so I would expect this query to be reasonably efficient. |
| |||
| Michel Esber wrote: > On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm.com> wrote: >> Maybe I'm missing something obvious....: >> >> SELECT ID, MAX(EXECUTION_DATE) FROM TABLEA GROUP BY ID >> HAVING MAX(EXECUTION_DATE) < CURRENT DATE - 90 DAYS >> >> Cheers >> Serge >> -- >> Serge Rielau >> DB2 Solutions Development >> IBM Toronto Lab > > Hi Serge, > > In fact, this query does return the last execution prior to current > date - 90 days, but it does not consider that IDs should not execute > between current date - 90 days and current date > > The above query returns IDs that have and have not executed between > current date - 90 days and current date. > > The query I have posted works fine. I am looking for a way to tune > it ... Is there any way to produce the same result set with a single > table scan ? > > TIA, Michel. Serge's query works properly on my LUW FP14 system. I'd recommend trying a small test case (a sample table containing three rows) to verify that this is failing on your system. If it's failing, then it needs to be reported. I strongly doubt that FP15 will break this simple SQL statement. [psherman@T40 ~]$ db2 "create table t1 (id char(2) not null, execution_date date not null with default)" DB20000I The SQL command completed successfully. [psherman@T40 ~]$ db2 "insert into t1 values ('A','2007-12-01')" DB20000I The SQL command completed successfully. [psherman@T40 ~]$ db2 "insert into t1 values ('A','2006-12-01')" DB20000I The SQL command completed successfully. [psherman@T40 ~]$ db2 "insert into t1 values ('B','2006-12-01')" DB20000I The SQL command completed successfully. [psherman@T40 ~]$ db2 "select id, max(execution_date) from t1 group by id" ID 2 -- ---------- A 12/01/2007 B 12/01/2006 2 record(s) selected. [psherman@T40 ~]$ db2 "select id, max(execution_date) from t1 group by id having max(execution_date) < current_date - 90 days" ID 2 -- ---------- B 12/01/2006 1 record(s) selected. Phil Sherman |
| |||
| On 2 dez, 13:01, Phil Sherman <psher...@ameritech.net> wrote: > Michel Esber wrote: > > On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm.com> wrote: > >> Maybe I'm missing something obvious....: > > >> SELECT ID, MAX(EXECUTION_DATE) FROM TABLEA GROUP BY ID > >> HAVING MAX(EXECUTION_DATE) < CURRENT DATE - 90 DAYS > > >> Cheers > >> Serge > >> -- > >> Serge Rielau > >> DB2 Solutions Development > >> IBM Toronto Lab > > > Hi Serge, > > > In fact, this query does return the last execution prior to current > > date - 90 days, but it does not consider that IDs should not execute > > between current date - 90 days and current date > > > The above query returns IDs that have and have not executed between > > current date - 90 days and current date. > > > The query I have posted works fine. I am looking for a way to tune > > it ... Is there any way to produce the same result set with a single > > table scan ? > > > TIA, Michel. > > Serge's query works properly on my LUW FP14 system. I'd recommend trying > a small test case (a sample table containing three rows) to verify that > this is failing on your system. If it's failing, then it needs to be > reported. I strongly doubt that FP15 will break this simple SQL statement. > > [psherman@T40 ~]$ db2 "create table t1 (id char(2) not null, > execution_date date not null with default)" > DB20000I The SQL command completed successfully. > [psherman@T40 ~]$ db2 "insert into t1 values ('A','2007-12-01')" > DB20000I The SQL command completed successfully. > [psherman@T40 ~]$ db2 "insert into t1 values ('A','2006-12-01')" > DB20000I The SQL command completed successfully. > [psherman@T40 ~]$ db2 "insert into t1 values ('B','2006-12-01')" > DB20000I The SQL command completed successfully. > [psherman@T40 ~]$ db2 "select id, max(execution_date) from t1 group by id" > ID 2 > -- ---------- > A 12/01/2007 > B 12/01/2006 > > 2 record(s) selected. > > [psherman@T40 ~]$ db2 "select id, max(execution_date) from t1 group by > id having max(execution_date) < current_date - 90 days" > > ID 2 > -- ---------- > B 12/01/2006 > > 1 record(s) selected. > > Phil Sherman- Phil, Serge and Dean, thank you for your time. Let me try to explain why this is incorrect: Consider the following table data: db2 "select MONITOR_DATE from ASSET.T_METERING where MACHINE_ID='000B6A10620ED121C9481542' and SW_ID=77" MONITOR_DATE ------------ 10/17/2007 10/18/2007 11/30/2007 If I run my original query: db2 "select MACHINE_ID, SW_ID from ASSET.TBL_ASSET_SW S where not exists (select 1 from ASSET.T_METERING M where M.MACHINE_ID=S.MACHINE_ID and S.SW_ID=M.SW_ID and MONITOR_DATE between current date - 30 days and current date) and MACHINE_ID='000B6A10620ED121C9481542' and SW_ID=77" I get no results. Thatīs abovius, because there was an execution on 11/30/2007. Note that I do need to join with TBL_ASSET_SW for other purposes. Now, executing the queries you have suggested: db2 "select MACHINE_ID,SW_ID from ASSET.T_METERING where MACHINE_ID='000B6A10620ED121C9481542' group by MACHINE_ID, SW_ID having MAX(MONITOR_DATE) < current date - 90 days" SW_ID 77 is not returned here, which is nice. Next: db2 "select S.MACHINE_ID, S.SW_ID from ASSET.TBL_ASSET_SW S where NOT EXISTS (select MACHINE_ID, SW_ID from ASSET.T_METERING M where M.MACHINE_ID='000B6A10620ED121C9481542' and S.MACHINE_ID = M.MACHINE_ID and S.SW_ID = M.SW_ID group by MACHINE_ID, SW_ID having MAX(MONITOR_DATE) < current date - 30 days) and MACHINE_ID='000B6A10620ED121C9481542' and SW_ID=77" It does return one row: MACHINE_ID SW_ID ------------------------ ---------- 000B6A10620ED121C9481542 77. I understand whatīs happening here, and the result set is not incorrect (according to the query). But for my scenario, this is just wrong. I need to certify that the execution did not happen between current date - X days and current date. Hope that it makes sense now. Thanks again. |
| ||||
| Michel Esber wrote: > I understand whatīs happening here, and the result set is not > incorrect (according to the query). I do not understand what's happening here. > Hope that it makes sense now. No it doesn't, you are assuming we have the same local knowledge as you. 1. We are missing the sample data for the asset table 2. You keep flip flopping between 30 days and 90 days which confuses the matter 3. Shouldn't matter, but: Why are you adding the literal into the NOT EXISTS with the HAVING? I propose we restart with a clean slate: Define both tables, define contents for both. Define the output. Cheers Serge -- Serge Rielau DB2 Solutions Development IBM Toronto Lab |