This is a discussion on DB2 subselect problem within the DB2 forums, part of the Database Server Software category; --> I have the following statement which I run successfully in... 1 hour 10 minutes. SELECT a.tsgicd as ACCT_ID, a.tsa5cd ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have the following statement which I run successfully in... 1 hour 10 minutes. SELECT a.tsgicd as ACCT_ID, a.tsa5cd as SEC_ID, CASE WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R' or SUBSTRING(a.tsgicd, 6, 1) = 'Y' or SUBSTRING(a.tsgicd, 6, 1) = '0' or SUBSTRING(a.tsgicd, 6, 1) = '1' or SUBSTRING(a.tsgicd, 6, 1) = '2' or SUBSTRING(a.tsgicd, 6, 1) = '3' or SUBSTRING(a.tsgicd, 6, 1) = '4' or SUBSTRING(a.tsgicd, 6, 1) = '5' or SUBSTRING(a.tsgicd, 6, 1) = '6' or SUBSTRING(a.tsgicd, 6, 1) = '7' or SUBSTRING(a.tsgicd, 6, 1) = '8' or SUBSTRING(a.tsgicd, 6, 1) = '9' ) THEN (select max(main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff) from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and bv.BVSECID=A.TSA5CD )) ELSE NULL END as BV_MNT, a.tsalpc as LOAN_RATE, a.tsbpva as LOAN_VALUE, a.tsbqva as MKT_VALUE FROM table1 a left outer join table2 d on a.tsannb = d.itannb left outer join table3 G on a.tsannb = G.IXANNB, table4 b, table5 c, table6 e, table7 f WHERE a.tsgicd >= '01 ' and (a.tsgicd <= '7999999' or a.tsgicd = '8860686' or a.tsgicd = '8860694' or a.tsgicd = '8860306' or a.tsgicd = '8860314' or a.tsgicd = '8860520' or a.tsgicd = '8860538' or a.tsgicd = '8860801' or a.tsgicd = '8860819') and a.tsgicd = b.acgicd and b.acbodc = 0 and b.acd2st <> 'D' and b.acbwcd = c.orrept and c.ordivi = 'ESCPT' and a.tsannb = e.tiannb and e.ticaid = f.caid I know that the problem comes from the utilisation of the Max function. Instead of using the max function, I wanted the subselect to do something like this: (select first(main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc ) OR (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch first row only) OR (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit 1) Unfortunately, the Order by clause is not permitted in a subselect, First doesn't exist in DB2 and Fetch first x-rows is not allowed in a subselect... Does anyone have a solution? Thanks in advance. |
| |||
| If you meant to use IN instead of bvdteeff =, there is no particular reason, except the fact that it explicitly tells that there is only one possible result, but since I use a max() in the sub-sub query, I will always get only one result. Also, using IN should be slower. |
| |||
| "Sebastien" <roy.sebastien@gmail.com> wrote in message news:1135346942.274586.77170@f14g2000cwb.googlegro ups.com... >I have the following statement which I run successfully in... 1 hour 10 > minutes. > > SELECT > a.tsgicd as ACCT_ID, > a.tsa5cd as SEC_ID, > CASE > WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R' > or SUBSTRING(a.tsgicd, 6, 1) = 'Y' > or SUBSTRING(a.tsgicd, 6, 1) = '0' > or SUBSTRING(a.tsgicd, 6, 1) = '1' > or SUBSTRING(a.tsgicd, 6, 1) = '2' > or SUBSTRING(a.tsgicd, 6, 1) = '3' > or SUBSTRING(a.tsgicd, 6, 1) = '4' > or SUBSTRING(a.tsgicd, 6, 1) = '5' > or SUBSTRING(a.tsgicd, 6, 1) = '6' > or SUBSTRING(a.tsgicd, 6, 1) = '7' > or SUBSTRING(a.tsgicd, 6, 1) = '8' > or SUBSTRING(a.tsgicd, 6, 1) = '9' > ) THEN (select max(main.bvmnt) from > vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and > main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff) > from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and > bv.BVSECID=A.TSA5CD )) > ELSE NULL > END as BV_MNT, > a.tsalpc as LOAN_RATE, > a.tsbpva as LOAN_VALUE, > a.tsbqva as MKT_VALUE > FROM > table1 a > left outer join table2 d on a.tsannb = d.itannb > left outer join table3 G on a.tsannb = G.IXANNB, > table4 b, > table5 c, > table6 e, > table7 f > WHERE > a.tsgicd >= '01 ' > and (a.tsgicd <= '7999999' or > a.tsgicd = '8860686' or > a.tsgicd = '8860694' or > a.tsgicd = '8860306' or > a.tsgicd = '8860314' or > a.tsgicd = '8860520' or > a.tsgicd = '8860538' or > a.tsgicd = '8860801' or > a.tsgicd = '8860819') > and a.tsgicd = b.acgicd > and b.acbodc = 0 > and b.acd2st <> 'D' > and b.acbwcd = c.orrept > and c.ordivi = 'ESCPT' > and a.tsannb = e.tiannb > and e.ticaid = f.caid > > I know that the problem comes from the utilisation of the Max function. > Instead of using the max function, I wanted the subselect to do > something like this: > > (select first(main.bvmnt) from vmdprdta.gnbvfm00 main where > main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff > desc ) > > OR > > (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte > =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch > first row only) > > OR > > (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte > =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit > 1) > > Unfortunately, the Order by clause is not permitted in a subselect, > First doesn't exist in DB2 and Fetch first x-rows is not allowed in a > subselect... > > Does anyone have a solution? > > Thanks in advance. > What makes you think that max() is the problem? Have you used Explain to prove this? If not, I think you should do an Explain to see what it says. I don't see any inherent reason why max() should be slow. In fact, I think the opposite is more likely to be true: if the column in which you are trying to get a max() value has an index on it, DB2 could very well get you that max() value via a single access to the root page of the index which should be very fast. Also, I don't think that in() is inherently any slower than using several equals conditions that have been "OR"ed together. Due to DB2 query rewrite capabilities, I think the two approaches would have the same performance. Also, the IN version would be shorter and easier to read, which makes it easier to maintain. One other small thing: it's a bad idea to make literals in conditions longer than they need to be. This predicate: a.tsgicd >= '01 ' can actually hurt your performance significantly if the literal is longer than the column to which you are comparing it. For example, if the tsgicd column is only 3 characters long and the search condition is longer than that, you guarantee that no index can be used on that column to find qualifying rows. Make sure that the search value, '01 ' in this case, is no wider than the tsgicd column itself. That alone might help your query significantly - or not, since I see that your other predicates involving the tsgicd column are also long. But it is still wise to be sure that '01 ' is no longer than the column; if you typed just one extra blank, which is easy to do, you'd find that the query could suffer. My advice to you is to try using whatever flavour of Explain you like best to look at the query and see where the bottlenecks are. Also, try some variations like using in() instead of multiple equal predicates linked with OR and see if it makes any difference at all. Personally, I like Visual Explain which clearly shows which the cost of each part of the query and gives clues as to why that part is expensive. But you didn't state which variety of DB2 you are using or what platform you are on so I'm not sure if you have Visual Explain. Rhino |
| |||
| Sebastien wrote: > I have the following statement which I run successfully in... 1 hour 10 > minutes. > > SELECT > a.tsgicd as ACCT_ID, > a.tsa5cd as SEC_ID, > CASE > WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R' > or SUBSTRING(a.tsgicd, 6, 1) = 'Y' > or SUBSTRING(a.tsgicd, 6, 1) = '0' > or SUBSTRING(a.tsgicd, 6, 1) = '1' > or SUBSTRING(a.tsgicd, 6, 1) = '2' > or SUBSTRING(a.tsgicd, 6, 1) = '3' > or SUBSTRING(a.tsgicd, 6, 1) = '4' > or SUBSTRING(a.tsgicd, 6, 1) = '5' > or SUBSTRING(a.tsgicd, 6, 1) = '6' > or SUBSTRING(a.tsgicd, 6, 1) = '7' > or SUBSTRING(a.tsgicd, 6, 1) = '8' > or SUBSTRING(a.tsgicd, 6, 1) = '9' > ) THEN (select max(main.bvmnt) from > vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and > main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff) > from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and > bv.BVSECID=A.TSA5CD )) > ELSE NULL > END as BV_MNT, > a.tsalpc as LOAN_RATE, > a.tsbpva as LOAN_VALUE, > a.tsbqva as MKT_VALUE > FROM > table1 a > left outer join table2 d on a.tsannb = d.itannb > left outer join table3 G on a.tsannb = G.IXANNB, > table4 b, > table5 c, > table6 e, > table7 f > WHERE > a.tsgicd >= '01 ' > and (a.tsgicd <= '7999999' or > a.tsgicd = '8860686' or > a.tsgicd = '8860694' or > a.tsgicd = '8860306' or > a.tsgicd = '8860314' or > a.tsgicd = '8860520' or > a.tsgicd = '8860538' or > a.tsgicd = '8860801' or > a.tsgicd = '8860819') > and a.tsgicd = b.acgicd > and b.acbodc = 0 > and b.acd2st <> 'D' > and b.acbwcd = c.orrept > and c.ordivi = 'ESCPT' > and a.tsannb = e.tiannb > and e.ticaid = f.caid > > I know that the problem comes from the utilisation of the Max function. > Instead of using the max function, I wanted the subselect to do > something like this: > > (select first(main.bvmnt) from vmdprdta.gnbvfm00 main where > main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff > desc ) > > OR > > (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte > =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch > first row only) > > OR > > (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte > =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit > 1) > > Unfortunately, the Order by clause is not permitted in a subselect, > First doesn't exist in DB2 and Fetch first x-rows is not allowed in a > subselect... > > Does anyone have a solution? > > Thanks in advance. > Which version/platform of DB2? DB2 V8 for LUW supports both ORDER By and FETCH FIRST is subqueries strating with FP2 I believe. Cheers Serge -- Serge Rielau DB2 Solutions Development DB2 UDB for Linux, Unix, Windows IBM Toronto Lab |
| |||
| OR predicates can be rewritten to an IN predicate by DB2, if they are in where clause but not in CASE statement. IN predicate is supposed to be more efficient than multiple ORs in that it calculates substr only once. Also check the explain if inner join predicates and local predicates on table a, b and c are pushed down below outer join. If not, you may move all or some WHERE predicates to the FROM clause, because they would prune off many rows prior to the outjoin computation. |
| ||||
| I doubt that the MAX() is the problem. What seems more likely to me (not having a database to check the plan on this query) is that the problem would be the nested - nested correlated subqueries that you have setup. I would suggest making sure the table vmdprdta.gnbvfm00 is indexed ideally for the subqueries on the two columns you are querying on. Or, maybe even better, rewrite the query to not use any correlated subqueries at all. Here is an example that should work (it also incorporates the suggestions of using the IN clause, too): with not_subselect(bvmnt, row_num) as (select bvmnt, row_number() over(partition by bv#cpte, bvsecid order by bvdteef desc) as row_num), not_sub_2(bvmnt) as (select bvmnt from not_subselect where row_num = 1) SELECT a.tsgicd as ACCT_ID, a.tsa5cd as SEC_ID, CASE WHEN SUBSTRING(a.tsgicd, 6, 1) in ('R' , 'Y', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') THEN ns2.bvmnt ELSE NULL END as bv_mnt, a.tsalpc as LOAN_RATE, a.tsbpva as LOAN_VALUE, a.tsbqva as MKT_VALUE FROM table1 a left outer join table2 d on a.tsannb = d.itannb left outer join table3 G on a.tsannb = G.IXANNB, table4 b, table5 c, table6 e, table7 f, left outer join not_sub_2 ns2 on ns2.bv#cpte = A.TSGICD and ns2.bvsecid = A.TSA5CD WHERE (a.tsgicd between '01 ' and '7999999' or a.tsgicd in ('8860686', '8860694', '8860306', '8860314', '8860520', '8860538', '8860801', '8860819')) and a.tsgicd = b.acgicd and b.acbodc = 0 and b.acd2st <> 'D' and b.acbwcd = c.orrept and c.ordivi = 'ESCPT' and a.tsannb = e.tiannb and e.ticaid = f.caid Chris |