Unix Technical Forum

DB2 subselect problem

This is a discussion on DB2 subselect problem within the DB2 forums, part of the Database Server Software category; --> I have the following statement which I run successfully in... 1 hour 10 minutes. SELECT a.tsgicd as ACCT_ID, a.tsa5cd ...


Go Back   Unix Technical Forum > Database Server Software > DB2

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-27-2008, 05:40 AM
Sebastien
 
Posts: n/a
Default DB2 subselect problem

I have the following statement which I run successfully in... 1 hour 10
minutes.

SELECT
a.tsgicd as ACCT_ID,
a.tsa5cd as SEC_ID,
CASE
WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R'
or SUBSTRING(a.tsgicd, 6, 1) = 'Y'
or SUBSTRING(a.tsgicd, 6, 1) = '0'
or SUBSTRING(a.tsgicd, 6, 1) = '1'
or SUBSTRING(a.tsgicd, 6, 1) = '2'
or SUBSTRING(a.tsgicd, 6, 1) = '3'
or SUBSTRING(a.tsgicd, 6, 1) = '4'
or SUBSTRING(a.tsgicd, 6, 1) = '5'
or SUBSTRING(a.tsgicd, 6, 1) = '6'
or SUBSTRING(a.tsgicd, 6, 1) = '7'
or SUBSTRING(a.tsgicd, 6, 1) = '8'
or SUBSTRING(a.tsgicd, 6, 1) = '9'
) THEN (select max(main.bvmnt) from
vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and
main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff)
from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and
bv.BVSECID=A.TSA5CD ))
ELSE NULL
END as BV_MNT,
a.tsalpc as LOAN_RATE,
a.tsbpva as LOAN_VALUE,
a.tsbqva as MKT_VALUE
FROM
table1 a
left outer join table2 d on a.tsannb = d.itannb
left outer join table3 G on a.tsannb = G.IXANNB,
table4 b,
table5 c,
table6 e,
table7 f
WHERE
a.tsgicd >= '01 '
and (a.tsgicd <= '7999999' or
a.tsgicd = '8860686' or
a.tsgicd = '8860694' or
a.tsgicd = '8860306' or
a.tsgicd = '8860314' or
a.tsgicd = '8860520' or
a.tsgicd = '8860538' or
a.tsgicd = '8860801' or
a.tsgicd = '8860819')
and a.tsgicd = b.acgicd
and b.acbodc = 0
and b.acd2st <> 'D'
and b.acbwcd = c.orrept
and c.ordivi = 'ESCPT'
and a.tsannb = e.tiannb
and e.ticaid = f.caid

I know that the problem comes from the utilisation of the Max function.
Instead of using the max function, I wanted the subselect to do
something like this:

(select first(main.bvmnt) from vmdprdta.gnbvfm00 main where
main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff
desc )

OR

(select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
=A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch
first row only)

OR

(select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
=A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit
1)

Unfortunately, the Order by clause is not permitted in a subselect,
First doesn't exist in DB2 and Fetch first x-rows is not allowed in a
subselect...

Does anyone have a solution?

Thanks in advance.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-27-2008, 05:40 AM
Brian Tkatch
 
Posts: n/a
Default Re: DB2 subselect problem

Not an answer to the main question.

Is there a reason you are not using IN()?

B.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-27-2008, 05:40 AM
Sebastien
 
Posts: n/a
Default Re: DB2 subselect problem

If you meant to use IN instead of bvdteeff =, there is no particular
reason, except the fact that it explicitly tells that there is only one
possible result, but since I use a max() in the sub-sub query, I will
always get only one result. Also, using IN should be slower.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-27-2008, 05:40 AM
Rhino
 
Posts: n/a
Default Re: DB2 subselect problem


"Sebastien" <roy.sebastien@gmail.com> wrote in message
news:1135346942.274586.77170@f14g2000cwb.googlegro ups.com...
>I have the following statement which I run successfully in... 1 hour 10
> minutes.
>
> SELECT
> a.tsgicd as ACCT_ID,
> a.tsa5cd as SEC_ID,
> CASE
> WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R'
> or SUBSTRING(a.tsgicd, 6, 1) = 'Y'
> or SUBSTRING(a.tsgicd, 6, 1) = '0'
> or SUBSTRING(a.tsgicd, 6, 1) = '1'
> or SUBSTRING(a.tsgicd, 6, 1) = '2'
> or SUBSTRING(a.tsgicd, 6, 1) = '3'
> or SUBSTRING(a.tsgicd, 6, 1) = '4'
> or SUBSTRING(a.tsgicd, 6, 1) = '5'
> or SUBSTRING(a.tsgicd, 6, 1) = '6'
> or SUBSTRING(a.tsgicd, 6, 1) = '7'
> or SUBSTRING(a.tsgicd, 6, 1) = '8'
> or SUBSTRING(a.tsgicd, 6, 1) = '9'
> ) THEN (select max(main.bvmnt) from
> vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and
> main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff)
> from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and
> bv.BVSECID=A.TSA5CD ))
> ELSE NULL
> END as BV_MNT,
> a.tsalpc as LOAN_RATE,
> a.tsbpva as LOAN_VALUE,
> a.tsbqva as MKT_VALUE
> FROM
> table1 a
> left outer join table2 d on a.tsannb = d.itannb
> left outer join table3 G on a.tsannb = G.IXANNB,
> table4 b,
> table5 c,
> table6 e,
> table7 f
> WHERE
> a.tsgicd >= '01 '
> and (a.tsgicd <= '7999999' or
> a.tsgicd = '8860686' or
> a.tsgicd = '8860694' or
> a.tsgicd = '8860306' or
> a.tsgicd = '8860314' or
> a.tsgicd = '8860520' or
> a.tsgicd = '8860538' or
> a.tsgicd = '8860801' or
> a.tsgicd = '8860819')
> and a.tsgicd = b.acgicd
> and b.acbodc = 0
> and b.acd2st <> 'D'
> and b.acbwcd = c.orrept
> and c.ordivi = 'ESCPT'
> and a.tsannb = e.tiannb
> and e.ticaid = f.caid
>
> I know that the problem comes from the utilisation of the Max function.
> Instead of using the max function, I wanted the subselect to do
> something like this:
>
> (select first(main.bvmnt) from vmdprdta.gnbvfm00 main where
> main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff
> desc )
>
> OR
>
> (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
> =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch
> first row only)
>
> OR
>
> (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
> =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit
> 1)
>
> Unfortunately, the Order by clause is not permitted in a subselect,
> First doesn't exist in DB2 and Fetch first x-rows is not allowed in a
> subselect...
>
> Does anyone have a solution?
>
> Thanks in advance.
>

What makes you think that max() is the problem? Have you used Explain to
prove this? If not, I think you should do an Explain to see what it says.

I don't see any inherent reason why max() should be slow. In fact, I think
the opposite is more likely to be true: if the column in which you are
trying to get a max() value has an index on it, DB2 could very well get you
that max() value via a single access to the root page of the index which
should be very fast.

Also, I don't think that in() is inherently any slower than using several
equals conditions that have been "OR"ed together. Due to DB2 query rewrite
capabilities, I think the two approaches would have the same performance.
Also, the IN version would be shorter and easier to read, which makes it
easier to maintain.

One other small thing: it's a bad idea to make literals in conditions longer
than they need to be. This predicate:

a.tsgicd >= '01 '

can actually hurt your performance significantly if the literal is longer
than the column to which you are comparing it. For example, if the tsgicd
column is only 3 characters long and the search condition is longer than
that, you guarantee that no index can be used on that column to find
qualifying rows. Make sure that the search value, '01 ' in this case, is
no wider than the tsgicd column itself. That alone might help your query
significantly - or not, since I see that your other predicates involving the
tsgicd column are also long. But it is still wise to be sure that '01 '
is no longer than the column; if you typed just one extra blank, which is
easy to do, you'd find that the query could suffer.

My advice to you is to try using whatever flavour of Explain you like best
to look at the query and see where the bottlenecks are. Also, try some
variations like using in() instead of multiple equal predicates linked with
OR and see if it makes any difference at all.

Personally, I like Visual Explain which clearly shows which the cost of each
part of the query and gives clues as to why that part is expensive. But you
didn't state which variety of DB2 you are using or what platform you are on
so I'm not sure if you have Visual Explain.

Rhino


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-27-2008, 05:40 AM
Serge Rielau
 
Posts: n/a
Default Re: DB2 subselect problem

Sebastien wrote:
> I have the following statement which I run successfully in... 1 hour 10
> minutes.
>
> SELECT
> a.tsgicd as ACCT_ID,
> a.tsa5cd as SEC_ID,
> CASE
> WHEN (SUBSTRING(a.tsgicd, 6, 1) = 'R'
> or SUBSTRING(a.tsgicd, 6, 1) = 'Y'
> or SUBSTRING(a.tsgicd, 6, 1) = '0'
> or SUBSTRING(a.tsgicd, 6, 1) = '1'
> or SUBSTRING(a.tsgicd, 6, 1) = '2'
> or SUBSTRING(a.tsgicd, 6, 1) = '3'
> or SUBSTRING(a.tsgicd, 6, 1) = '4'
> or SUBSTRING(a.tsgicd, 6, 1) = '5'
> or SUBSTRING(a.tsgicd, 6, 1) = '6'
> or SUBSTRING(a.tsgicd, 6, 1) = '7'
> or SUBSTRING(a.tsgicd, 6, 1) = '8'
> or SUBSTRING(a.tsgicd, 6, 1) = '9'
> ) THEN (select max(main.bvmnt) from
> vmdprdta.gnbvfm00 main where main.bv#cpte =A.TSGICD and
> main.bvsecid=A.TSA5CD and main.bvdteeff = (select max(bv.bvdteeff)
> from vmdprdta.gnbvfm00 bv where bv.bv#cpte = A.TSGICD and
> bv.BVSECID=A.TSA5CD ))
> ELSE NULL
> END as BV_MNT,
> a.tsalpc as LOAN_RATE,
> a.tsbpva as LOAN_VALUE,
> a.tsbqva as MKT_VALUE
> FROM
> table1 a
> left outer join table2 d on a.tsannb = d.itannb
> left outer join table3 G on a.tsannb = G.IXANNB,
> table4 b,
> table5 c,
> table6 e,
> table7 f
> WHERE
> a.tsgicd >= '01 '
> and (a.tsgicd <= '7999999' or
> a.tsgicd = '8860686' or
> a.tsgicd = '8860694' or
> a.tsgicd = '8860306' or
> a.tsgicd = '8860314' or
> a.tsgicd = '8860520' or
> a.tsgicd = '8860538' or
> a.tsgicd = '8860801' or
> a.tsgicd = '8860819')
> and a.tsgicd = b.acgicd
> and b.acbodc = 0
> and b.acd2st <> 'D'
> and b.acbwcd = c.orrept
> and c.ordivi = 'ESCPT'
> and a.tsannb = e.tiannb
> and e.ticaid = f.caid
>
> I know that the problem comes from the utilisation of the Max function.
> Instead of using the max function, I wanted the subselect to do
> something like this:
>
> (select first(main.bvmnt) from vmdprdta.gnbvfm00 main where
> main.bv#cpte =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff
> desc )
>
> OR
>
> (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
> =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc fetch
> first row only)
>
> OR
>
> (select (main.bvmnt) from vmdprdta.gnbvfm00 main where main.bv#cpte
> =A.TSGICD and main.bvsecid=A.TSA5CD order by main.bvdteeff desc limit
> 1)
>
> Unfortunately, the Order by clause is not permitted in a subselect,
> First doesn't exist in DB2 and Fetch first x-rows is not allowed in a
> subselect...
>
> Does anyone have a solution?
>
> Thanks in advance.
>

Which version/platform of DB2?
DB2 V8 for LUW supports both ORDER By and FETCH FIRST is subqueries
strating with FP2 I believe.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
DB2 UDB for Linux, Unix, Windows
IBM Toronto Lab
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-27-2008, 05:41 AM
Wenbin
 
Posts: n/a
Default Re: DB2 subselect problem

OR predicates can be rewritten to an IN predicate by DB2, if they are
in where clause but not in CASE statement. IN predicate is supposed to
be more efficient than multiple ORs in that it calculates substr only
once.

Also check the explain if inner join predicates and local predicates on
table a, b and c are pushed down below outer join. If not, you may move
all or some WHERE predicates to the FROM clause, because they would
prune off many rows prior to the outjoin computation.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 02-27-2008, 05:42 AM
Chris
 
Posts: n/a
Default Re: DB2 subselect problem

I doubt that the MAX() is the problem. What seems more likely to me
(not having a database to check the plan on this query) is that the
problem would be the nested - nested correlated subqueries that you
have setup. I would suggest making sure the table vmdprdta.gnbvfm00 is
indexed ideally for the subqueries on the two columns you are querying
on. Or, maybe even better, rewrite the query to not use any correlated
subqueries at all. Here is an example that should work (it also
incorporates the suggestions of using the IN clause, too):

with not_subselect(bvmnt, row_num) as (select bvmnt,
row_number() over(partition by bv#cpte, bvsecid order by bvdteef
desc) as row_num),
not_sub_2(bvmnt) as (select bvmnt from not_subselect where row_num
= 1)
SELECT
a.tsgicd as ACCT_ID,
a.tsa5cd as SEC_ID,
CASE WHEN SUBSTRING(a.tsgicd, 6, 1) in ('R' , 'Y', '0', '1', '2', '3',

'4', '5', '6', '7', '8', '9')
THEN ns2.bvmnt
ELSE NULL END as bv_mnt,
a.tsalpc as LOAN_RATE,
a.tsbpva as LOAN_VALUE,
a.tsbqva as MKT_VALUE
FROM
table1 a
left outer join table2 d on a.tsannb = d.itannb
left outer join table3 G on a.tsannb =
G.IXANNB,
table4 b,
table5 c,
table6 e,
table7 f,
left outer join not_sub_2 ns2 on ns2.bv#cpte = A.TSGICD and
ns2.bvsecid = A.TSA5CD
WHERE
(a.tsgicd between '01 ' and '7999999'
or a.tsgicd in ('8860686', '8860694', '8860306', '8860314',
'8860520',
'8860538', '8860801', '8860819'))
and a.tsgicd = b.acgicd
and b.acbodc = 0
and b.acd2st <> 'D'
and b.acbwcd = c.orrept
and c.ordivi = 'ESCPT'
and a.tsannb = e.tiannb
and e.ticaid = f.caid

Chris

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:02 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com