View Single Post

   
  #5 (permalink)  
Old 02-27-2008, 04:11 AM
Serge Rielau
 
Posts: n/a
Default Re: using nextval in external udf

pfa wrote:
> Perhaps there's an alternative way to achieve my end result. I'll try
> to explain better:
>
> We have a program which "programmatically" builds a selection of keys
> to be processed at a later date by multiple processes run in parallel.
> As this list is "potentially" too big to be used in a WHERE key IN
> (...) I figured a function table which extracts each key and returns it
> as a row would work better:
>
> e.g. SELECT key u, data_column t from udftable('listfile') u, table t
> WHERE u.key = t.key
>
> (btw 'listfile' is currently a file on the OS which the udftable reads
> in and scans through looking for delimiters end returning a null
> terminated char * as the key, keeping the position in the list for the
> next fetch)
>
> Each process run in parallel feeds of the above SELECT (now this is a
> work in progress project so may not be the best way to go...we are not
> DB2 experts) and processes the data_column contents in some way. So the
> catch is that each row in "table t" must only be processed once so a
> got the idea of using a NEXTVAL sequence from DB2 so that each process
> when issuing a FETCH would get a unique key u, data_column t result
> because the udftable's function would skip keys in the list until the
> NEXTVAL position has been reached.
>
> The only other way I could see to do this was having a separate process
> (MQ series perhaps? never used it) which has this cursor and the other
> processes would fetch from it ensuring each row is only processed once.
>
> Splitting the list up is not an option as the number of processes
> started is up to the customer and they can start more after the others
> are already running.
>

The solution of the table-fucntion is a no go. Check out the NO PARALLEL
option - which happens to be mandatory.
You will not achieve the parallelizing (sp?) effect you plan to achieve.
If DB2 (hypothetically) were to support "partitioned" (which you want)
or "replicated" (which you don't want) table functions. The likely way
to achieve your goal would lie in an extension of the DBINFO structure
with the partition number. So you could partition the file.

Now, the problem you are facing in not new by any means. It often
appears during data cleansing in a warehouse.
In these cases scripts ar eused to fire the same query from each data
node (using the DB2NODE export variable to connect) and pass the db
partition number as an argument to the table function.
It's not as pretty as having teh optimizer do the job, but it works
quite well.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Reply With Quote