I have a UDF which returns a table. I was hoping to access the NEXTVAL from a sequence to affect the logic flow and consequently the returned result. I'm using a LANGUAGE C function.

Here's the fetch snippet:

    case SQLUDF_TF_FETCH:
        /* fetch next row */
        {
            char *nextid = myrecids++;
            char *ptr;
            --pScratArea->recids_len;
            if (pScratArea->recids_len < 1)
            {
                /* SQLUDF_STATE is part of SQLUDF_TRAIL_ARGS_ALL */
                strcpy(outRecid, "");
                strcpy(SQLUDF_STATE, "02000");
                break;
            }
            // look for AM and terminate nextid
            for (ptr = nextid; *ptr != '\376'; ++ptr) --pScratArea->recids_len;
            *(ptr) = '\0';
            myrecids = ptr + 1;

            // copy current null terminated ptr to outRecid (return arg)
            strcpy(outRecid, nextid);
        }

        *recidNullInd = 0;
        /* next row of data */
        pScratArea->file_pos++;
        break;

What I'm hoping to do is use the result of a NEXTVAL call to cross-check against a counter in my scratchpad, so that multiple processes can feed off this table function and each row it returns is processed only once.
| |||
pfa wrote:
> I have a udf which returns a table. I was hoping to access the NEXTVAL
> from a sequence to affect the logic flow and consequently the return
> result. I'm using a LANGUAGE C type function.
> [...]
> What I'm hoping to do is use the result of a NEXTVAL call to cross
> check against a counter in my scratch pad such that multiple process
> can be feeding off this function table each row from the function table
> would only be processed once.

I still don't quite understand your scenario, but you could use embedded SQL and simply query the sequence that way. You have to register the UDF with READS SQL DATA, however.

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
| |||
Hmmm ok, can't it be done via CLI? i.e. are there any handles available?

Knut Stolze wrote:
> I still don't quite understand your scenario, but you could use embedded SQL
> and simply query the sequence that way. You have to register the UDF with
> READS SQL DATA, however.
| |||
Perhaps there's an alternative way to achieve my end result. I'll try to explain better:

We have a program which "programmatically" builds a selection of keys to be processed at a later date by multiple processes run in parallel. As this list is "potentially" too big to be used in a WHERE key IN (...), I figured a table function which extracts each key and returns it as a row would work better, e.g.

    SELECT u.key, t.data_column
    FROM TABLE(udftable('listfile')) AS u, table t
    WHERE u.key = t.key

(By the way, 'listfile' is currently a file on the OS which udftable reads in and scans through looking for delimiters, returning each key as a null-terminated char * and keeping its position in the list for the next fetch.)

Each process run in parallel feeds off the above SELECT and processes the data_column contents in some way. (This is a work-in-progress project, so it may not be the best way to go; we are not DB2 experts.) The catch is that each row in "table t" must be processed only once. So I got the idea of using NEXTVAL on a DB2 sequence: each process issuing a FETCH would get a unique (u.key, t.data_column) result, because the table function would skip keys in the list until the NEXTVAL position has been reached.

The only other way I could see to do this was to have a separate process (MQ Series perhaps? I've never used it) which owns this cursor, with the other processes fetching from it, ensuring each row is processed only once.

Splitting the list up is not an option, as the number of processes started is up to the customer and they can start more after the others are already running.
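To make the "skip keys until the NEXTVAL position" idea concrete, here is a minimal sketch in plain C of the lookup it implies. It is not DB2 code: `key_at_ticket`, `KEY_DELIM`, and the `ticket` parameter (standing in for the value a worker would obtain from NEXTVAL) are hypothetical names, and the list is assumed to be an in-memory buffer in which every key ends with the '\376' delimiter from the fetch snippet. Unlike that snippet, the scan is bounded by the buffer length.

```c
#include <stddef.h>
#include <string.h>

#define KEY_DELIM '\376'  /* delimiter taken from the original fetch snippet */

/*
 * Copy the key at 1-based position `ticket` from a delimiter-terminated
 * list into `out`. Returns 1 on success, 0 if the list holds fewer than
 * `ticket` keys or the key does not fit in `out`.
 * `ticket` is a hypothetical stand-in for a NEXTVAL result.
 */
static int key_at_ticket(const char *list, size_t list_len,
                         unsigned long ticket,
                         char *out, size_t out_size)
{
    size_t start = 0;              /* offset where the current key begins */
    unsigned long position = 0;    /* number of complete keys seen so far */
    size_t i;

    if (ticket == 0 || out_size == 0)
        return 0;

    for (i = 0; i < list_len; ++i) {
        if (list[i] != KEY_DELIM)
            continue;
        ++position;
        if (position == ticket) {
            size_t key_len = i - start;
            if (key_len >= out_size)
                return 0;          /* key too long for the output buffer */
            memcpy(out, list + start, key_len);
            out[key_len] = '\0';
            return 1;
        }
        start = i + 1;             /* next key begins after the delimiter */
    }
    return 0;                      /* list exhausted before reaching ticket */
}
```

In a table function, a return value of 0 would presumably map to setting SQLSTATE 02000, as the original snippet does at end of data.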
| |||
pfa wrote:
> Perhaps there's an alternative way to achieve my end result. I'll try
> to explain better:
> [...]
> Splitting the list up is not an option as the number of processes
> started is up to the customer and they can start more after the others
> are already running.

The table-function solution is a no-go. Check out the NO PARALLEL option, which happens to be mandatory. You will not get the parallelizing effect you are planning for.

If DB2 (hypothetically) were to support "partitioned" (which you want) or "replicated" (which you don't want) table functions, the likely way to achieve your goal would lie in an extension of the DBINFO structure with the partition number, so that you could partition the file.

Now, the problem you are facing is not new by any means. It often appears during data cleansing in a warehouse. In these cases scripts are used to fire the same query from each data node (using the DB2NODE environment variable to connect) and pass the DB partition number as an argument to the table function. It's not as pretty as having the optimizer do the job, but it works quite well.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
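As a rough illustration of passing a partition number to the table function, the sketch below shows one way the function could decide which keys of the list belong to the caller. The modulo split and the names `key_belongs_to_partition` and `keys_for_partition` are assumptions made for this example and do not come from the thread; it also assumes partitions are numbered contiguously from 0, which may not hold for real DB partition numbers.

```c
#include <stddef.h>

/*
 * Decide whether the key at zero-based position `index` in the list
 * belongs to the caller. `partition` is the number passed to the table
 * function as an argument; `partition_count` is the total number of
 * partitions. The modulo split is an assumption of this sketch.
 */
static int key_belongs_to_partition(size_t index,
                                    unsigned partition,
                                    unsigned partition_count)
{
    if (partition_count == 0 || partition >= partition_count)
        return 0;                  /* invalid arguments own nothing */
    return (index % partition_count) == partition;
}

/* Count how many of `key_count` keys a given partition would process. */
static size_t keys_for_partition(size_t key_count,
                                 unsigned partition,
                                 unsigned partition_count)
{
    size_t index, owned = 0;

    for (index = 0; index < key_count; ++index)
        owned += (size_t)key_belongs_to_partition(index, partition,
                                                  partition_count);
    return owned;
}
```

A fixed split like this covers every key exactly once only while the partition count stays constant for the whole run, so it does not by itself address the requirement that more workers can be started later.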
| |||
pfa wrote:
> Hmmm ok, can't be done via CLI? i.e. are there any handles available?

You can use CLI. There is a description in the Application Development Guide that explains how to obtain the (default) connection handle inside the UDF for the current connection.

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
| |||
pfa wrote:
> Perhaps there's an alternative way to achieve my end result. I'll try
> to explain better:
> [...]
> Splitting the list up is not an option as the number of processes
> started is up to the customer and they can start more after the others
> are already running.

My first idea would have been to use a temp table. You populate the table once, and then the different processes could use sequences to coordinate which process operates on which rows.

However, that really depends on the functionality of the table function. Copying its results into a temp table might not be desirable, in which case Serge's suggestion may be the way to go.

--
Knut Stolze
Information Integration Development
IBM Germany / University of Jena
| ||||
Did you consider having the process that creates the keys generate separate files for each of the "worker" processes? It would mean you'd need a separate procedure to start each of the multiple tasks, or use a parameter which will be [part of] the file name.

Extending this to a database should be possible by setting up a control table with an identifier for each of the multiple tasks. A predicate indicating the task id will separate out the rows for processing.

Either of these should work with "a large number of rows to process" and have all of the tasks complete in around the same amount of time.

If you truly need to have the tasks process on a "do the next input key" sequence, then the following MIGHT be a way to do this:

Create your keys table each day with a column containing a numeric key starting with 1. You'll need a unique index on the numeric key column.

After creating the keys table, create a control table having an autogenerated identity column starting with 1. This table needs no columns other than the identity column and should not be indexed. It must have row-level locking.

Each task inserts a row into the control table, then retrieves the identity key for the inserted row. This key is used to do a single-row retrieval from the keys table. This keys table retrieval should include an "OPTIMIZE FOR 1 ROWS" clause.

Commits must be taken at appropriate intervals to avoid lock escalation on the control table.

Phil Sherman

pfa wrote:
> Perhaps there's an alternative way to achieve my end result. I'll try
> to explain better:
> [...]
> Splitting the list up is not an option as the number of processes
> started is up to the customer and they can start more after the others
> are already running.
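The control-table scheme above is essentially a ticket dispenser: each task claims the next number, then looks up the key with that number. The following sketch shows the pattern in plain C11 as an in-process analogy only. The atomic counter stands in for the identity column and the array for the keys table; `ticket_dispenser`, `claim_ticket`, and `key_for_ticket` are hypothetical names, and nothing here talks to DB2.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for the control table: the counter plays the identity column. */
typedef struct {
    atomic_ulong next_ticket;      /* next value to hand out, starting at 1 */
} ticket_dispenser;

static void dispenser_init(ticket_dispenser *d)
{
    atomic_init(&d->next_ticket, 1UL);
}

/* Analogous to inserting a control row and reading back its identity value. */
static unsigned long claim_ticket(ticket_dispenser *d)
{
    return atomic_fetch_add(&d->next_ticket, 1UL);
}

/*
 * Analogous to the single-row lookup on the keys table: return the key
 * numbered `ticket` (numbering starts at 1), or NULL when the keys are
 * exhausted.
 */
static const char *key_for_ticket(const char *const *keys, size_t key_count,
                                  unsigned long ticket)
{
    if (ticket == 0 || ticket > key_count)
        return NULL;
    return keys[ticket - 1];
}
```

Because every claim returns a distinct number, any number of workers can join at any time without two of them receiving the same key. One difference from this analogy is worth keeping in mind: identity and sequence values in a database are not reused after a rollback, so a failed task can leave a gap, and the key with that number would need to be picked up some other way.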