vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Friends: In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. What I want to do is assign all equivalent IDs to the same group number, including those that are transitively related, i.e., if A = B and B = C then A = C, so I'd group all three together. Although they're not related in a composition fashion per se, it seems like the way to go conceptually is to consider the ID relationships as a reporting tree (ID_A would be the manager and ID_B would be the employee) and assign all IDs that share the same root to the same group. For instance, let's say I have the following pairs ID_A ID_B ---- ---- 1800 1804 1800 1808 1806 1809 1808 1810 1808 1812 1809 1815 1810 1811 I'd have two trees (sideways): 1800 1804 1808 1810 1811 1812 and 1806 1809 1815 I'm struggling with the following: 1. How to group based on a shared *root* (I'd hate to have to build a chain column, e.g., for 1811: "1800-->1808-->1810" and do something like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems unreliable, and the ID is not always the same length) 2. How to write a recursive CTE that accomodates multiple, independent, trees. What I'd like to end up with is this: ID GRP ---- --- 1800 1 1804 1 1808 1 1810 1 1811 1 1812 1 1806 2 1809 2 1815 2 I feel like I'm close--I've read Serge's "CONNECT BY" article and Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but I'm just not able to stitch it all together. Would anyone care to lend a hand? Thanks, --Jeff |
| |||
| I should have mentioned that by "equivalent IDs," I meant logically equivalent, i.e., they identify the same thing. Also, to view my tables and "trees," please click fixed-font. --Jeff jefftyzzer wrote: > Friends: > > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > What I want to do is assign all equivalent IDs to the same group > number, including those that are transitively related, i.e., if A = B > and B = C then A = C, so I'd group all three together. > > Although they're not related in a composition fashion per se, it seems > like the way to go conceptually is to consider the ID relationships as > a reporting tree (ID_A would be the manager and ID_B would be the > employee) and assign all IDs that share the same root to the same > group. > > For instance, let's say I have the following pairs > > ID_A ID_B > ---- ---- > 1800 1804 > 1800 1808 > 1806 1809 > 1808 1810 > 1808 1812 > 1809 1815 > 1810 1811 > > I'd have two trees (sideways): > > 1800 1804 > 1808 > 1810 > 1811 > 1812 > and > > 1806 > 1809 > 1815 > > I'm struggling with the following: > > 1. How to group based on a shared *root* (I'd hate to have to build a > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > unreliable, and the ID is not always the same length) > > 2. How to write a recursive CTE that accomodates multiple, independent, > trees. > > What I'd like to end up with is this: > > ID GRP > ---- --- > 1800 1 > 1804 1 > 1808 1 > 1810 1 > 1811 1 > 1812 1 > 1806 2 > 1809 2 > 1815 2 > > I feel like I'm close--I've read Serge's "CONNECT BY" article and > Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but > I'm just not able to stitch it all together. > > Would anyone care to lend a hand? > > Thanks, > > --Jeff |
| |||
| Hello. Try this: -------------- with a(ID_A, ID_B) as ( values (1800, 1804), (1800, 1808), (1806, 1809), (1808, 1810), (1808, 1812), (1809, 1815), (1810, 1811) ), t(level, id_a, id_b, chain) as ( select distinct 1, id_a, id_a, cast(digits(id_a) as varchar(50)) from a where not exists ( select 1 from a a2 where a2.id_b=a.id_a ) UNION ALL select t.level+1, t.id_a, a.id_b, t.chain||digits(a.id_b) from a, t where a.id_a=t.id_b ) select dense_rank() over(order by id_a) as grp, substr( repeat(' ', level-1)|| char(case level when 1 then id_a else id_b end) ,1,20) --, level, chain from t order by id_a, chain; -------------- > Friends: > > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > What I want to do is assign all equivalent IDs to the same group > number, including those that are transitively related, i.e., if A = B > and B = C then A = C, so I'd group all three together. > > Although they're not related in a composition fashion per se, it seems > like the way to go conceptually is to consider the ID relationships as > a reporting tree (ID_A would be the manager and ID_B would be the > employee) and assign all IDs that share the same root to the same > group. > > For instance, let's say I have the following pairs > > ID_A ID_B > ---- ---- > 1800 1804 > 1800 1808 > 1806 1809 > 1808 1810 > 1808 1812 > 1809 1815 > 1810 1811 > > I'd have two trees (sideways): > > 1800 1804 > 1808 > 1810 > 1811 > 1812 > and > > 1806 > 1809 > 1815 > > I'm struggling with the following: > > 1. How to group based on a shared *root* (I'd hate to have to build a > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > unreliable, and the ID is not always the same length) > > 2. How to write a recursive CTE that accomodates multiple, independent, > trees. > > What I'd like to end up with is this: > > ID GRP > ---- --- > 1800 1 > 1804 1 > 1808 1 > 1810 1 > 1811 1 > 1812 1 > 1806 2 > 1809 2 > 1815 2 > > I feel like I'm close--I've read Serge's "CONNECT BY" article and > Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but > I'm just not able to stitch it all together. > > Would anyone care to lend a hand? > > Thanks, > > --Jeff |
| |||
| Brilliant! Thank you, kind stanger :-) 4.spam@mail.ru wrote: > Hello. > Try this: > -------------- > with a(ID_A, ID_B) as > ( > values > (1800, 1804), > (1800, 1808), > (1806, 1809), > (1808, 1810), > (1808, 1812), > (1809, 1815), > (1810, 1811) > ), t(level, id_a, id_b, chain) as > ( > select distinct 1, id_a, id_a, cast(digits(id_a) as varchar(50)) > from a > where not exists > ( > select 1 > from a a2 > where a2.id_b=a.id_a > ) > UNION ALL > select t.level+1, t.id_a, a.id_b, t.chain||digits(a.id_b) > from a, t > where a.id_a=t.id_b > ) > select dense_rank() over(order by id_a) as grp, > substr( > repeat(' ', level-1)|| > char(case level when 1 then id_a else id_b end) > ,1,20) > --, level, chain > from t > order by id_a, chain; > -------------- > > > Friends: > > > > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > > What I want to do is assign all equivalent IDs to the same group > > number, including those that are transitively related, i.e., if A = B > > and B = C then A = C, so I'd group all three together. > > > > Although they're not related in a composition fashion per se, it seems > > like the way to go conceptually is to consider the ID relationships as > > a reporting tree (ID_A would be the manager and ID_B would be the > > employee) and assign all IDs that share the same root to the same > > group. > > > > For instance, let's say I have the following pairs > > > > ID_A ID_B > > ---- ---- > > 1800 1804 > > 1800 1808 > > 1806 1809 > > 1808 1810 > > 1808 1812 > > 1809 1815 > > 1810 1811 > > > > I'd have two trees (sideways): > > > > 1800 1804 > > 1808 > > 1810 > > 1811 > > 1812 > > and > > > > 1806 > > 1809 > > 1815 > > > > I'm struggling with the following: > > > > 1. How to group based on a shared *root* (I'd hate to have to build a > > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > > unreliable, and the ID is not always the same length) > > > > 2. How to write a recursive CTE that accomodates multiple, independent, > > trees. > > > > What I'd like to end up with is this: > > > > ID GRP > > ---- --- > > 1800 1 > > 1804 1 > > 1808 1 > > 1810 1 > > 1811 1 > > 1812 1 > > 1806 2 > > 1809 2 > > 1815 2 > > > > I feel like I'm close--I've read Serge's "CONNECT BY" article and > > Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but > > I'm just not able to stitch it all together. > > > > Would anyone care to lend a hand? > > > > Thanks, > > > > --Jeff |
| |||
| Brilliant! Thank you, kind stanger :-) 4.spam@mail.ru wrote: > Hello. > Try this: > -------------- > with a(ID_A, ID_B) as > ( > values > (1800, 1804), > (1800, 1808), > (1806, 1809), > (1808, 1810), > (1808, 1812), > (1809, 1815), > (1810, 1811) > ), t(level, id_a, id_b, chain) as > ( > select distinct 1, id_a, id_a, cast(digits(id_a) as varchar(50)) > from a > where not exists > ( > select 1 > from a a2 > where a2.id_b=a.id_a > ) > UNION ALL > select t.level+1, t.id_a, a.id_b, t.chain||digits(a.id_b) > from a, t > where a.id_a=t.id_b > ) > select dense_rank() over(order by id_a) as grp, > substr( > repeat(' ', level-1)|| > char(case level when 1 then id_a else id_b end) > ,1,20) > --, level, chain > from t > order by id_a, chain; > -------------- > > > Friends: > > > > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > > What I want to do is assign all equivalent IDs to the same group > > number, including those that are transitively related, i.e., if A = B > > and B = C then A = C, so I'd group all three together. > > > > Although they're not related in a composition fashion per se, it seems > > like the way to go conceptually is to consider the ID relationships as > > a reporting tree (ID_A would be the manager and ID_B would be the > > employee) and assign all IDs that share the same root to the same > > group. > > > > For instance, let's say I have the following pairs > > > > ID_A ID_B > > ---- ---- > > 1800 1804 > > 1800 1808 > > 1806 1809 > > 1808 1810 > > 1808 1812 > > 1809 1815 > > 1810 1811 > > > > I'd have two trees (sideways): > > > > 1800 1804 > > 1808 > > 1810 > > 1811 > > 1812 > > and > > > > 1806 > > 1809 > > 1815 > > > > I'm struggling with the following: > > > > 1. How to group based on a shared *root* (I'd hate to have to build a > > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > > unreliable, and the ID is not always the same length) > > > > 2. How to write a recursive CTE that accomodates multiple, independent, > > trees. > > > > What I'd like to end up with is this: > > > > ID GRP > > ---- --- > > 1800 1 > > 1804 1 > > 1808 1 > > 1810 1 > > 1811 1 > > 1812 1 > > 1806 2 > > 1809 2 > > 1815 2 > > > > I feel like I'm close--I've read Serge's "CONNECT BY" article and > > Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but > > I'm just not able to stitch it all together. > > > > Would anyone care to lend a hand? > > > > Thanks, > > > > --Jeff |
| |||
| Brilliant! Thank you, kind stranger :-) 4.spam@mail.ru wrote: > Hello. > Try this: > -------------- > with a(ID_A, ID_B) as > ( > values > (1800, 1804), > (1800, 1808), > (1806, 1809), > (1808, 1810), > (1808, 1812), > (1809, 1815), > (1810, 1811) > ), t(level, id_a, id_b, chain) as > ( > select distinct 1, id_a, id_a, cast(digits(id_a) as varchar(50)) > from a > where not exists > ( > select 1 > from a a2 > where a2.id_b=a.id_a > ) > UNION ALL > select t.level+1, t.id_a, a.id_b, t.chain||digits(a.id_b) > from a, t > where a.id_a=t.id_b > ) > select dense_rank() over(order by id_a) as grp, > substr( > repeat(' ', level-1)|| > char(case level when 1 then id_a else id_b end) > ,1,20) > --, level, chain > from t > order by id_a, chain; > -------------- > > > Friends: > > > > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > > What I want to do is assign all equivalent IDs to the same group > > number, including those that are transitively related, i.e., if A = B > > and B = C then A = C, so I'd group all three together. > > > > Although they're not related in a composition fashion per se, it seems > > like the way to go conceptually is to consider the ID relationships as > > a reporting tree (ID_A would be the manager and ID_B would be the > > employee) and assign all IDs that share the same root to the same > > group. > > > > For instance, let's say I have the following pairs > > > > ID_A ID_B > > ---- ---- > > 1800 1804 > > 1800 1808 > > 1806 1809 > > 1808 1810 > > 1808 1812 > > 1809 1815 > > 1810 1811 > > > > I'd have two trees (sideways): > > > > 1800 1804 > > 1808 > > 1810 > > 1811 > > 1812 > > and > > > > 1806 > > 1809 > > 1815 > > > > I'm struggling with the following: > > > > 1. How to group based on a shared *root* (I'd hate to have to build a > > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > > unreliable, and the ID is not always the same length) > > > > 2. How to write a recursive CTE that accomodates multiple, independent, > > trees. > > > > What I'd like to end up with is this: > > > > ID GRP > > ---- --- > > 1800 1 > > 1804 1 > > 1808 1 > > 1810 1 > > 1811 1 > > 1812 1 > > 1806 2 > > 1809 2 > > 1815 2 > > > > I feel like I'm close--I've read Serge's "CONNECT BY" article and > > Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but > > I'm just not able to stitch it all together. > > > > Would anyone care to lend a hand? > > > > Thanks, > > > > --Jeff |
| |||
| In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. What I want to do is assign all equivalent IDs to the same group number, including those that are transitively related, i.e., if A = B and B = C then A = C, so I'd group all three together. Although they're not related in a composition fashion per se, it seems like the way to go conceptually is to consider the ID relationships as a reporting tree (ID_A would be the manager and ID_B would be the employee) and assign all IDs that share the same root to the same group. For instance, let's say I have the following pairs ID_A ID_B ---- ---- 1800 1804 1800 1808 1806 1809 1808 1810 1808 1812 1809 1815 1810 1811 I'd have two trees (sideways): 1800 1804 1808 1810 1811 1812 and 1806 1809 1815 I'm struggling with the following: 1. How to group based on a shared *root* (I'd hate to have to build a chain column, e.g., for 1811: "1800-->1808-->1810" and do something like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems unreliable, and the ID is not always the same length) 2. How to write a recursive CTE that accomodates multiple, independent, trees. >> I feel like I'm close--I've read Serge's "CONNECT BY" article and Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but I'm just not able to stitch it all together. << You should have been reading Celko's TREES & HIERARCHIES IN SQL instead What you have is a forest, not a tree and you can do this with a nested sets model CREATE TABLE FoobarForest (foobar_id INTEGER NOT NULL PRIMARY KEY, --assumption tree_nbr INTEGER NOT NULL, lft INTEGER NOT NULL CHECK (lft > 0), rgt INTEGER NOT NULL, CHECK (lft < rgt), UNIQUE (tree_nbr, lft)); Instead of a recursive CTE, pick the roots and use a simple push-down stack for tree traversal. Here is a SQL/PSM version for one tree: - Tree holds the adjacency model CREATE TABLE Tree (node CHAR(10) NOT NULL, parent CHAR(10)); -- Stack starts empty, will holds the nested set model CREATE TABLE Stack (stack_top INTEGER NOT NULL, node CHAR(10) NOT NULL, lft INTEGER, rgt INTEGER); CREATE PROCEDURE TreeTraversal () LANGUAGE SQL DETERMINISTIC BEGIN ATOMIC DECLARE counter INTEGER; DECLARE max_counter INTEGER; DECLARE current_top INTEGER; SET counter = 2; SET max_counter = 2 * (SELECT COUNT(*) FROM Tree); SET current_top = 1; --clear the stack DELETE FROM Stack; -- push the root INSERT INTO Stack SELECT 1, node, 1, max_counter FROM Tree WHERE parent IS NULL; -- delete rows from tree as they are used DELETE FROM Tree WHERE parent IS NULL; WHILE counter <= max_counter- 1 DO IF EXISTS (SELECT * FROM Stack AS S1, Tree AS T1 WHERE S1.node = T1.parent AND S1.stack_top = current_top) THEN BEGIN -- push when top has subordinates and set lft value INSERT INTO Stack SELECT (current_top + 1), MIN(T1.node), counter, NULL FROM Stack AS S1, Tree AS T1 WHERE S1.node = T1.parent AND S1.stack_top = current_top; -- delete rows from tree as they are used DELETE FROM Tree WHERE node = (SELECT node FROM Stack WHERE stack_top = current_top + 1); -- housekeeping of stack pointers and counter SET counter = counter + 1; SET current_top = current_top + 1; END; ELSE BEGIN -- pop the stack and set rgt value UPDATE Stack SET rgt = counter, stack_top = -stack_top -- pops the stack WHERE stack_top = current_top; SET counter = counter + 1; SET current_top = current_top - 1; END; END IF; END WHILE; -- SELECT node, lft, rgt FROM Stack; -- the top column is not needed in the final answer -- move stack contents to new tree table END; |
| ||||
| Thank you, Joe--I look forward to digging into your suggestion, and appreciate your considered reply to my posting. --Jeff --CELKO-- wrote: > In a DB2 UDB LUW table, I have a table with pairs of equivalent ID's. > What I want to do is assign all equivalent IDs to the same group > number, including those that are transitively related, i.e., if A = B > and B = C then A = C, so I'd group all three together. > > Although they're not related in a composition fashion per se, it seems > like the way to go conceptually is to consider the ID relationships as > a reporting tree (ID_A would be the manager and ID_B would be the > employee) and assign all IDs that share the same root to the same > group. > > For instance, let's say I have the following pairs > > ID_A ID_B > ---- ---- > 1800 1804 > 1800 1808 > 1806 1809 > 1808 1810 > 1808 1812 > 1809 1815 > 1810 1811 > > I'd have two trees (sideways): > > 1800 1804 > 1808 > 1810 > 1811 > 1812 > and > > 1806 > 1809 > 1815 > > I'm struggling with the following: > > 1. How to group based on a shared *root* (I'd hate to have to build a > chain column, e.g., for 1811: "1800-->1808-->1810" and do something > like DENSE_RANK() OVER(ORDER BY SUBSTR(CHAIN,1,4)--that seems > unreliable, and the ID is not always the same length) > > 2. How to write a recursive CTE that accomodates multiple, > independent, > trees. > > >> I feel like I'm close--I've read Serge's "CONNECT BY" article and Molinaro's chapter "Hierarchical Queries" in his _SQL Cookbook_, but I'm just not able to stitch it all together. << > > You should have been reading Celko's TREES & HIERARCHIES IN SQL instead > > > What you have is a forest, not a tree and you can do this with a nested > sets model > > CREATE TABLE FoobarForest > (foobar_id INTEGER NOT NULL PRIMARY KEY, --assumption > tree_nbr INTEGER NOT NULL, > lft INTEGER NOT NULL CHECK (lft > 0), > rgt INTEGER NOT NULL, > CHECK (lft < rgt), > UNIQUE (tree_nbr, lft)); > > Instead of a recursive CTE, pick the roots and use a simple push-down > stack for tree traversal. Here is a SQL/PSM version for one tree: > > - Tree holds the adjacency model > CREATE TABLE Tree > (node CHAR(10) NOT NULL, > parent CHAR(10)); > > -- Stack starts empty, will holds the nested set model > CREATE TABLE Stack > (stack_top INTEGER NOT NULL, > node CHAR(10) NOT NULL, > lft INTEGER, > rgt INTEGER); > > CREATE PROCEDURE TreeTraversal () > LANGUAGE SQL > DETERMINISTIC > BEGIN ATOMIC > DECLARE counter INTEGER; > DECLARE max_counter INTEGER; > DECLARE current_top INTEGER; > > SET counter = 2; > SET max_counter = 2 * (SELECT COUNT(*) FROM Tree); > SET current_top = 1; > > --clear the stack > DELETE FROM Stack; > > -- push the root > INSERT INTO Stack > SELECT 1, node, 1, max_counter > FROM Tree > WHERE parent IS NULL; > > -- delete rows from tree as they are used > DELETE FROM Tree WHERE parent IS NULL; > > WHILE counter <= max_counter- 1 > DO IF EXISTS (SELECT * > FROM Stack AS S1, Tree AS T1 > WHERE S1.node = T1.parent > AND S1.stack_top = current_top) > THEN BEGIN -- push when top has subordinates and set lft value > INSERT INTO Stack > SELECT (current_top + 1), MIN(T1.node), counter, NULL > FROM Stack AS S1, Tree AS T1 > WHERE S1.node = T1.parent > AND S1.stack_top = current_top; > > -- delete rows from tree as they are used > DELETE FROM Tree > WHERE node = (SELECT node > FROM Stack > WHERE stack_top = current_top + 1); > -- housekeeping of stack pointers and counter > SET counter = counter + 1; > SET current_top = current_top + 1; > END; > ELSE > BEGIN -- pop the stack and set rgt value > UPDATE Stack > SET rgt = counter, > stack_top = -stack_top -- pops the stack > WHERE stack_top = current_top; > SET counter = counter + 1; > SET current_top = current_top - 1; > END; > END IF; > END WHILE; > -- SELECT node, lft, rgt FROM Stack; > -- the top column is not needed in the final answer > -- move stack contents to new tree table > END; |