Unix Technical Forum

Updating data in table depending on data in same table

This is a discussion on Updating data in table depending on data in same table within the Oracle Miscellaneous forums, part of the Oracle Database category; --> Hey everyone, First of all apologies for posting this in the oracle.server section I meant to post it in ...


Go Back   Unix Technical Forum > Database Server Software > Oracle Database > Oracle Miscellaneous

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-08-2008, 11:16 AM
The J Man
 
Posts: n/a
Default Updating data in table depending on data in same table

Hey everyone,

First of all apologies for posting this in the oracle.server section I
meant to post it in here first.

I'm still a relative newb so go easy please!

I've inherited a system from someone that is in a bit of a mess and I
need to tidy it up. It's an Oracle 10g database.

The part i have to clean up basically has two tables.
Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT,
LANGUAGES.
Table B attributes are: SOURCEID, LANGID, ROLE, STATUS.

Now the way this system is supposed to work is that a row is entered
into table A and X amount of languages are added to the LANGUAGES
field. This field is then split up into a row for each language into
table B with the corresponding SOURCEID.
e.g.
Table A:
123 -- Homer -- lazy -- HR -- French;Spanish;German

Table B:
123 -- French -- manager -- approved
123 -- Spanish -- admin -- pending
123 -- German -- teaboy -- proposed

This system had been working fine until a new person started and
instead of just adding a new language to the original row in table A
she added 1000+ rows that are identical to those already in the table
except for the fact the language is Japanese. So now the table looks
like
Table A:
123 -- Homer -- lazy -- HR -- French;Spanish;German
456 -- Homer -- lazy -- HR -- Japanese;

Table B:
123 -- French -- manager -- approved
123 -- Spanish -- admin -- pending
123 -- German -- teaboy -- proposed
456 -- Japanese -- janitor -- approved

I have been given the unenviable task of fixing this by, identifying
all terms that have identical attributes (except for the LANGUAGES
attribute), changing the SOURCEID of the duplicate in table B to the
master term SOURCEID (so in the example above, in table B change the
row 456 -- Japanese -- janitor -- approved to 123 -- Japanese --
janitor -- approved), tagging on Japanese onto the master term (in the
example above, in table A change the row 123 -- Homer -- lazy -- HR --
French;Spanish;German to 123 -- Homer -- lazy -- HR --
French;Spanish;German;Japanese and then deleting the duplicate entry
in table A (in the example above, delete the row 456 -- Homer -- lazy
-- HR -- Japanese

Thanks for sticking with me this far (if you have!). Luckily enough I
know for a fact that all the master terms are in the SOURCEID range 1
to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so
identifying the terms is not a problem. I just ran an intersect query
and found my ~1000 culprit terms. As you probably know though, from
this I am not getting the SOURCEIDS of either row so basically i have
been getting the SOURCEIDS of the identified terms and running the
following command manually.

update target set target.SOURCEID=MASTER_NUMBER where
target.SOURCEID=DUPLICATE_NUMBER;
update source set
source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian
Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French
Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER;
DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER;

As you can tell my wrists are about to fall off with the RSI of
continually trying to find out the IDs and then run these commands.
SQL is not really my game so i am not sure whether i would be able to
use some procedure or cursor or something along those lines... I have
already wasted an awful lot of time trying to read of some way to help
me by running different commands but i am very quickly running out of
time...

Any help at all would be appreciated or even just a big fat no to say
you're gonna have to put up with it and just do it manually.

Thank you reading all this and I hope that I have explained it clearly
enough,
Jonathan.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-08-2008, 11:16 AM
DA Morgan
 
Posts: n/a
Default Re: Updating data in table depending on data in same table

The J Man wrote:
> Hey everyone,
>
> First of all apologies for posting this in the oracle.server section I
> meant to post it in here first.
>
> I'm still a relative newb so go easy please!
>
> I've inherited a system from someone that is in a bit of a mess and I
> need to tidy it up. It's an Oracle 10g database.
>
> The part i have to clean up basically has two tables.
> Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT,
> LANGUAGES.
> Table B attributes are: SOURCEID, LANGID, ROLE, STATUS.
>
> Now the way this system is supposed to work is that a row is entered
> into table A and X amount of languages are added to the LANGUAGES
> field. This field is then split up into a row for each language into
> table B with the corresponding SOURCEID.
> e.g.
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed
>
> This system had been working fine until a new person started and
> instead of just adding a new language to the original row in table A
> she added 1000+ rows that are identical to those already in the table
> except for the fact the language is Japanese. So now the table looks
> like
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
> 456 -- Homer -- lazy -- HR -- Japanese;
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed
> 456 -- Japanese -- janitor -- approved
>
> I have been given the unenviable task of fixing this by, identifying
> all terms that have identical attributes (except for the LANGUAGES
> attribute), changing the SOURCEID of the duplicate in table B to the
> master term SOURCEID (so in the example above, in table B change the
> row 456 -- Japanese -- janitor -- approved to 123 -- Japanese --
> janitor -- approved), tagging on Japanese onto the master term (in the
> example above, in table A change the row 123 -- Homer -- lazy -- HR --
> French;Spanish;German to 123 -- Homer -- lazy -- HR --
> French;Spanish;German;Japanese and then deleting the duplicate entry
> in table A (in the example above, delete the row 456 -- Homer -- lazy
> -- HR -- Japanese
>
> Thanks for sticking with me this far (if you have!). Luckily enough I
> know for a fact that all the master terms are in the SOURCEID range 1
> to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so
> identifying the terms is not a problem. I just ran an intersect query
> and found my ~1000 culprit terms. As you probably know though, from
> this I am not getting the SOURCEIDS of either row so basically i have
> been getting the SOURCEIDS of the identified terms and running the
> following command manually.
>
> update target set target.SOURCEID=MASTER_NUMBER where
> target.SOURCEID=DUPLICATE_NUMBER;
> update source set
> source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian
> Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French
> Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER;
> DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER;
>
> As you can tell my wrists are about to fall off with the RSI of
> continually trying to find out the IDs and then run these commands.
> SQL is not really my game so i am not sure whether i would be able to
> use some procedure or cursor or something along those lines... I have
> already wasted an awful lot of time trying to read of some way to help
> me by running different commands but i am very quickly running out of
> time...
>
> Any help at all would be appreciated or even just a big fat no to say
> you're gonna have to put up with it and just do it manually.
>
> Thank you reading all this and I hope that I have explained it clearly
> enough,
> Jonathan.


What you have described above, if I understand it correctly, is a
relational horror story. What happened should be technically impossible.
While you certainly need to clean up the mess the LANGUAGES column
serves no useful purpose as the information is available, properly
stored in table B.

Also be aware that some of your column names are Oracle reserved words.

NAME and ROLE jump immediately to mind.

But this should start you off:

SELECT col1, col2, col3, col4, COUNT(*)
FROM table
GROUP BY col1, col2, col3, col4
HAVING COUNT(*) > 1;
--
Daniel A. Morgan
University of Washington
damorgan@x.washington.edu
(replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-08-2008, 11:16 AM
Ed Prochak
 
Posts: n/a
Default Re: Updating data in table depending on data in same table

On Feb 8, 2:14 pm, "The J Man" <jonathan.kee...@gmail.com> wrote:
> Hey everyone,
>
> First of all apologies for posting this in the oracle.server section I
> meant to post it in here first.
>
> I'm still a relative newb so go easy please!
>
> I've inherited a system from someone that is in a bit of a mess and I
> need to tidy it up. It's an Oracle 10g database.
>
> The part i have to clean up basically has two tables.
> Table A attributes are: SOURCEID, NAME, DESCRIPTION, DEPARTMENT,
> LANGUAGES.
> Table B attributes are: SOURCEID, LANGID, ROLE, STATUS.
>
> Now the way this system is supposed to work is that a row is entered
> into table A and X amount of languages are added to the LANGUAGES
> field. This field is then split up into a row for each language into
> table B with the corresponding SOURCEID.
> e.g.
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed


it is not clear how B gets populated from A. Table A doesn't seems
like a staging table as I first thought. There is obviously more going
on in the process. So is it manual or automated? What are the rules?
>
> This system had been working fine until a new person started and
> instead of just adding a new language to the original row in table A
> she added 1000+ rows that are identical to those already in the table
> except for the fact the language is Japanese. So now the table looks
> like
> Table A:
> 123 -- Homer -- lazy -- HR -- French;Spanish;German
> 456 -- Homer -- lazy -- HR -- Japanese;
>
> Table B:
> 123 -- French -- manager -- approved
> 123 -- Spanish -- admin -- pending
> 123 -- German -- teaboy -- proposed
> 456 -- Japanese -- janitor -- approved
>
> I have been given the unenviable task of fixing this


The first error this shows is that using an ID column as the PK buys
you ZERO data integrity.
The second error this shows is tha application fails to find the
related data in table A.
yes you have a real mess.

So go to your backup from the night before and start over. reload
those two tables and tell the entry person to do it right this time
(it's actually not her fault, but she still has to reenter any data
from that day.) You do do backups don't you?
(By now this solution is too late, you let too much other stuff get
inserted. time for plan B)

> ... by, identifying
> all terms that have identical attributes (except for the LANGUAGES
> attribute),


But the only common attribute is the source ID between table B and
table A.

> ... changing the SOURCEID of the duplicate in table B to the
> master term SOURCEID (so in the example above, in table B change the
> row 456 -- Japanese -- janitor -- approved to 123 -- Japanese --
> janitor -- approved), tagging on Japanese onto the master term (in the
> example above, in table A change the row 123 -- Homer -- lazy -- HR --
> French;Spanish;German to 123 -- Homer -- lazy -- HR --
> French;Spanish;German;Japanese and then deleting the duplicate entry
> in table A (in the example above, delete the row 456 -- Homer -- lazy
> -- HR -- Japanese
>
> Thanks for sticking with me this far (if you have!). Luckily enough I
> know for a fact that all the master terms are in the SOURCEID range 1
> to 2000 and the duplicates are in the SOURCEID range 2500 to 6000. so


so this was either a bulk load, or manual data entry over a period of
time.
in either case the process is flawed.

> identifying the terms is not a problem. I just ran an intersect query
> and found my ~1000 culprit terms. As you probably know though, from
> this I am not getting the SOURCEIDS of either row so basically i have
> been getting the SOURCEIDS of the identified terms and running the
> following command manually.
>
> update target set target.SOURCEID=MASTER_NUMBER where
> target.SOURCEID=DUPLICATE_NUMBER;
> update source set
> source.LANGSREQUIRED='French;Spanish;German;Italia n;Brazilian
> Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French
> Canadian;Dutch;Korean;Japanese;' where source.SOURCEID=MASTER_NUMBER;
> DELETE FROM source WHERE sourceid = DUPLICATE_NUMBER;


Where's table A and table B??? which is which??

>
> As you can tell my wrists are about to fall off with the RSI of
> continually trying to find out the IDs and then run these commands.
> SQL is not really my game so i am not sure whether i would be able to
> use some procedure or cursor or something along those lines... I have
> already wasted an awful lot of time trying to read of some way to help
> me by running different commands but i am very quickly running out of
> time...
>
> Any help at all would be appreciated or even just a big fat no to say
> you're gonna have to put up with it and just do it manually.
>
> Thank you reading all this and I hope that I have explained it clearly
> enough,
> Jonathan.


Solutions:
1. A PL/SQL procedure would be the fastest IF you knew PL/SQL.

Given you feel less than familiar with SQL, might I ask how you got
this assignment? No one else there knows Oracle?

2. If you can do it manually, then why not spool out the list of new/
old ids, and languages (ie just a SELECT query) and edit the file to
make the UPDATE statements you need.

You do know how to use an editor like VI with macro commands like
global substitute, right?
(If you say no to that, then I pity you since you clearly work for an
incompetent manager who assigns people to tasks they cannot pereform.)


long term, either change this application or get another job.

ed








Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-08-2008, 11:16 AM
The J Man
 
Posts: n/a
Default Re: Updating data in table depending on data in same table

On Feb 8, 8:53 pm, "Ed Prochak" <edproc...@gmail.com> wrote:
> On Feb 8, 2:14 pm, "The J Man" <jonathan.kee...@gmail.com> wrote:
>


Thanks for both of the replies...

>
>
> it is not clear how B gets populated from A. Table A doesn't seems
> like a staging table as I first thought. There is obviously more going
> on in the process. So is it manual or automated? What are the rules?
>


Sorry, my explanation was not too clear and i was using sample data
rather that the actual data, my bad.
Table A is basically a list of English terms. Each term has a number
of translations it needs to be translated into. This entry goes
through a few status changes and when it gets to approved a row is
created in table B for each language with the SOURCEID of the English
term. Table A is manually added. Table B is automatically populated
and then manually edited.( i.e. the translation of the actual English
term is added)

> The first error this shows is that using an ID column as the PK buys
> you ZERO data integrity.
> The second error this shows is tha application fails to find the
> related data in table A.
> yes you have a real mess.
>


Well not really, because we may want duplicate entries in the table as
long as they have a valid reason for doing so. This is why this
problem occurred.
When the user enters her row she is warned that an entry for this
already exists and asks her are they sure they want to add the new
row. This is for the scenario where an English word may have different
contexts in different languages. I may have given the wrong impression
of this as i did not clearly explain what i was actually doing, i
tried to ask a generic question.

> So go to your backup from the night before and start over. reload
> those two tables and tell the entry person to do it right this time
> (it's actually not her fault, but she still has to reenter any data
> from that day.) You do do backups don't you?
> (By now this solution is too late, you let too much other stuff get
> inserted. time for plan B)


as you said, too late. Unfortunately.

>
> But the only common attribute is the source ID between table B and
> table A.


I meant common attributes between tuples in table A. So any tuple
within table A that has the same name and description.

>
> so this was either a bulk load, or manual data entry over a period of
> time.
> in either case the process is flawed.
>


It was a manual data entry over time. Could you tell me why it is
flawed?


> > update TARGET set TARGET.SOURCEID=MASTER_NUMBER where
> > TARGET.SOURCEID=DUPLICATE_NUMBER;
> > update SOURCE set
> > SOURCE .LANGSREQUIRED='French;Spanish;German;Italian;Braz ilian
> > Portuguese;Simplified Chinese;Traditional Chinese;Swedish;French
> > Canadian;Dutch;Korean;Japanese;' where SOURCE .SOURCEID=MASTER_NUMBER;
> > DELETE FROM SOURCE WHERE SOURCEID= DUPLICATE_NUMBER;

>
> Where's table A and table B??? which is which??
>


Apologies, i copied that straight from my SQL Developer, table A is
SOURCE and table B is TARGET.

> Solutions:
> 1. A PL/SQL procedure would be the fastest IF you knew PL/SQL.
>


I think i will go down this road. I was thinking of using two cursors,
one to get all entries under 2000, and another to see if there are any
dupes. If there are, update the rows and then fetch the next row.

> Given you feel less than familiar with SQL, might I ask how you got
> this assignment? No one else there knows Oracle?


Bingo. When i said i was less than familiar, i meant apart from
creating tables, triggers, sequences etc. I would have no in depth
experience of using cursors or anything like that. (which is about to
change!)


> You do know how to use an editor like VI with macro commands like
> global substitute, right?
> (If you say no to that, then I pity you since you clearly work for an
> incompetent manager who assigns people to tasks they cannot pereform.)


Yes, i do know how to that. But i do have an incompetent manager.

>
> long term, either change this application or get another job.
>


Thank you for your help,
J.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-08-2008, 11:16 AM
Ed Prochak
 
Posts: n/a
Default Re: Updating data in table depending on data in same table

On Feb 9, 6:21 am, "The J Man" <jonathan.kee...@gmail.com> wrote:
> On Feb 8, 8:53 pm, "Ed Prochak" <edproc...@gmail.com> wrote:
>

[]
> Sorry, my explanation was not too clear and i was using sample data
> rather that the actual data, my bad.
> Table A is basically a list of English terms. Each term has a number
> of translations it needs to be translated into. This entry goes
> through a few status changes and when it gets to approved a row is
> created in table B for each language with the SOURCEID of the English
> term. Table A is manually added. Table B is automatically populated
> and then manually edited.( i.e. the translation of the actual English
> term is added)
>
> > The first error this shows is that using an ID column as the PK buys
> > you ZERO data integrity.
> > The second error this shows is tha application fails to find the
> > related data in table A.
> > yes you have a real mess.

>
> Well not really, because we may want duplicate entries in the table as
> long as they have a valid reason for doing so. This is why this
> problem occurred.
> When the user enters her row she is warned that an entry for this
> already exists and asks her are they sure they want to add the new
> row. This is for the scenario where an English word may have different
> contexts in different languages. I may have given the wrong impression
> of this as i did not clearly explain what i was actually doing, i
> tried to ask a generic question.
>

[]
> > so this was either a bulk load, or manual data entry over a period of
> > time.
> > in either case the process is flawed.

>
> It was a manual data entry over time. Could you tell me why it is
> flawed?


it is flawed because the design allows these duplicates. A better
design would have dealt with this diferently. Not knowing all your
business requirements I won't suggest a solution here (and I don't
have one off the top of my head). Another easy to point out flaw is
the DEPARTMENT attribute in table A. That really should be another
table, or (since dups are allowed in this table) multiple rows.

You did not come here for a design review, but a big part of what I
posted was just that. I apologize if I offended you.
>

[]
> > Solutions:
> > 1. A PL/SQL procedure would be the fastest IF you knew PL/SQL.

>
> I think i will go down this road. I was thinking of using two cursors,
> one to get all entries under 2000, and another to see if there are any
> dupes. If there are, update the rows and then fetch the next row.
>
> > Given you feel less than familiar with SQL, might I ask how you got
> > this assignment? No one else there knows Oracle?

>
> Bingo. When i said i was less than familiar, i meant apart from
> creating tables, triggers, sequences etc. I would have no in depth
> experience of using cursors or anything like that. (which is about to
> change!)


Okay, sounds like a good plan. Any assignment where you expand you
knowledge is good.
>
> > You do know how to use an editor like VI with macro commands like
> > global substitute, right?
> > (If you say no to that, then I pity you since you clearly work for an
> > incompetent manager who assigns people to tasks they cannot pereform.)

>
> Yes, i do know how to that. But i do have an incompetent manager.


Another mea culpa. I assumed you were like some of the posters who can
barely write a SELECT query but get assigned projects like this. I
still think some table redesign is in order, but that doesn't solve
your immediate problem.
>
>
>
> > long term, either change this application or get another job.

>
> Thank you for your help,
> J.


I hope I did. Now it sounds more like you are a good fit there. so
keep the job, and thank you for not taking offense at my less than
complimentary remarks.
Ed

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-08-2008, 11:16 AM
The J Man
 
Posts: n/a
Default Re: Updating data in table depending on data in same table

On Feb 9, 2:10 pm, "Ed Prochak" <edproc...@gmail.com> wrote:
> On Feb 9, 6:21 am, "The J Man" <jonathan.kee...@gmail.com> wrote:
>
>
> it is flawed because the design allows these duplicates. A better
> design would have dealt with this diferently. Not knowing all your
> business requirements I won't suggest a solution here (and I don't
> have one off the top of my head). Another easy to point out flaw is
> the DEPARTMENT attribute in table A. That really should be another
> table, or (since dups are allowed in this table) multiple rows.
>
> You did not come here for a design review, but a big part of what I
> posted was just that. I apologize if I offended you.
>


No, no offense taken. I realize that any design that allows dupes in a
database sucks but the end user (read stupid manager) won out in this
case.

>
> Okay, sounds like a good plan. Any assignment where you expand you
> knowledge is good.
>


Yes it seems to have worked on the test tables i have done out so its
just a matter of running it now on the production database, all the
while hovering over the rollback button!

>
> Another mea culpa. I assumed you were like some of the posters who can
> barely write a SELECT query but get assigned projects like this. I
> still think some table redesign is in order, but that doesn't solve
> your immediate problem.
>


Nah, i realize it probably came off like that because i fluffed around
the facts of what it was actually trying to do. A re design could be
in order as well though.

>
> I hope I did. Now it sounds more like you are a good fit there. so
> keep the job, and thank you for not taking offense at my less than
> complimentary remarks.
> Ed


No problem, thanks again, I'd rather get any replies than none at all.
J.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:04 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com