This is a discussion on Design approaches about primary key within the Oracle Database forums, part of the Database Server Software category; --> Dear all, This is a general design question. Over the years, I've been creating an oracle database storing financial ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Dear all, This is a general design question. Over the years, I've been creating an oracle database storing financial data; about 60 tables. I consequently designed all tables this way: - Primary key ALWAYS is an integer, generated by a sequence (separate sequences for each table) - The "logic key" (e.g. a company Product ID) ist stored in an extra field, combined with an unique index - All reference constraints are always made upon the primary integer keys. So far I think this is a common approach. The main advantage is you can change the "logic key", without the need to do any changes in detail tables. However, if you look into the detail tables, you see a bunch of integer numbers which are completely meaniningless unless you join them with their master tables. Also if you want to insert or retrieve data, you always have to lookup or to join with other tables. I currently ask myself if using the integer primary keys is really worth the effort. So what about replacing the integer keys by "meaningful" keys and also storing these keys in the detail tables? I know, if a key needs to be changed, then detail tables need to be changed, too. But on the other side a lot of trouble when handling table data will be avoided. What's your opinion about this topic? Do you have any experiences with databases in these two design apporaches? Best regards Georg Scholz www.scholz-informatik.at |
| |||
| Georg Scholz wrote: > Dear all, > > This is a general design question. > > Over the years, I've been creating an oracle database storing financial > data; about 60 tables. > I consequently designed all tables this way: > > - Primary key ALWAYS is an integer, generated by a sequence (separate > sequences for each table) > - The "logic key" (e.g. a company Product ID) ist stored in an extra field, > combined with an unique index > - All reference constraints are always made upon the primary integer keys. > > So far I think this is a common approach. The main advantage is you can > change the "logic key", without the need to do any changes in detail tables. > > However, if you look into the detail tables, you see a bunch of integer > numbers which are completely meaniningless unless you join them with their > master tables. > Also if you want to insert or retrieve data, you always have to lookup or to > join with other tables. > > I currently ask myself if using the integer primary keys is really worth the > effort. > So what about replacing the integer keys by "meaningful" keys and also > storing these keys in the detail tables? > I know, if a key needs to be changed, then detail tables need to be > changed, too. > But on the other side a lot of trouble when handling table data will be > avoided. > > What's your opinion about this topic? > Do you have any experiences with databases in these two design apporaches? > > Best regards > > Georg Scholz > www.scholz-informatik.at I believe, as does Joe Celko, that there are few things worse than the "mindless" creation of surrogate keys (or as he calls them artificial keys). There are times when they are the best and most appropriate solution. But those situations are not as common as many believe. And the price paid includes extra constraints and extra indexes that serve no real purpose. While this link is in relationship to SQL Server it is worth reading: http://groups.google.com/groups?q=%2...le.com&rnum=11 (reconstruct if the long link breaks into multiple lines) I think what you are saying is that you have developed a habit ... and a habit that does not truly relate to the data. That would make it a bad habit and I'd suggest you break it. -- Daniel Morgan http://www.outreach.washington.edu/e...ad/oad_crs.asp http://www.outreach.washington.edu/e...oa/aoa_crs.asp damorgan@x.washington.edu (replace 'x' with a 'u' to reply) |
| |||
| Hi Georg, Artificial ("meaningless") keys have their place and should be used in implementation when appropriate, but certainly not unconsciously. General rules of thumb : 1) When the "real" key is not a composite key, artificial keys are rarely (but not always) justified. 2) In a stand-alone table (one with no children and parent tables), artificial keys should not be used. 3) The more fields the real key spans over (i.e the more "composite" it is), the more meaningful an artificial key will be. 4) The more often a composite key will appear in children tables and even grand-children tables, the more meaningful an artificial key will be. It obviously costs less space to propagate an integer artificial key down in the children/grandchildren/grand-grand children tables. Quries become easier too. I think a complete and correct datamodel is essential for you to make the right implementation choices regarding artificial keys. I've seen a lot of systems suffering from awkward composite keys appearing everywhere, mainly because those system had been growing over time, but I've never (before) seen a system suffering from too many artificial keys as yours do. - Kenneth Koenraadt On Thu, 27 Nov 2003 11:23:19 +0100, "Georg Scholz" <georg.scholz@vienna.at> wrote: >Dear all, > >This is a general design question. > >Over the years, I've been creating an oracle database storing financial >data; about 60 tables. >I consequently designed all tables this way: > >- Primary key ALWAYS is an integer, generated by a sequence (separate >sequences for each table) >- The "logic key" (e.g. a company Product ID) ist stored in an extra field, >combined with an unique index >- All reference constraints are always made upon the primary integer keys. > >So far I think this is a common approach. The main advantage is you can >change the "logic key", without the need to do any changes in detail tables. > >However, if you look into the detail tables, you see a bunch of integer >numbers which are completely meaniningless unless you join them with their >master tables. >Also if you want to insert or retrieve data, you always have to lookup or to >join with other tables. > >I currently ask myself if using the integer primary keys is really worth the >effort. >So what about replacing the integer keys by "meaningful" keys and also >storing these keys in the detail tables? >I know, if a key needs to be changed, then detail tables need to be >changed, too. >But on the other side a lot of trouble when handling table data will be >avoided. > >What's your opinion about this topic? >Do you have any experiences with databases in these two design apporaches? > >Best regards > >Georg Scholz >www.scholz-informatik.at > > |
| |||
| Using meaningless keys when meaningful ones exist is stupid because you create a condition that always requires a join to pick up the value that everyone understands. Bad design -- it comes from people copying textbook examples and using them in real life. My social security number is a key, so is my phone number (for a telco) -- they don't assign a sequential number on top of that. Use an integer when their is no key, or when the composite key would be way way too wide. Just my 2 cents .../Dom. "Georg Scholz" <georg.scholz@vienna.at> wrote in message news:<3fc5d07c$0$19266$91cee783@newsreader01.highw ay.telekom.at>... > Dear all, > > This is a general design question. > > Over the years, I've been creating an oracle database storing financial > data; about 60 tables. > I consequently designed all tables this way: > > - Primary key ALWAYS is an integer, generated by a sequence (separate > sequences for each table) > - The "logic key" (e.g. a company Product ID) ist stored in an extra field, > combined with an unique index > - All reference constraints are always made upon the primary integer keys. > > So far I think this is a common approach. The main advantage is you can > change the "logic key", without the need to do any changes in detail tables. > > However, if you look into the detail tables, you see a bunch of integer > numbers which are completely meaniningless unless you join them with their > master tables. > Also if you want to insert or retrieve data, you always have to lookup or to > join with other tables. > > I currently ask myself if using the integer primary keys is really worth the > effort. > So what about replacing the integer keys by "meaningful" keys and also > storing these keys in the detail tables? > I know, if a key needs to be changed, then detail tables need to be > changed, too. > But on the other side a lot of trouble when handling table data will be > avoided. > > What's your opinion about this topic? > Do you have any experiences with databases in these two design apporaches? > > Best regards > > Georg Scholz > www.scholz-informatik.at |
| |||
| Domenic G. wrote: > Using meaningless keys when meaningful ones exist is stupid because > you create a condition that always requires a join to pick up the > value that everyone understands. Bad design -- it comes from people > copying textbook examples and using them in real life. > > My social security number is a key, so is my phone number (for a > telco) -- they don't assign a sequential number on top of that. Use > an integer when their is no key, or when the composite key would be > way way too wide. > > Just my 2 cents .../Dom. > > > "Georg Scholz" <georg.scholz@vienna.at> wrote in message news:<3fc5d07c$0$19266$91cee783@newsreader01.highw ay.telekom.at>... > >>Dear all, >> >>This is a general design question. >> >>Over the years, I've been creating an oracle database storing financial >>data; about 60 tables. >>I consequently designed all tables this way: >> >>- Primary key ALWAYS is an integer, generated by a sequence (separate >>sequences for each table) >>- The "logic key" (e.g. a company Product ID) ist stored in an extra field, >>combined with an unique index >>- All reference constraints are always made upon the primary integer keys. >> >>So far I think this is a common approach. The main advantage is you can >>change the "logic key", without the need to do any changes in detail tables. >> >>However, if you look into the detail tables, you see a bunch of integer >>numbers which are completely meaniningless unless you join them with their >>master tables. >>Also if you want to insert or retrieve data, you always have to lookup or to >>join with other tables. >> >>I currently ask myself if using the integer primary keys is really worth the >>effort. >>So what about replacing the integer keys by "meaningful" keys and also >>storing these keys in the detail tables? >>I know, if a key needs to be changed, then detail tables need to be >>changed, too. >>But on the other side a lot of trouble when handling table data will be >>avoided. >> >>What's your opinion about this topic? >>Do you have any experiences with databases in these two design apporaches? >> >>Best regards >> >>Georg Scholz >>www.scholz-informatik.at Actually your telco assigns a natural key too. They call it an account number. And the reason they do is because different people, over time, have the same phone number. One more little bit of knowledge gained from working for Ma Bell. -- Daniel Morgan http://www.outreach.washington.edu/e...ad/oad_crs.asp http://www.outreach.washington.edu/e...oa/aoa_crs.asp damorgan@x.washington.edu (replace 'x' with a 'u' to reply) |
| |||
| On 27 Nov 2003, domenicg@hotmail.com wrote: > Using meaningless keys when meaningful ones exist is stupid > because you create a condition that always requires a join to > pick up the value that everyone understands. Basing one's keys on the whims of someone else's data storage is dangerous. > Bad design -- it comes from people copying textbook examples > and using them in real life. No, it comes from people creating a database from scratch. Remember, a meaningful key was, on its conception, a meaningless key to someone else. > My social security number is a key, It isn't guaranteed to be unique, and it certainly had no meaning on its creation for you. > so is my phone number (for a telco) So, what happens when I ask them to change my number to an unlisted one? > -- they don't assign a sequential number on top of that. I don't know how telco's design their database's but I'd bet they all look a bit different. -- Galen Boyer |
| |||
| Hi Georg: As you can see from these responses the answer is "it all depends". A couple of points about meaningful primary keys. 1. Can you guarantee the meaning will not change over time? Take a company with fifty offices and an offices table. If office_id is the primary key and you assign STL as the value for the StLouis office, what happens when that office is moved to Milwaukee? The "meaningful" key now is misleading at best. 2. Sometimes you just can't find a meaningful key. Consider a table of persons. What fields, known to you, could you possibly pick to uniquely identify a person? Not surname, not surname + firstname + middle initial. Not social security number or DNA since those are usually unavailable to you. 3. Sometimes you have a problem whether you choose meaningful or meaningless keys. If you had a table of countries a few years back, no matter whether the primary key was a sequence or the United Nations country code, when the Soviet Union split into multiple countries you had a problem. Van "Georg Scholz" <georg.scholz@vienna.at> wrote in message news:3fc5d07c$0$19266$91cee783@newsreader01.highwa y.telekom.at... > Dear all, > > This is a general design question. > > Over the years, I've been creating an oracle database storing financial > data; about 60 tables. > I consequently designed all tables this way: > > - Primary key ALWAYS is an integer, generated by a sequence (separate > sequences for each table) > - The "logic key" (e.g. a company Product ID) ist stored in an extra field, > combined with an unique index > - All reference constraints are always made upon the primary integer keys. > > So far I think this is a common approach. The main advantage is you can > change the "logic key", without the need to do any changes in detail tables. > > However, if you look into the detail tables, you see a bunch of integer > numbers which are completely meaniningless unless you join them with their > master tables. > Also if you want to insert or retrieve data, you always have to lookup or to > join with other tables. > > I currently ask myself if using the integer primary keys is really worth the > effort. > So what about replacing the integer keys by "meaningful" keys and also > storing these keys in the detail tables? > I know, if a key needs to be changed, then detail tables need to be > changed, too. > But on the other side a lot of trouble when handling table data will be > avoided. > > What's your opinion about this topic? > Do you have any experiences with databases in these two design apporaches? > > Best regards > > Georg Scholz > www.scholz-informatik.at > > |
| |||
| Galen Boyer wrote: > On 27 Nov 2003, domenicg@hotmail.com wrote: > > >>Using meaningless keys when meaningful ones exist is stupid >>because you create a condition that always requires a join to >>pick up the value that everyone understands. > > > Basing one's keys on the whims of someone else's data storage is > dangerous. > > >>Bad design -- it comes from people copying textbook examples >>and using them in real life. > > > No, it comes from people creating a database from scratch. > Remember, a meaningful key was, on its conception, a meaningless > key to someone else. > > >>My social security number is a key, > > > It isn't guaranteed to be unique, and it certainly had no meaning > on its creation for you. > > >>so is my phone number (for a telco) > > > So, what happens when I ask them to change my number to an > unlisted one? > > >>-- they don't assign a sequential number on top of that. > > > I don't know how telco's design their database's but I'd bet they > all look a bit different. Your post contains things with which I agree and those with which I disagree. You are correct about Telcos as I pointed out. And also corect about social security numbers. And I can't argue with a statement that uses the word 'whim'. But you are incorrect about 'textbook examples'. Too many databases are built based on some textbook example or cannabilizing a design from some other system rather than a serious analysis of the business and data integrity implications of the specific case. -- Daniel Morgan http://www.outreach.washington.edu/e...ad/oad_crs.asp http://www.outreach.washington.edu/e...oa/aoa_crs.asp damorgan@x.washington.edu (replace 'x' with a 'u' to reply) |
| |||
| Comment in-line. Van Messner wrote: > Hi Georg: > > As you can see from these responses the answer is "it all depends". A > couple of points about meaningful primary keys. > > 1. Can you guarantee the meaning will not change over time? Take a > company with fifty offices and an offices table. If office_id is the > primary key and you assign STL as the value for the StLouis office, what > happens when that office is moved to Milwaukee? The "meaningful" key now is > misleading at best. This is a poor example. If the entire office moves intact the key stays the same. If the business closes one office but opens a different one and wishes to retain the original records intact and separate from the new ones they just create a new PK and reassign the personnel and assets. The issue is not the PK value ... the issue is how management wishes to have the operational change reported in the future. It isn't a technology decision ... it is a management decision. > 2. Sometimes you just can't find a meaningful key. Consider a table of > persons. What fields, known to you, could you possibly pick to uniquely > identify a person? Not surname, not surname + firstname + middle initial. > Not social security number or DNA since those are usually unavailable to > you. I'd disagree on SSN but it is not guaranteed to be unique and of zero value to an international organization. In the case you describe the only valid solution is a surrogate "PERSON_ID" value. > 3. Sometimes you have a problem whether you choose meaningful or > meaningless keys. If you had a table of countries a few years back, no > matter whether the primary key was a sequence or the United Nations country > code, when the Soviet Union split into multiple countries you had a problem. Not at all. It was easily handled. Records related to the USSR stopped having inserts and updates and new records were created for the new entities. This is no different from a wholesale-distribution environment where some customers go out of business and new customers are developed. There is no technological or business reason to treat them differently. -- Daniel Morgan http://www.outreach.washington.edu/e...ad/oad_crs.asp http://www.outreach.washington.edu/e...oa/aoa_crs.asp damorgan@x.washington.edu (replace 'x' with a 'u' to reply) |
| ||||
| "Daniel Morgan" <damorgan@x.washington.edu> wrote in message news:1070043650.390694@yasure... > Comment in-line. > > Van Messner wrote: > > > Hi Georg: > > > > As you can see from these responses the answer is "it all depends". A > > couple of points about meaningful primary keys. > > > > 1. Can you guarantee the meaning will not change over time? Take a > > company with fifty offices and an offices table. If office_id is the > > primary key and you assign STL as the value for the StLouis office, what > > happens when that office is moved to Milwaukee? The "meaningful" key now is > > misleading at best. > > This is a poor example. If the entire office moves intact the key stays > the same. If the business closes one office but opens a different one > and wishes to retain the original records intact and separate from the > new ones they just create a new PK and reassign the personnel and assets. > > The issue is not the PK value ... the issue is how management wishes to > have the operational change reported in the future. It isn't a > technology decision ... it is a management decision. A matter of design and method I suppose. You don't like public synonyms, I don't like to change primary keys. > > > 2. Sometimes you just can't find a meaningful key. Consider a table of > > persons. What fields, known to you, could you possibly pick to uniquely > > identify a person? Not surname, not surname + firstname + middle initial. > > Not social security number or DNA since those are usually unavailable to > > you. > > I'd disagree on SSN but it is not guaranteed to be unique and of zero > value to an international organization. In the case you describe the > only valid solution is a surrogate "PERSON_ID" value. That's exactly the point. As I said "sometimes you just can't find a meaningful key" . > > > 3. Sometimes you have a problem whether you choose meaningful or > > meaningless keys. If you had a table of countries a few years back, no > > matter whether the primary key was a sequence or the United Nations country > > code, when the Soviet Union split into multiple countries you had a problem. > > Not at all. It was easily handled. Records related to the USSR stopped > having inserts and updates and new records were created for the new > entities. > > This is no different from a wholesale-distribution environment where > some customers go out of business and new customers are developed. There > is no technological or business reason to treat them differently. > > -- > Daniel Morgan > http://www.outreach.washington.edu/e...ad/oad_crs.asp > http://www.outreach.washington.edu/e...oa/aoa_crs.asp > damorgan@x.washington.edu > (replace 'x' with a 'u' to reply) > |