This is a discussion on Multiple ids in a field within the SQL Server forums, part of the Microsoft SQL Server category; --> I have a book cataloge where I have an author id linked up to the book. However, I'm finding ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I have a book cataloge where I have an author id linked up to the book. However, I'm finding that some books have mulitple authors, and I'm wondering if there is a way to add more than one id to the author id field. Or do I have to have many author id fields? Any way around that? Thanks, Bill *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
| |||
| Anith Sen wrote: > GMann@dublin.com asked: I'm confused about the practical aspects of declaring a primary key on > one of the "intersection" tables. (BookAuthors, in this case.) Are there > any practical benefits or is it strictly a philosophical thing? << I'd say that there *is* no benefit of putting a primary key on *one column* of an intersection table. The semantics would enforce that an Author could only write one book, or that a book could have no more than one author, depending on which column you chose as a primary key. For some other application it might be correct, but then you would only need a parent-child relationship, and an intersection table wouldn't be necessary. If you were to define a multi-column primary key, it would prevent mistaken duplicate entries. If you defined the primary key as on (author, book) it would also help queries that wanted all books for a given author. If your application did queries the other way, to find all the authors of a given book, that index would not help, so I would add a second index (not unique) on book. > Declaring a column/set of columns in a table as primary key is definitely > beneficial. In fact, it is quite suicidal not to have one. Except as it enforces unrealistic semantics and destroys the purpose of an intersection table (which is to model a many-to-many relationship, as described above... > >> How does it affect performance? Does it introduce overhead, or does it > speed performance do to indexing? Generally, an appropriate index will help query speed, and will potentially hurt updates and inserts because the index needs to be maintained to reflect data changes. However, to the extent that row changes do not involve indexed columns, the index helps to find the row, and needs no mainenance. > Strictly speaking, having a primary in a table has nothing to do with > performance. Keys are logical concepts and have no performance connotations. But realistically speaking, they do, because they are implemented via a unique index, that definitely provides fast query lookup as well as enforcing unique values. > But, note that SQL Server, by default, implements an index on keys. Indexes > are physical artifacts and have performance implications. However, there is > no proven and consistent statistics which shows that having a key in a table > adversely or favorably affect the performance. You say "by default". Is there a way to have a primary key *without* an index? I don't think so. Otherwise the DBMS would have to do a table scan to verify an insert was OK. Can you show me one DBMS that can enforce primary keys and doesn't create a unique index to do it? Joe Weinstein at BEA |
| |||
| >> Except as it enforces unrealistic semantics and destroys the purpose of an intersection table << The DDL in my post has a two-column primary key on the table BookAuthors. What 'unrealistic' semantics did it enforce? The decomposition I had removes the transitive dependencies of books on authors and gives a clean 3NF projection. In what way did it destroy the 'purpose' of that table? >> ...that definitely provides fast query lookup as well as enforcing unique values. << Not as a rule. Just because you have an index does not mean that the optimizer uses it all the time. The heuristics used by the optimizer is not all documented and the empirical algorithms which generate the execution plan depend on several factors, not solely on an index on the primary key. >> You say "by default". Is there a way to have a primary key ... << Actually it was my mistake, I meant, SQL Server, by default, implements a clustered index on keys. However, the point was that the key vs. index concept is simply a logical vs. physical concept. At the logical level, the user should not be even concerned about what index will be implemented by the DBMS in the physical level. Depending on the extensions provided by the DBMS, you can tune your queries with required indexes, hints, force orders etc. Performance considerations should not corrupt the key selection decisions. In a well-crafted RDBMS, key declaration is a logical modeling aspect which should not be overlapped with indexing, which is a physical implementation matter. Keys are for logical integrity and indexes are for physical performance. However, the fact of the matter being, no DBMS provides or assumes a clean separation of logical and physical levels, it may be worthwhile not to confuse an index with a key and/or vice-versa. -- - Anith ( Please reply to newsgroups only ) |
| |||
| Anith Sen wrote: > >> Except as it enforces unrealistic semantics and destroys the purpose of > an intersection table << > > The DDL in my post has a two-column primary key on the table BookAuthors. > What 'unrealistic' semantics did it enforce? None! I was describing the unrealism/destruction caused by defining a primary key on *one* column of an interscection table. > The decomposition I had removes > the transitive dependencies of books on authors and gives a clean 3NF > projection. In what way did it destroy the 'purpose' of that table? Again none. Sorry if I was unclear. I said that a one-column primary key would destroy the purpose of an intersection table. > >> ...that definitely provides fast query lookup as well as enforcing unique > values. << > > Not as a rule. Just because you have an index does not mean that the > optimizer uses it all the time. The heuristics used by the optimizer is not > all documented and the empirical algorithms which generate the execution > plan depend on several factors, not solely on an index on the primary key. Here I'd disagree with you. As a rule, the unique index does help with queries that suit a unique index, such as one-row searches and most small-range queries on the key. If/when the query is unsuitable for a unique index, or when the undocumented query engine is broken, an index may/will not be used, but you could say the same for the DBMS itself. What is an "empirical algorithm"? > >> You say "by default". Is there a way to have a primary key ... << > > Actually it was my mistake, I meant, SQL Server, by default, implements a > clustered index on keys. Ok, I think I understand. That means that the 'default DBMS' for this discussion is MS SQLServer. That would imply that you have at least one alternative DBMS to the default, that does not implement a primary key constraint with a unique index. Can you name it? > At the logical level, the user should not be even > concerned about what index will be implemented by the DBMS in the physical > level. I agree. > Depending on the extensions provided by the DBMS, you can tune your > queries with required indexes, hints, force orders etc. Performance > considerations should not corrupt the key selection decisions. Ok, but they may certainly conflict if you want one sort of index for performance, but it conflicts with the index the DBMS will make if you define a primary key as such. I'm still jumping up and down on my limb of claiming that all DBMSes create a unique index if you specifiy a primary key, so in the real world you need to know about both, soon in the design process. > In a > well-crafted RDBMS, key declaration is a logical modeling aspect which > should not be overlapped with indexing, which is a physical implementation > matter. I would happily take the fall from that limb I'm jumping on, if you can name one well-crafted RDBMS as you define it. > Keys are for logical integrity and indexes are for physical > performance. That's true, except for the fact that all DBMSes use indexes to implement logical consistency constraints (at least unique keys). > However, the fact of the matter being, no DBMS provides or > assumes a clean separation of logical and physical levels, it may be > worthwhile not to confuse an index with a key and/or vice-versa. Sure. They aren't the same thing. A key need not be enforced by anything except application code, and an index can be on any column or columns in any order, independent of operational value. I'm just thinking that it would be confusing not to realize the ramifications of defining a primary key in SQL: a unique index. (This is only true for MS, Sybase, Informix, DB2, Oracle, Ingres, Postgres, Pointbase, Cloudscape, times10, mySQL, Tandem, so we should definitely document the others that don't) Joe > > -- > - Anith > ( Please reply to newsgroups only ) |
| |||
| >> I was describing the unrealism/destruction caused by defining a primary key on *one* column of an interscection table. << I understand now. In that case you wouldn't even need an intersection table for a 1-to-many relationship. All you need is an author_id column in your Books table as a foreign key to the Author table. >> As a rule, the unique index does help with queries that suit a unique index, such as one-row searches and most small-range queries on the key. << Not necessarily all the time, an example in SQL Server is to use a cross-join with a derived table of numbers with a multi-table join where the key column of the table is non-clustered. For examples search the archives of msnews.public.programming. There are instances when the optimizer decides to scan the table, even when there is an explicit index on the sarg. The reason for this is the cost-efficiency of scanning a table (depending on many factors including the magnitude of data) in certain cases may outperform the usage of an index. Another common phenomenon is the optimizer disregarding the index due to recompilations, outdated statistics etc. And FYI http://dictionary.reference.com/search?q=empirical >> That means that the 'default DBMS' for this discussion is MS SQLServer. That would imply that you have at least one alternative DBMS to the default, that does not implement a primary key constraint with a unique index. << No, I think you misinterpreted; let me clarify. I meant when you declare a key in a SQL Server table, the created index by default is a clustered index. For example, 1) CREATE TABLE tbl (col1 INT NOT NULL PRIMARY KEY) 2) CREATE TABLE tbl (col1 INT NOT NULL) GO ALTER TABLE tbl ADD CONSTRAINT PK_tbl PRIMARY KEY (col1) Alternatively, you can override this default making it a non-clustered index by explicitly using ALTER TABLE statement. I had no other DBMSs in my mind while stating it. -- - Anith ( Please reply to newsgroups only ) |
| ||||
| Guinness Mann (GMann@dublin.com) writes: > I'm confused about the practical aspects of declaring a primary key on > one of the "intersection" tables. (BookAuthors, in this case.) Are > there any practical benefits or is it strictly a philosophical thing? Yes, there are practical benefits. What's the use with having multiple entries that Isaac Asimov wrote "Foundation"? > How does it affect performance? Does it introduce overhead, or does it > speed performance do to indexing? There is always a trade-off with indexes. Indexes can speed up queries to access the data enormously, but there is a little price for maintaing the index at insert, updates and deletes. For tabels like Bookauthors, I often find myself defining two indexes. One (Book, Author) and one (Author, Book), as access often are in both directions. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp |