vBulletin Search Engine Optimization
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I've been doing a bit of reading and have read in quite a few places that an identity column is a good clustered index and that all or at least most tables should have a clustered index. The tool I used to generate tables made them all with non clustered indexes so I would like to drop all of them and generate clustered indexes. So my questions is a) good idea? and b) how? There are foreign key references to most of them so those would need to be dropped first and then re-created after the clustered one was created and that could cascade (I think?) Any existing scripts out there that might do this? I found something similar and modified it, the sql is included below. This gives me the list of all the columns I need, I just need to get the foreign keys for each from here before each one and generate all the create/drop scripts. All the columns I am looking to do this for are called "Id" making this somewhat simpler. I'm just looking to incrementally make the SQL side better and don't want to rewrite a bunch of application level code to make the column names ISO compliant, etc. /* -- Returns whether the column is ASC or DESC CREATE FUNCTION dbo.GetIndexColumnOrder ( @object_id INT, @index_id TINYINT, @column_id TINYINT ) RETURNS NVARCHAR(5) AS BEGIN DECLARE @r NVARCHAR(5) SELECT @r = CASE INDEXKEY_PROPERTY ( @object_id, @index_id, @column_id, 'IsDescending' ) WHEN 1 THEN N' DESC' ELSE N'' END RETURN @r END -- Returns the list of columns in the index CREATE FUNCTION dbo.GetIndexColumns ( @table_name SYSNAME, @object_id INT, @index_id TINYINT ) RETURNS NVARCHAR(4000) AS BEGIN DECLARE @colnames NVARCHAR(4000), @thisColID INT, @thisColName SYSNAME SET @colnames = INDEX_COL(@table_name, @index_id, 1) + dbo.GetIndexColumnOrder(@object_id, @index_id, 1) SET @thisColID = 2 SET @thisColName = INDEX_COL(@table_name, @index_id, @thisColID) + dbo.GetIndexColumnOrder(@object_id, @index_id, @thisColID) WHILE (@thisColName IS NOT NULL) BEGIN SET @thisColID = @thisColID + 1 SET @colnames = @colnames + ', ' + @thisColName SET @thisColName = INDEX_COL(@table_name, @index_id, @thisColID) + dbo.GetIndexColumnOrder(@object_id, @index_id, @thisColID) END RETURN @colNames END CREATE VIEW dbo.vAllIndexes AS begin SELECT TABLE_NAME = OBJECT_NAME(i.id), INDEX_NAME = i.name, COLUMN_LIST = dbo.GetIndexColumns(OBJECT_NAME(i.id), i.id, i.indid), IS_CLUSTERED = INDEXPROPERTY(i.id, i.name, 'IsClustered'), IS_UNIQUE = INDEXPROPERTY(i.id, i.name, 'IsUnique'), FILE_GROUP = g.GroupName FROM sysindexes i INNER JOIN sysfilegroups g ON i.groupid = g.groupid WHERE (i.indid BETWEEN 1 AND 254) -- leave out AUTO_STATISTICS: AND (i.Status & 64)=0 -- leave out system tables: AND OBJECTPROPERTY(i.id, 'IsMsShipped') = 0 end */ SELECT v.* FROM dbo.vAllIndexes v INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS T ON T.CONSTRAINT_NAME = v.INDEX_NAME AND T.TABLE_NAME = v.TABLE_NAME AND T.CONSTRAINT_TYPE = 'PRIMARY KEY' AND v.COLUMN_LIST = 'Id' AND v.IS_CLUSTERED = 0 ORDER BY v.TABLE_NAME |
| |||
| It's OK to have a clustered index that is seperate from your nonclustered primary key, even if the two indexes cover the same columns. In fact, I usually build my indexes in this way in case I ever have to move the clustered index to a different column and I don't want to mess with my established foreign key constraints. That being said, I would simply add the clustered index to each table and not worry about dropping the pre-existing primary key constraint. It'll take a while, but it will work. Stu pb648174 wrote: > I've been doing a bit of reading and have read in quite a few places > that an identity column is a good clustered index and that all or at > least most tables should have a clustered index. The tool I used to > generate tables made them all with non clustered indexes so I would > like to drop all of them and generate clustered indexes. So my > questions is a) good idea? and b) how? There are foreign key references > to most of them so those would need to be dropped first and then > re-created after the clustered one was created and that could cascade > (I think?) > > Any existing scripts out there that might do this? I found something > similar and modified it, the sql is included below. This gives me the > list of all the columns I need, I just need to get the foreign keys for > each from here before each one and generate all the create/drop > scripts. > > All the columns I am looking to do this for are called "Id" making this > somewhat simpler. I'm just looking to incrementally make the SQL side > better and don't want to rewrite a bunch of application level code to > make the column names ISO compliant, etc. > > /* > -- Returns whether the column is ASC or DESC > CREATE FUNCTION dbo.GetIndexColumnOrder > ( > @object_id INT, > @index_id TINYINT, > @column_id TINYINT > ) > RETURNS NVARCHAR(5) > AS > BEGIN > DECLARE @r NVARCHAR(5) > SELECT @r = CASE INDEXKEY_PROPERTY > ( > @object_id, > @index_id, > @column_id, > 'IsDescending' > ) > WHEN 1 THEN N' DESC' > ELSE N'' > END > RETURN @r > END > > -- Returns the list of columns in the index > CREATE FUNCTION dbo.GetIndexColumns > ( > @table_name SYSNAME, > @object_id INT, > @index_id TINYINT > ) > RETURNS NVARCHAR(4000) > AS > BEGIN > DECLARE > @colnames NVARCHAR(4000), > @thisColID INT, > @thisColName SYSNAME > > SET @colnames = INDEX_COL(@table_name, @index_id, 1) > + dbo.GetIndexColumnOrder(@object_id, @index_id, 1) > > SET @thisColID = 2 > SET @thisColName = INDEX_COL(@table_name, @index_id, @thisColID) > + dbo.GetIndexColumnOrder(@object_id, @index_id, @thisColID) > > WHILE (@thisColName IS NOT NULL) > BEGIN > SET @thisColID = @thisColID + 1 > SET @colnames = @colnames + ', ' + @thisColName > > SET @thisColName = INDEX_COL(@table_name, @index_id, > @thisColID) > + dbo.GetIndexColumnOrder(@object_id, @index_id, > @thisColID) > END > RETURN @colNames > END > > CREATE VIEW dbo.vAllIndexes > AS > begin > SELECT > TABLE_NAME = OBJECT_NAME(i.id), > INDEX_NAME = i.name, > COLUMN_LIST = dbo.GetIndexColumns(OBJECT_NAME(i.id), i.id, > i.indid), > IS_CLUSTERED = INDEXPROPERTY(i.id, i.name, 'IsClustered'), > IS_UNIQUE = INDEXPROPERTY(i.id, i.name, 'IsUnique'), > FILE_GROUP = g.GroupName > FROM > sysindexes i > INNER JOIN > sysfilegroups g > ON > i.groupid = g.groupid > WHERE > (i.indid BETWEEN 1 AND 254) > -- leave out AUTO_STATISTICS: > AND (i.Status & 64)=0 > -- leave out system tables: > AND OBJECTPROPERTY(i.id, 'IsMsShipped') = 0 > end > */ > > SELECT > v.* > FROM > dbo.vAllIndexes v > INNER JOIN > INFORMATION_SCHEMA.TABLE_CONSTRAINTS T > ON > T.CONSTRAINT_NAME = v.INDEX_NAME > AND T.TABLE_NAME = v.TABLE_NAME > AND T.CONSTRAINT_TYPE = 'PRIMARY KEY' > AND v.COLUMN_LIST = 'Id' > AND v.IS_CLUSTERED = 0 > ORDER BY v.TABLE_NAME |
| |||
| pb648174 (google@webpaul.net) writes: > I've been doing a bit of reading and have read in quite a few places > that an identity column is a good clustered index and that all or at > least most tables should have a clustered index. The tool I used to > generate tables made them all with non clustered indexes so I would > like to drop all of them and generate clustered indexes. Yes, having clustered indexes on all tables is a good idea, but the IDENTITY column is not always the best choice. It's a good choice if you have a high transaction rate, and you want to avoid fragmentation and page splits. But for SELECT queries it is likely that in most tables that there are better candidates for the clustered index, as you don't do range queries on ids that often. So I would suggest that you review your tables and look for better columns to cluster on. Here I had single-column PKs in mind. Clustering on a multi-column PK, or part of it is another matter. Take an OrderDetails table for instance. "SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely query and a clustred index may be great here. Stu's suggestion of keeping the PK non-clustered, and adding a clustered index as well is not that bad. If you have a multi-column key that is 4 30 bytes long, but the first key column is four bytes, the clustering on the first columns means that the key size for the clustered index is only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear in non-clustered index, this matters quite a bit. As for looking up the foreign keys, the tables are sysreferences in SQL 2000 and sys.forein_keys in SQL 2005. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Well that makes things simpler then.. I'll try adding the clustered columns to one area of the app and see if it makes a positive or negative performance impact. Thanks for the info guys. Erland Sommarskog wrote: > pb648174 (google@webpaul.net) writes: > > I've been doing a bit of reading and have read in quite a few places > > that an identity column is a good clustered index and that all or at > > least most tables should have a clustered index. The tool I used to > > generate tables made them all with non clustered indexes so I would > > like to drop all of them and generate clustered indexes. > > Yes, having clustered indexes on all tables is a good idea, but the > IDENTITY column is not always the best choice. It's a good choice if > you have a high transaction rate, and you want to avoid fragmentation > and page splits. > > But for SELECT queries it is likely that in most tables that there > are better candidates for the clustered index, as you don't do > range queries on ids that often. So I would suggest that you review > your tables and look for better columns to cluster on. > > Here I had single-column PKs in mind. Clustering on a multi-column PK, > or part of it is another matter. Take an OrderDetails table for instance. > "SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely > query and a clustred index may be great here. > > Stu's suggestion of keeping the PK non-clustered, and adding a clustered > index as well is not that bad. If you have a multi-column key that is 4 > 30 bytes long, but the first key column is four bytes, the clustering on > the first columns means that the key size for the clustered index is > only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear > in non-clustered index, this matters quite a bit. > > As for looking up the foreign keys, the tables are sysreferences in > SQL 2000 and sys.forein_keys in SQL 2005. > > > > -- > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/pro...ads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Performance was actually worse once I added the clustered index. A query that takes 4 seconds took 5 seconds after adding clustered indexes to all the tables for a particular module. I turned the actual execution plan display on and saw that it was using the clustered index instead of the non clustered. So without the clustered index the largest time used is an "index seek" and a "table spool/lazy spool" and with the clustered index the index seek just becomes a clustered index seek... No big difference except it takes longer! pb648174 wrote: > Well that makes things simpler then.. I'll try adding the clustered > columns to one area of the app and see if it makes a positive or > negative performance impact. Thanks for the info guys. > > Erland Sommarskog wrote: > > pb648174 (google@webpaul.net) writes: > > > I've been doing a bit of reading and have read in quite a few places > > > that an identity column is a good clustered index and that all or at > > > least most tables should have a clustered index. The tool I used to > > > generate tables made them all with non clustered indexes so I would > > > like to drop all of them and generate clustered indexes. > > > > Yes, having clustered indexes on all tables is a good idea, but the > > IDENTITY column is not always the best choice. It's a good choice if > > you have a high transaction rate, and you want to avoid fragmentation > > and page splits. > > > > But for SELECT queries it is likely that in most tables that there > > are better candidates for the clustered index, as you don't do > > range queries on ids that often. So I would suggest that you review > > your tables and look for better columns to cluster on. > > > > Here I had single-column PKs in mind. Clustering on a multi-column PK, > > or part of it is another matter. Take an OrderDetails table for instance. > > "SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely > > query and a clustred index may be great here. > > > > Stu's suggestion of keeping the PK non-clustered, and adding a clustered > > index as well is not that bad. If you have a multi-column key that is 4 > > 30 bytes long, but the first key column is four bytes, the clustering on > > the first columns means that the key size for the clustered index is > > only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear > > in non-clustered index, this matters quite a bit. > > > > As for looking up the foreign keys, the tables are sysreferences in > > SQL 2000 and sys.forein_keys in SQL 2005. > > > > > > > > -- > > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se > > > > Books Online for SQL Server 2005 at > > http://www.microsoft.com/technet/pro...ads/books.mspx > > Books Online for SQL Server 2000 at > > http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| ||||
| Index tuning is not black-and-white, especially when it comes to the clustered index decision. It is likely than some queries will benefit by the PK clustered index while others will not. You'll need run a mix of queries that is representative of the actual workload mix to ascertain overall performance impact. IMHO, an all-or-nothing clustered index decision is naive. It is also possible that some tables will benefit with the clustered PK and others will not. I know that this adds a wrinkle to automated schema generation but this is reality. You might consider using the Index Tuning Wizard (SQL 2000) or Database Engine Tuning Advisor (SQL 2005) for index recommendations based on workload. -- Hope this helps. Dan Guzman SQL Server MVP "pb648174" <google@webpaul.net> wrote in message news:1155727057.245342.208480@74g2000cwt.googlegro ups.com... > Performance was actually worse once I added the clustered index. A > query that takes 4 seconds took 5 seconds after adding clustered > indexes to all the tables for a particular module. I turned the actual > execution plan display on and saw that it was using the clustered index > instead of the non clustered. So without the clustered index the > largest time used is an "index seek" and a "table spool/lazy spool" and > with the clustered index the index seek just becomes a clustered index > seek... No big difference except it takes longer! > > pb648174 wrote: >> Well that makes things simpler then.. I'll try adding the clustered >> columns to one area of the app and see if it makes a positive or >> negative performance impact. Thanks for the info guys. >> >> Erland Sommarskog wrote: >> > pb648174 (google@webpaul.net) writes: >> > > I've been doing a bit of reading and have read in quite a few places >> > > that an identity column is a good clustered index and that all or at >> > > least most tables should have a clustered index. The tool I used to >> > > generate tables made them all with non clustered indexes so I would >> > > like to drop all of them and generate clustered indexes. >> > >> > Yes, having clustered indexes on all tables is a good idea, but the >> > IDENTITY column is not always the best choice. It's a good choice if >> > you have a high transaction rate, and you want to avoid fragmentation >> > and page splits. >> > >> > But for SELECT queries it is likely that in most tables that there >> > are better candidates for the clustered index, as you don't do >> > range queries on ids that often. So I would suggest that you review >> > your tables and look for better columns to cluster on. >> > >> > Here I had single-column PKs in mind. Clustering on a multi-column PK, >> > or part of it is another matter. Take an OrderDetails table for >> > instance. >> > "SELECT ... FROM OrderDetails WHERE OrderID = @id" is a very likely >> > query and a clustred index may be great here. >> > >> > Stu's suggestion of keeping the PK non-clustered, and adding a >> > clustered >> > index as well is not that bad. If you have a multi-column key that is 4 >> > 30 bytes long, but the first key column is four bytes, the clustering >> > on >> > the first columns means that the key size for the clustered index is >> > only 8 bytes. (key col + uniquifier). Since cluster-key colunms appear >> > in non-clustered index, this matters quite a bit. >> > >> > As for looking up the foreign keys, the tables are sysreferences in >> > SQL 2000 and sys.forein_keys in SQL 2005. >> > >> > >> > >> > -- >> > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se >> > >> > Books Online for SQL Server 2005 at >> > http://www.microsoft.com/technet/pro...ads/books.mspx >> > Books Online for SQL Server 2000 at >> > http://www.microsoft.com/sql/prodinf...ons/books.mspx > |