This is a discussion on select * in views within the SQL Server forums, part of the Microsoft SQL Server category; --> Peter (someone@someplace.com) writes: >> CREATE TABLE dbo.t1 (x INT NOT NULL PRIMARY KEY, z1 INT NULL); >> GO >> ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Peter (someone@someplace.com) writes: >> CREATE TABLE dbo.t1 (x INT NOT NULL PRIMARY KEY, z1 INT NULL); >> GO >> CREATE VIEW dbo.v1 AS >> SELECT * FROM dbo.t1 >> GO >> ALTER TABLE dbo.t1 DROP COLUMN z1 ; >> ALTER TABLE dbo.t1 ADD z2 INT ; >> GO >> EXEC dbo.sp_recompile 'dbo.v1' ; >> /* Notice that the second column still exists as Z1 in the view */ >> SELECT x,z1 FROM dbo.v1 ; >> > > O that is bad!!! I gonne test that immidiatly tuesday. Is it different > when naming the columns? Yes, then you get an error when you try to access the view. Then again, what David's example really shows is that you should use sp_refreshview when you've changed the underlying table, not sp_recompile. To wit, with sp_refreshview, v1 picks up the changed in columns. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Serge Rielau (srielau@ca.ibm.com) writes: > Peter wrote: >> Serge, what do I miss? > I think you are using the wrong tool: > http://msdn2.microsoft.com/en-us/lib...4(SQL.90).aspx > try CREATE SYNONYM I still haven't really understood why Peter want these views. But you are right that if he has a one-to-one mapping from table to view, then synonyms are a better choice. If he is on SQL 2005, that is. (SQL2000 does not have synonyms.) However, there is also the case of partitioned views, where you have a suite of identical tables with a CHECK constraint on the first primary- key column. Such a view would look like: SELECT * FROM sales2000 UNION ALL SELECT * FROM sales2001 UNION ALL ... Of course, you can list all columns here as well, but say that you add two new columns of the same data type to the tables, and when you change the view late Friday afternoon when your mind elsewhere, you end up with: SELECT year, col1, col2, .... newcol1, newcol2 FROM sales2000 UNION ALL SELECT year, col1, col2, .... newcol2, newcol1 FROM sales2000 UNION ALL In unfortunate cases, this can lead to errors that can unnoticed for quite a while, and in the mean while lead to incorrect decisions. The counter-argument to this is that you may only want to add the columns to the sales2006 table, but a SELECT * in the view would force you to add the columns to all tables. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Erland Sommarskog wrote: > I'm not really sure what you mean with clustering factor. To make long story short, suppose you have a Customer table clustered on phone number, having on average 20 rows per page. Suppose you want to retrieve customers with DOB between January 1st and January 15th, which is about 4% of data. Because phone number and date of birth are not correlated, qualifying rows are scattered all over the table, and it is very likely that there is a customer with DOB between January 1st and January 15th on almost every page. So Oracle/DB2 optimizer will look up clustering factor of the index on DOB (it is low) and go for a table scan. On the other hand, phone number and city are very correlated. As a result, if 10% customers live in the city of Someville, the rows matching the criteria city='SOMEVILLE' are located on adjacent pages, because they have phone numbers with the same beginning. The index on city has a high clustering factor. The Oracle/DB2 optimizer will choose to access the table via the index on city, and it will be more efficient than a table scan - almost 90% of data pages will not be read. |
| |||
| Alexander Kuznetsov (AK_TIREDOFSPAM@hotmail.COM) writes: > To make long story short, > suppose you have a Customer table clustered on phone number, having on > average 20 rows per page. Suppose you want to retrieve customers with > DOB between January 1st and January 15th, which is about 4% of data. > Because phone number and date of birth are not correlated, qualifying > rows are scattered all over the table, and it is very likely that there > is a customer with DOB between January 1st and January 15th on almost > every page. So Oracle/DB2 optimizer will look up clustering factor of > the index on DOB (it is low) and go for a table scan. On the other > hand, phone number and city are very correlated. As a result, if 10% > customers live in the city of Someville, the rows matching the criteria > city='SOMEVILLE' are located on adjacent pages, because they have phone > numbers with the same beginning. The index on city has a high > clustering factor. The Oracle/DB2 optimizer will choose to access the > table via the index on city, and it will be more efficient than a table > scan - almost 90% of data pages will not be read. That explains why I haven't heard of it. SQL Server performs no such deliberations, as far as I know. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| Erland Sommarskog wrote: > That explains why I haven't heard of it. SQL Server performs no such > deliberations, as far as I know. Well if on the leaf level of a non clustered index you've got an RID, a structure looking like (extent#, page#, row_on_page#), and you know that matching rows are likely to be stored on adjacent pages, you can traverse a range of the index, sort RIDs of matching rows, and read every page with matching rows only once. This is why clustering factor is a very useful piece of statistics for DB2/Oracle. If on the leaf level of a non clustered index you've got a bookmark, then you just know absolutely nothing about the physical location of the rows you need to retrieve. I don't see any way SQL Server could utilize clustering factor as long as bookmarks are used to access rows from non-clustered indexes (unless it's a heap table, which is not the best practice). Please correct me if I'm wrong. |
| |||
| Erland, I think one more scenario for select * views is when a view is used as a security tool. Suppose I expose a whole table to a user grant select on sales to johnsmith If later on I add a column to sales table, it is visible to johnsmith. Suppose I expose to danbrown only his own sales via a view: create view persons_own_sales as select * from sales where ... grant select on persons_own_sales to danbrown Again, If later on I add a column to sales table, I want it to be visible to danbrown with minimum maintenance - just refresh the view. Makes sense? |
| |||
| Erland, Thanks, I tested it and indeed I did not notice a difference in using sp_recompile or not. And I see also the difference in naming the columns instead of using the asterix. It seems to me the binding is different. With the asterix, the binding is by column number instead of by name. mmmm "Erland Sommarskog" <esquel@sommarskog.se> wrote in message news:Xns97A8690D07743Yazorman@127.0.0.1... > Peter (someone@someplace.com) writes: >>> CREATE TABLE dbo.t1 (x INT NOT NULL PRIMARY KEY, z1 INT NULL); >>> GO >>> CREATE VIEW dbo.v1 AS >>> SELECT * FROM dbo.t1 >>> GO >>> ALTER TABLE dbo.t1 DROP COLUMN z1 ; >>> ALTER TABLE dbo.t1 ADD z2 INT ; >>> GO >>> EXEC dbo.sp_recompile 'dbo.v1' ; >>> /* Notice that the second column still exists as Z1 in the view */ >>> SELECT x,z1 FROM dbo.v1 ; >>> >> >> O that is bad!!! I gonne test that immidiatly tuesday. Is it different >> when naming the columns? > > Yes, then you get an error when you try to access the view. > > Then again, what David's example really shows is that you should use > sp_refreshview when you've changed the underlying table, not sp_recompile. > To wit, with sp_refreshview, v1 picks up the changed in columns. > > > -- > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/pro...ads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| |||
| From this discussion I conclude: 1. Select * in a view is not a performance issue when using a current versions of SQL Server. 2. Select * in a view has benefits and drawbacks in a maintenance aspect. Performance I do not believe it is a big change Microsoft will change this behavior. But, there is always a change. Maintenance benefit In the case I regular want the view to expose all columns of the table, I don not have to name all the columns, but I can simply express what I want. If the underlying table stucture is changed, I simple run exec sp_refreshview and everything is ok again. Maintenance drawback If the underlying table structure is changed: columns are added and columns are deleted but the count of column keeps the same, I do not get a error message when I select the view and there is a change I get wrong results without noticing it in time. I need to be very disciplinic to always use sp_refreshview on all views that do a select * on the changed table. For me, the drawback is a real issue, I like robust code. I also like maintenance simplicity. I need to make a choice. Thanks to all, I know what is the impact of this choice. Unless someone has to add some new point in this discussion :-) |
| |||
| Regarding this one point below... If you always specify the column names then you will get an error instead of getting bad data. It will take only a moment to add or remove a column from your view to correct the error, which you will find immediately because the code will not run. It will take much longer to fix problems caused by bad data that is not discovered for a month. IMHO, you are much, much better off having the entire application stop running while you fix a column name that you overlooked, rather than have the application continue running and corrupt all of your data. "Peter" <someone@someplace.com> wrote in message news:2b3ef$44450c52$50394e09$1850@news.chello.nl.. . > Maintenance drawback > If the underlying table structure is changed: columns are added and columns > are deleted but the count of column keeps the same, I do not get a error > message when I select the view and there is a change I get wrong results > without noticing it in time. > > |
| ||||
| Alexander Kuznetsov (AK_TIREDOFSPAM@hotmail.COM) writes: > I think one more scenario for select * views is when a view is used as > a security tool. Suppose I expose a whole table to a user > > grant select on sales to johnsmith > > If later on I add a column to sales table, it is visible to johnsmith. > Suppose I expose to danbrown only his own sales via a view: > create view persons_own_sales > as > select * from sales where ... > > grant select on persons_own_sales to danbrown > > Again, If later on I add a column to sales table, I want it to be > visible to danbrown with minimum maintenance - just refresh the view. > Makes sense? Yes, I think that I mentioned this in previous post in the thread. Then again, it may be the case that you don't want to expose the columns that controls access. Also, in this case, the maintenance problem is smaller as there is on one SELECT on the view. (I'm assuming the view has something like WHERE user = SYSTEM_USER.) With a partitioned view, you have a repetition with more room for errors. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |