Unix Technical Forum

select * in views

This is a discussion on select * in views within the SQL Server forums, part of the Microsoft SQL Server category; --> Well, first of all IBM is not a product. I could very well be an IBM SQL Server expert ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #11 (permalink)  
Old 02-29-2008, 08:07 PM
Serge Rielau
 
Posts: n/a
Default Re: select * in views

Well, first of all IBM is not a product. I could very well be an IBM SQL
Server expert (working for IGS or Websphere for example), but I am a DB2
expert, which I take it was what you meant. :-)

My comments were generic and I'm confident they apply to any SQL RDBMS
out there.

Couple of observations:
1. A thumb rule for using TBSCAN over (index scan & page fetch) is when
the selectivity is worse than, say 5%. Sure any DBSM will pick it's own
secret number, but I don't doubt for a moment that SQL Server, having a
cost based optimizer will eventually say: "Crap. You make me go after
columns not in my index and I'm going to touch next to at least one row
in any page anyway!".
Enabling queries to use covering indexes is important. SELECT * is
counter productive on that end.

2. If all you did in you experiment was a simple CREATE VIEW V AS SELECT
* FROM T then this doesn't say too much.
Thsi is way I made my statement relative. If you SQL is more complex
then mileage will vary depending e.g. on the version of your RDBMS since
you are relying on the optimizer.
SQL is about:
You tell the RDBMS WHAT you want. The RDBMS takes care of the HOW to get it.
As developers it is our responsibility to specify the WHAT correctly.

I'm referring back to the view vs. stored procedure thread here. In that
thread users didn't trust views at all. I don't go that far, but I
wouldn't trust any optimizer ( SQL Server, DB2, you name it) to fix up
all my lazyness through divine intuition.

Lastly I respectfully disagree with the subsequent note in this thread
that over specification of columns in a view has no performance impact.
While this may be true in most cases once the statement invoking the
view is compiled, you are relying on the compiler to optimize out the
unused columns. And that is code path and costs you CPU on first
compile. Whether you can actually "feel" that depends on cache friendly
your app is.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 02-29-2008, 08:07 PM
Erland Sommarskog
 
Posts: n/a
Default Re: select * in views

Serge Rielau (srielau@ca.ibm.com) writes:
> Lastly I respectfully disagree with the subsequent note in this thread
> that over specification of columns in a view has no performance impact.


I think Peter's question actually was whether there was a performance
impact of saying "SELECT *" instead of listing all columns in the table
explicitly, and for all my dislike for SELECT * in production code, I
can't think of any reason why SELECT * should be any more expensive in
SQL Server.

I completely agree with that best is to include exactly those columns
for which there is an actual need.

One problem I often fight at work is that I want to drop a column, or
drastically change the meaning of it. I suspect that it is not really
in use, but still I find a bunch of stored procedures that return this
column in a result set. But I can also see that these procedures aim
at returning the universe, so I have no idea whether the value is used
for something. (Usually, I end up dropping the column anyway.)


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 02-29-2008, 08:08 PM
Alexander Kuznetsov
 
Posts: n/a
Default Re: select * in views

Serge Rielau wrote:
> 1. A thumb rule for using TBSCAN over (index scan & page fetch) is when
> the selectivity is worse than, say 5%. Sure any DBSM will pick it's own
> secret number,


Serge,

I think that threshold selectivity is not a constant - for SQL Server
it also depends on the width of bookmarks. So, if bookmarks are 4 byte
integers, the threshold selectivity is lower than if bookmarks are
50-byte character fields.

Similarly, besides selectivity DB2 also considers clustering factor,
and if an index has a high clustering factor, even low 50% selectivity
might be good enough for an access via an index, correct?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 02-29-2008, 08:08 PM
Peter
 
Posts: n/a
Default Re: select * in views

>
> CREATE TABLE dbo.t1 (x INT NOT NULL PRIMARY KEY, z1 INT NULL);
> GO
> CREATE VIEW dbo.v1 AS
> SELECT * FROM dbo.t1
> GO
> ALTER TABLE dbo.t1 DROP COLUMN z1 ;
> ALTER TABLE dbo.t1 ADD z2 INT ;
> GO
> EXEC dbo.sp_recompile 'dbo.v1' ;
> /* Notice that the second column still exists as Z1 in the view */
> SELECT x,z1 FROM dbo.v1 ;
>
> --


O that is bad!!! I gonne test that immidiatly tuesday. Is it different when
naming the columns?


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #15 (permalink)  
Old 02-29-2008, 08:08 PM
Peter
 
Posts: n/a
Default Re: select * in views


"Erland Sommarskog" <esquel@sommarskog.se> wrote in message
news:Xns97A7C00123E9EYazorman@127.0.0.1...
> Serge Rielau (srielau@ca.ibm.com) writes:
>> Lastly I respectfully disagree with the subsequent note in this thread
>> that over specification of columns in a view has no performance impact.

>
> I think Peter's question actually was whether there was a performance
> impact of saying "SELECT *" instead of listing all columns in the table
> explicitly, and for all my dislike for SELECT * in production code, I
> can't think of any reason why SELECT * should be any more expensive in
> SQL Server.
>
> I completely agree with that best is to include exactly those columns
> for which there is an actual need.
>
> One problem I often fight at work is that I want to drop a column, or
> drastically change the meaning of it. I suspect that it is not really
> in use, but still I find a bunch of stored procedures that return this
> column in a result set. But I can also see that these procedures aim
> at returning the universe, so I have no idea whether the value is used
> for something. (Usually, I end up dropping the column anyway.)
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/pro...ads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinf...ons/books.mspx


Erland,

In my case, the views only virtualizes the tables. The views must have the
same structure as the table. So if there are colums added to the table, the
view needs to be updated to. In that case, you have the same problem as the
table without view. You want to delete a column but you dont know if it is
used anywere.

What I mean is, declaring the view as select * is exactly what I functional
want. Give me all columns of the table, no matter what columns there are.

So, the statement

SELECT * FROM sometable

comes closer to what I want and what I mean than

SELECT
column1,
column2,
column3,
column4
FROM
sometable


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #16 (permalink)  
Old 02-29-2008, 08:08 PM
Peter
 
Posts: n/a
Default Re: select * in views


"Serge Rielau" <srielau@ca.ibm.com> wrote in message
news:4af2kbFst43jU1@individual.net...
> Well, first of all IBM is not a product. I could very well be an IBM SQL
> Server expert (working for IGS or Websphere for example), but I am a DB2
> expert, which I take it was what you meant. :-)
>
> My comments were generic and I'm confident they apply to any SQL RDBMS out
> there.
>
> Couple of observations:
> 1. A thumb rule for using TBSCAN over (index scan & page fetch) is when
> the selectivity is worse than, say 5%. Sure any DBSM will pick it's own
> secret number, but I don't doubt for a moment that SQL Server, having a
> cost based optimizer will eventually say: "Crap. You make me go after
> columns not in my index and I'm going to touch next to at least one row in
> any page anyway!".
> Enabling queries to use covering indexes is important. SELECT * is counter
> productive on that end.
>

I never execute select * from sometable. I only use it in a definition of a
view. If I only select a few columns from that view, I see in the query plan
the select * is never executed. Even I dont even see the name of the view.

So, the query plan shows:

select
column1
from
someviewwithselect*

returns less bytes per row compared to

select
column1,
column2
from
sameviewwithselect*

Also, if the columns I select are in a alternate index, and SQL server uses
that alternate index, SQL server doesnt go to the datapages of the table.
SQL Server gets is data only form the index blocks.

Unless the query plan is lying to me.

So to me te code

create table x (column1, column2, colum3)
select column1 from x

has the same execution plan as

create table x (column1, column2, column3)
create view y as select * from x
select column1 from y


Serge, what do I miss? I dont know what your story about indexes and scans
has to do whith selecting on views.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #17 (permalink)  
Old 02-29-2008, 08:08 PM
Serge Rielau
 
Posts: n/a
Default Re: select * in views

Alexander Kuznetsov wrote:
> Serge Rielau wrote:
>> 1. A thumb rule for using TBSCAN over (index scan & page fetch) is when
>> the selectivity is worse than, say 5%. Sure any DBSM will pick it's own
>> secret number,

>
> Serge,
>
> I think that threshold selectivity is not a constant - for SQL Server
> it also depends on the width of bookmarks. So, if bookmarks are 4 byte
> integers, the threshold selectivity is lower than if bookmarks are
> 50-byte character fields.

Struggling to research bookmarks I ran into:
http://msdn2.microsoft.com/en-US/lib...0(SQL.90).aspx
"Bookmark Lookup is not used in SQL Server 2005. Instead, Clustered
Index Seek and RID Lookup provide bookmark lookup functionality."
Cool, let's bypass that one.

> Similarly, besides selectivity DB2 also considers clustering factor,
> and if an index has a high clustering factor, even low 50% selectivity
> might be good enough for an access via an index, correct?

Certainly clustering will affect the decision. But the point of my note
was not to get into the gory details (see "thumb rule") as to state that
a covering index scan is certainly preferable in the vast majority of
cases over a "index seek and RID lookup" (in SQL Server words, aka
ISCAN/FETCH in DB2 words) and that "index seek and RID lookup" compete
with table scans.
So whenever a query for more columns that it consumes it's at risk of
getting suboptimal plans.

Back to Peter, if you select all rows and all columns in your view, then
what good is it? Are you blindly applying some lesson in some book?
views are used for access control, as shorthands for complex queries and
to hide details of the tables. I don't see how you do any of that...
The simplicity of your views also prevents you from seeing my point that
the optimizer (any optimizer) is fallible and you don't want to find out
about the unused column that wasn't dropped when it's too late.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #18 (permalink)  
Old 02-29-2008, 08:08 PM
Serge Rielau
 
Posts: n/a
Default Re: select * in views

Peter wrote:
> Serge, what do I miss?

I think you are using the wrong tool:
http://msdn2.microsoft.com/en-us/lib...4(SQL.90).aspx
try CREATE SYNONYM

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #19 (permalink)  
Old 02-29-2008, 08:08 PM
Alexander Kuznetsov
 
Posts: n/a
Default Re: select * in views

Serge Rielau wrote:
> Struggling to research bookmarks I ran into:
> http://msdn2.microsoft.com/en-US/lib...0(SQL.90).aspx
> "Bookmark Lookup is not used in SQL Server 2005. Instead, Clustered
> Index Seek and RID Lookup provide bookmark lookup functionality."
> Cool, let's bypass that one.


Interesting. My understanding is that SQL Server 2000 does not store
clustering factor as part of statistics, and the reason is simple: it
makes no sence for bookmark lookups. However, for more efficient RID
lookups, it would make perfect sence to both calculate clustering
factor as part of statistics gathering and have the optimizer use it.
Yet I tried to google up "clustering factor"+"SQL server" and that
brought up nothing relevant to SQL Server.

Does SQL Server 2005 optimizer use clustering factor, and if yes,
what's the proper word for it?

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #20 (permalink)  
Old 02-29-2008, 08:08 PM
Erland Sommarskog
 
Posts: n/a
Default Re: select * in views

Alexander Kuznetsov (AK_TIREDOFSPAM@hotmail.COM) writes:
> Serge Rielau wrote:
>> Struggling to research bookmarks I ran into:
>> http://msdn2.microsoft.com/en-US/lib...0(SQL.90).aspx
>> "Bookmark Lookup is not used in SQL Server 2005. Instead, Clustered
>> Index Seek and RID Lookup provide bookmark lookup functionality."
>> Cool, let's bypass that one.


I think this is mainly a terminology issue. Instead of talking of
bookmark lookups, they present it as a join between the NC index and
the clustered index. But there is no real difference between SQL 2000
and SQL 2005, as far as I understand.

> Interesting. My understanding is that SQL Server 2000 does not store
> clustering factor as part of statistics, and the reason is simple: it
> makes no sence for bookmark lookups. However, for more efficient RID
> lookups, it would make perfect sence to both calculate clustering
> factor as part of statistics gathering and have the optimizer use it.
> Yet I tried to google up "clustering factor"+"SQL server" and that
> brought up nothing relevant to SQL Server.
>
> Does SQL Server 2005 optimizer use clustering factor, and if yes,
> what's the proper word for it?


I'm not really sure what you mean with clustering factor. As for RID
lookups, they only occur with heaps, that is tables without clustered
indexes.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:08 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com