Unix Technical Forum

Auto created statistics and missing statistics

This is a discussion on Auto created statistics and missing statistics within the SQL Server forums, part of the Microsoft SQL Server category; --> Hello group. I have an issue, which has bothered me for a while now: I'm wondering why the column ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 02-29-2008, 02:22 AM
Jesper Jensen
 
Posts: n/a
Default Auto created statistics and missing statistics

Hello group.

I have an issue, which has bothered me for a while now:

I'm wondering why the column statistics, which SQL Server wants me to
create, if I turn off auto-created statistics, are so important to the
optimizer?

Example: from Northwind (with auto create stats off), I do the following:

SELECT * FROM Customers WHERE Country = 'Sweden'

My query plan show a clustered index scan, which is expected - no index
exists for Country. BUT, the query plan also shows, that the optimizer is
missing a statistic on Country, which tells me, that the optimizer would
benefit from knowing this.

I cannot see why? (and I've been trying for a while now).

If I create the missing statistics, nothing happens in the query plan (and
why should it?). I could understand it, if the optimizer suggested an index
on Country - this would make sense, but if creating the missing index, query
analyzer creates the statistics with an empty index, which seems to me to be
less than usable.

I've been thinking long and hard about this, but haven't been able to reach
a conclusion It has some relevance to my work, because allowing the
optimizer to create missing statistics limits my options for designing
indexes (e.g. covering) for some rather wide tables, so I'm thinking why not
turn it off altogether. But I would like to know the consequences - hope
somebody has already delved into this, and knows a good explanation.

Rgds
Jesper


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-29-2008, 02:23 AM
Simon Hayes
 
Posts: n/a
Default Re: Auto created statistics and missing statistics


"Jesper Jensen" <moellemand@post.tdcadsl.dk> wrote in message
news:40727f2e$0$237$edfadb0f@dread12.news.tele.dk. ..
> Hello group.
>
> I have an issue, which has bothered me for a while now:
>
> I'm wondering why the column statistics, which SQL Server wants me to
> create, if I turn off auto-created statistics, are so important to the
> optimizer?
>
> Example: from Northwind (with auto create stats off), I do the following:
>
> SELECT * FROM Customers WHERE Country = 'Sweden'
>
> My query plan show a clustered index scan, which is expected - no index
> exists for Country. BUT, the query plan also shows, that the optimizer is
> missing a statistic on Country, which tells me, that the optimizer would
> benefit from knowing this.
>
> I cannot see why? (and I've been trying for a while now).
>
> If I create the missing statistics, nothing happens in the query plan (and
> why should it?). I could understand it, if the optimizer suggested an

index
> on Country - this would make sense, but if creating the missing index,

query
> analyzer creates the statistics with an empty index, which seems to me to

be
> less than usable.
>
> I've been thinking long and hard about this, but haven't been able to

reach
> a conclusion It has some relevance to my work, because allowing the
> optimizer to create missing statistics limits my options for designing
> indexes (e.g. covering) for some rather wide tables, so I'm thinking why

not
> turn it off altogether. But I would like to know the consequences - hope
> somebody has already delved into this, and knows a good explanation.
>
> Rgds
> Jesper
>
>


http://msdn.microsoft.com/library/de.../statquery.asp

Simon



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-29-2008, 02:23 AM
Jesper Jensen
 
Posts: n/a
Default Re: Auto created statistics and missing statistics

Thanks, Simon, informative article, but ...

.... it doesn't really explain the stuff, that I wrote. The closest I get to
an explanation, when reading this is 'These statistics are created for
columns where the optimizer would have to estimate the approximate density
or distribution otherwise'.

I knew this, but I still do not know, why the optimizer needs to know the
density and/or distribution?? I can see no valid reason, and therefore I can
see no good reason for enabling auto-creation of stats.

What I probably looking for is a good example, where the use of an
automatically created stat saves time, cycles and IOs

Best Rgds - Jesper

"Simon Hayes" <sql@hayes.ch> skrev i en meddelelse
news:4072f05a$1_2@news.bluewin.ch...
>
> "Jesper Jensen" <moellemand@post.tdcadsl.dk> wrote in message
> news:40727f2e$0$237$edfadb0f@dread12.news.tele.dk. ..
> > Hello group.
> >
> > I have an issue, which has bothered me for a while now:
> >
> > I'm wondering why the column statistics, which SQL Server wants me to
> > create, if I turn off auto-created statistics, are so important to the
> > optimizer?
> >
> > Example: from Northwind (with auto create stats off), I do the

following:
> >
> > SELECT * FROM Customers WHERE Country = 'Sweden'
> >
> > My query plan show a clustered index scan, which is expected - no index
> > exists for Country. BUT, the query plan also shows, that the optimizer

is
> > missing a statistic on Country, which tells me, that the optimizer would
> > benefit from knowing this.
> >
> > I cannot see why? (and I've been trying for a while now).
> >
> > If I create the missing statistics, nothing happens in the query plan

(and
> > why should it?). I could understand it, if the optimizer suggested an

> index
> > on Country - this would make sense, but if creating the missing index,

> query
> > analyzer creates the statistics with an empty index, which seems to me

to
> be
> > less than usable.
> >
> > I've been thinking long and hard about this, but haven't been able to

> reach
> > a conclusion It has some relevance to my work, because allowing the
> > optimizer to create missing statistics limits my options for designing
> > indexes (e.g. covering) for some rather wide tables, so I'm thinking why

> not
> > turn it off altogether. But I would like to know the consequences - hope
> > somebody has already delved into this, and knows a good explanation.
> >
> > Rgds
> > Jesper
> >
> >

>
>

http://msdn.microsoft.com/library/de.../statquery.asp
>
> Simon
>
>
>



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 02-29-2008, 02:23 AM
Simon Hayes
 
Posts: n/a
Default Re: Auto created statistics and missing statistics


"Jesper Jensen" <moellemand@post.tdcadsl.dk> wrote in message
news:4072f91b$0$300$edfadb0f@dread12.news.tele.dk. ..
> Thanks, Simon, informative article, but ...
>
> ... it doesn't really explain the stuff, that I wrote. The closest I get

to
> an explanation, when reading this is 'These statistics are created for
> columns where the optimizer would have to estimate the approximate density
> or distribution otherwise'.
>
> I knew this, but I still do not know, why the optimizer needs to know the
> density and/or distribution?? I can see no valid reason, and therefore I

can
> see no good reason for enabling auto-creation of stats.
>
> What I probably looking for is a good example, where the use of an
> automatically created stat saves time, cycles and IOs
>
> Best Rgds - Jesper
>


OK, here's another informative article :-)

http://www.winnetmag.com/SQLServer/A...075/22075.html

In summary, index statistics exist only for the first column in an index,
but auto-created (or manually created) statistics can exist for any column.
This gives the optimizer extra information, which might mean it chooses a
different, more efficient index for a query.

Check out the example on the second page of the article - on my system, this
reduced the logical reads required for the query from 104 to 43.

But you're correct to consider that there can be an impact on performance in
some situations:

http://support.microsoft.com/default...b;en-us;195565

Simon


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 02-29-2008, 02:23 AM
Jesper Jensen
 
Posts: n/a
Default Re: Auto created statistics and missing statistics

Thanks, Simon, that one did the trick.

One less mystery.

On my machine, QA tells me that the two queries (the index scan on
ProductID/Quantity vs. the clustered index scan) takes 43.65 and 56.35% cost
respectively. I would argue, that this saving is not worth the 'used up
index space'. In my professional life, I've seen tables, which are wide
enough (200+ columns) to demand, that precious index space is saved.

Basically, I think there are too many 'ifs' before an auto-created index
saves performance, but I appreciate the optimization idea behind it.

Thanks - Jesper

"Simon Hayes" <sql@hayes.ch> skrev i en meddelelse
news:4073121e_1@news.bluewin.ch...
>
> "Jesper Jensen" <moellemand@post.tdcadsl.dk> wrote in message
> news:4072f91b$0$300$edfadb0f@dread12.news.tele.dk. ..
> > Thanks, Simon, informative article, but ...
> >
> > ... it doesn't really explain the stuff, that I wrote. The closest I get

> to
> > an explanation, when reading this is 'These statistics are created for
> > columns where the optimizer would have to estimate the approximate

density
> > or distribution otherwise'.
> >
> > I knew this, but I still do not know, why the optimizer needs to know

the
> > density and/or distribution?? I can see no valid reason, and therefore I

> can
> > see no good reason for enabling auto-creation of stats.
> >
> > What I probably looking for is a good example, where the use of an
> > automatically created stat saves time, cycles and IOs
> >
> > Best Rgds - Jesper
> >

>
> OK, here's another informative article :-)
>
> http://www.winnetmag.com/SQLServer/A...075/22075.html
>
> In summary, index statistics exist only for the first column in an index,
> but auto-created (or manually created) statistics can exist for any

column.
> This gives the optimizer extra information, which might mean it chooses a
> different, more efficient index for a query.
>
> Check out the example on the second page of the article - on my system,

this
> reduced the logical reads required for the query from 104 to 43.
>
> But you're correct to consider that there can be an impact on performance

in
> some situations:
>
> http://support.microsoft.com/default...b;en-us;195565
>
> Simon
>
>



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 02-29-2008, 02:24 AM
Simon Hayes
 
Posts: n/a
Default Re: Auto created statistics and missing statistics

"Jesper Jensen" <moellemand@post.tdcadsl.dk> wrote in message news:<40732877$0$274$edfadb0f@dread12.news.tele.dk >...
> Thanks, Simon, that one did the trick.
>
> One less mystery.
>
> On my machine, QA tells me that the two queries (the index scan on
> ProductID/Quantity vs. the clustered index scan) takes 43.65 and 56.35% cost
> respectively. I would argue, that this saving is not worth the 'used up
> index space'. In my professional life, I've seen tables, which are wide
> enough (200+ columns) to demand, that precious index space is saved.
>
> Basically, I think there are too many 'ifs' before an auto-created index
> saves performance, but I appreciate the optimization idea behind it.
>
> Thanks - Jesper
>


<snip>

Well, you have to be careful about reaching conclusions based on
simple queries using small data sets. It's possible that a complex
join involving millions of rows would give a more significant
difference. To get a definite answer for your environment, you would
have to do some benchmarking, with and without statistics.

Simon
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 02:22 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com