Unix Technical Forum

Avg() not including 0's

This is a discussion on Avg() not including 0's within the SQL Server forums, part of the Microsoft SQL Server category; --> Hi, I have a sql query like this select avg([mycolumn]) from data where date > '1/5/08' and date < ...


Go Back   Unix Technical Forum > Database Server Software > Microsoft SQL Server > SQL Server

Register FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 03-01-2008, 02:48 PM
lee.richmond
 
Posts: n/a
Default Avg() not including 0's

Hi,

I have a sql query like this

select avg([mycolumn]) from data where date > '1/5/08' and date <
'1/10/08'
group by [mycolumn]
order by [mycolumn] desc

If all values within that average are numbers, I'm fine. If it's a 0
(not a null, a 0) it doesn't get averaged in. For instance, values
0,1,2 should produce an average of 1.

(0+1+2)/3 = 1.

But sql is returning a value as if my 0's were nulls and not factored
in:

(1+2)/2 = 1.5

Does anyone know why this is happening and how to fix it?
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 03-01-2008, 02:48 PM
Gert-Jan Strik
 
Posts: n/a
Default Re: Avg() not including 0's

"lee.richmond" wrote:
>
> Hi,
>
> I have a sql query like this
>
> select avg([mycolumn]) from data where date > '1/5/08' and date <
> '1/10/08'
> group by [mycolumn]
> order by [mycolumn] desc
>
> If all values within that average are numbers, I'm fine. If it's a 0
> (not a null, a 0) it doesn't get averaged in. For instance, values
> 0,1,2 should produce an average of 1.
>
> (0+1+2)/3 = 1.
>
> But sql is returning a value as if my 0's were nulls and not factored
> in:
>
> (1+2)/2 = 1.5
>
> Does anyone know why this is happening and how to fix it?


Lee,

a simple test does not replicate your problem. The simple script below
returns an average of 1.0 on SQL Server 7.0, 2000 and 2005.

create table #t(v decimal(3,2))
insert into #t values (0)
insert into #t values (1)
insert into #t values (2)
select avg(v) from #t
drop table #t

If the problem persists, then please post a repro script and the version
of SQL Server you are using.

--
Gert-Jan
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 03-01-2008, 02:48 PM
Hugo Kornelis
 
Posts: n/a
Default Re: Avg() not including 0's

On Wed, 30 Jan 2008 13:18:59 -0800 (PST), lee.richmond wrote:

>Hi,
>
>I have a sql query like this
>
>select avg([mycolumn]) from data where date > '1/5/08' and date <
>'1/10/08'
>group by [mycolumn]
>order by [mycolumn] desc
>
>If all values within that average are numbers, I'm fine. If it's a 0
>(not a null, a 0) it doesn't get averaged in. For instance, values
>0,1,2 should produce an average of 1.
>
>(0+1+2)/3 = 1.
>
> But sql is returning a value as if my 0's were nulls and not factored
>in:
>
> (1+2)/2 = 1.5
>
>Does anyone know why this is happening and how to fix it?


Hi Lee,

I was unable to reproduce this behaviour. Can you post some code (i.e. a
full repro script: CREATE TABLE statements, INSERT statements, and the
offending query) that I can run on my test server that does show this
behaviour on your machine?

I suspect something else is biting you, but I have to see a repro to
find out what it is.

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 03-01-2008, 02:48 PM
lee.richmond
 
Posts: n/a
Default Re: Avg() not including 0's

Thanks for the quick responses. I know what was happening here but
unfortunately it brings me to another problem.

This was a problem with a group by:

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id], [bulk sd cr]
order by [bulk sd] desc

Screwed up my averages. For it to be proper, it's

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id]
order by [bulk sd] desc

However, the reason I had [bulk sd cr] in the group by in the first
place is because I have a case statement for sorting like:

order by case when @sortvar = 1 then [bulk sd cr]
when @sortvar = 2 then [kw id]
end desc

My case statement only works if I have all the fields in that case
statement also in the group by. This doesn't make sense to me - why
should the order by work fine when it's not a case statement, but
break when it is a case statement?




On Jan 30, 4:52 pm, Hugo Kornelis
<h...@perFact.REMOVETHIS.info.INVALID> wrote:
> On Wed, 30 Jan 2008 13:18:59 -0800 (PST), lee.richmond wrote:
> >Hi,

>
> >I have a sql query like this

>
> >select avg([mycolumn]) from data where date > '1/5/08' and date <
> >'1/10/08'
> >group by [mycolumn]
> >order by [mycolumn] desc

>
> >If all values within that average are numbers, I'm fine. If it's a 0
> >(not a null, a 0) it doesn't get averaged in. For instance, values
> >0,1,2 should produce an average of 1.

>
> >(0+1+2)/3 = 1.

>
> > But sql is returning a value as if my 0's were nulls and not factored
> >in:

>
> > (1+2)/2 = 1.5

>
> >Does anyone know why this is happening and how to fix it?

>
> Hi Lee,
>
> I was unable to reproduce this behaviour. Can you post some code (i.e. a
> full repro script: CREATE TABLE statements, INSERT statements, and the
> offending query) that I can run on my test server that does show this
> behaviour on your machine?
>
> I suspect something else is biting you, but I have to see a repro to
> find out what it is.
>
> --
> Hugo Kornelis, SQL Server MVP
> My SQL Server blog:http://sqlblog.com/blogs/hugo_kornelis


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 03-01-2008, 02:48 PM
Ed Murphy
 
Posts: n/a
Default Re: Avg() not including 0's

lee.richmond wrote:

> This was a problem with a group by:
>
> select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
> group by [kw id], [bulk sd cr]
> order by [bulk sd] desc
>
> Screwed up my averages. For it to be proper, it's
>
> select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
> group by [kw id]
> order by [bulk sd] desc
>
> However, the reason I had [bulk sd cr] in the group by in the first
> place is because I have a case statement for sorting like:
>
> order by case when @sortvar = 1 then [bulk sd cr]
> when @sortvar = 2 then [kw id]
> end desc
>
> My case statement only works if I have all the fields in that case
> statement also in the group by. This doesn't make sense to me - why
> should the order by work fine when it's not a case statement, but
> break when it is a case statement?


Presumably the original statement looked like this:

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id]
order by case when @sortvar = 1 then [bulk sd cr]
when @sortvar = 2 then [kw id]
end desc

which certainly won't work when @sortvar = 1, so SQL Server probably
decides to toss it out entirely.

Does this work, instead? Is it what you intended?

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id]
order by case when @sortvar = 1 then avg([bulk sd cr])
when @sortvar = 2 then [kw id]
end desc
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 03-01-2008, 02:48 PM
lee.richmond
 
Posts: n/a
Default Re: Avg() not including 0's

Thanks for the reply Ed. That's definitely what I have been trying to
do but it isn't working, which is what led me to get my convoluted
incorrect code above. The problem is that the case statement, and no
other changes, break this. So:

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id]
order by avg([bulk sd cr]) desc
end

Works perfectly. But change nothing but make the order by a case
statement:

select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
group by [kw id]
order by case when @sortvar = 1 then avg([bulk sd cr])
when @sortvar = 2 then [kw id] end desc
end

End I get an error saying "ORDER BY items must appear in the select
list if SELECT DISTINCT is specified." This doesn't make sense to me
because the only change is a case statement. Do you know why this is
happening?
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 03-01-2008, 02:48 PM
Erland Sommarskog
 
Posts: n/a
Default Re: Avg() not including 0's

lee.richmond (Richmolj@gmail.com) writes:
> select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$
> group by [kw id]
> order by case when @sortvar = 1 then avg([bulk sd cr])
> when @sortvar = 2 then [kw id] end desc
> end
>
> End I get an error saying "ORDER BY items must appear in the select
> list if SELECT DISTINCT is specified." This doesn't make sense to me
> because the only change is a case statement. Do you know why this is
> happening?


But that is a vital change. Read the error message again: do you see the
ORDER BY item in the SELECT list?

Then again, what is that DISTINCT doing there in the first place? I can't
see that it fills any function. Since you have a GROUP BY there cannot
be any duplicates anyway.

However, I think a nicer way to write it is:


SELECT [kw id], {bulk SD]
FROM (SELECT [kw id], avg([bulk sd cr]) as [bulk SD]
from data$
group by [kw id]) AS x
ORDER BY case when @sortvar = 1 then [bulk SD]
when @sortvar = 2 then [kw id]
end desc

That thing in parentheses is a derived table. Logically, it is a temp
table table within the query, but it's not computed that way; the
optimizer will optimize the query as a unit, so there is no overhead.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 03-01-2008, 02:48 PM
lee.richmond
 
Posts: n/a
Default Re: Avg() not including 0's

That actually makes perfect sense to me, good idea. I'll give it a
try. Thanks!
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 03-01-2008, 02:48 PM
Ed Murphy
 
Posts: n/a
Default Re: Avg() not including 0's

Erland Sommarskog wrote:

> SELECT [kw id], {bulk SD]
> FROM (SELECT [kw id], avg([bulk sd cr]) as [bulk SD]
> from data$
> group by [kw id]) AS x
> ORDER BY case when @sortvar = 1 then [bulk SD]
> when @sortvar = 2 then [kw id]
> end desc


Lee, what are the data types for [kw id] and [bulk SD]? If they're
different, then IIRC the CASE will run into weird type-conversion
issues as well.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 08:59 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com