This is a discussion on Avg() not including 0's within the SQL Server forums, part of the Microsoft SQL Server category; --> Hi, I have a sql query like this select avg([mycolumn]) from data where date > '1/5/08' and date < ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, I have a sql query like this select avg([mycolumn]) from data where date > '1/5/08' and date < '1/10/08' group by [mycolumn] order by [mycolumn] desc If all values within that average are numbers, I'm fine. If it's a 0 (not a null, a 0) it doesn't get averaged in. For instance, values 0,1,2 should produce an average of 1. (0+1+2)/3 = 1. But sql is returning a value as if my 0's were nulls and not factored in: (1+2)/2 = 1.5 Does anyone know why this is happening and how to fix it? |
| |||
| "lee.richmond" wrote: > > Hi, > > I have a sql query like this > > select avg([mycolumn]) from data where date > '1/5/08' and date < > '1/10/08' > group by [mycolumn] > order by [mycolumn] desc > > If all values within that average are numbers, I'm fine. If it's a 0 > (not a null, a 0) it doesn't get averaged in. For instance, values > 0,1,2 should produce an average of 1. > > (0+1+2)/3 = 1. > > But sql is returning a value as if my 0's were nulls and not factored > in: > > (1+2)/2 = 1.5 > > Does anyone know why this is happening and how to fix it? Lee, a simple test does not replicate your problem. The simple script below returns an average of 1.0 on SQL Server 7.0, 2000 and 2005. create table #t(v decimal(3,2)) insert into #t values (0) insert into #t values (1) insert into #t values (2) select avg(v) from #t drop table #t If the problem persists, then please post a repro script and the version of SQL Server you are using. -- Gert-Jan |
| |||
| On Wed, 30 Jan 2008 13:18:59 -0800 (PST), lee.richmond wrote: >Hi, > >I have a sql query like this > >select avg([mycolumn]) from data where date > '1/5/08' and date < >'1/10/08' >group by [mycolumn] >order by [mycolumn] desc > >If all values within that average are numbers, I'm fine. If it's a 0 >(not a null, a 0) it doesn't get averaged in. For instance, values >0,1,2 should produce an average of 1. > >(0+1+2)/3 = 1. > > But sql is returning a value as if my 0's were nulls and not factored >in: > > (1+2)/2 = 1.5 > >Does anyone know why this is happening and how to fix it? Hi Lee, I was unable to reproduce this behaviour. Can you post some code (i.e. a full repro script: CREATE TABLE statements, INSERT statements, and the offending query) that I can run on my test server that does show this behaviour on your machine? I suspect something else is biting you, but I have to see a repro to find out what it is. -- Hugo Kornelis, SQL Server MVP My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis |
| |||
| Thanks for the quick responses. I know what was happening here but unfortunately it brings me to another problem. This was a problem with a group by: select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id], [bulk sd cr] order by [bulk sd] desc Screwed up my averages. For it to be proper, it's select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id] order by [bulk sd] desc However, the reason I had [bulk sd cr] in the group by in the first place is because I have a case statement for sorting like: order by case when @sortvar = 1 then [bulk sd cr] when @sortvar = 2 then [kw id] end desc My case statement only works if I have all the fields in that case statement also in the group by. This doesn't make sense to me - why should the order by work fine when it's not a case statement, but break when it is a case statement? On Jan 30, 4:52 pm, Hugo Kornelis <h...@perFact.REMOVETHIS.info.INVALID> wrote: > On Wed, 30 Jan 2008 13:18:59 -0800 (PST), lee.richmond wrote: > >Hi, > > >I have a sql query like this > > >select avg([mycolumn]) from data where date > '1/5/08' and date < > >'1/10/08' > >group by [mycolumn] > >order by [mycolumn] desc > > >If all values within that average are numbers, I'm fine. If it's a 0 > >(not a null, a 0) it doesn't get averaged in. For instance, values > >0,1,2 should produce an average of 1. > > >(0+1+2)/3 = 1. > > > But sql is returning a value as if my 0's were nulls and not factored > >in: > > > (1+2)/2 = 1.5 > > >Does anyone know why this is happening and how to fix it? > > Hi Lee, > > I was unable to reproduce this behaviour. Can you post some code (i.e. a > full repro script: CREATE TABLE statements, INSERT statements, and the > offending query) that I can run on my test server that does show this > behaviour on your machine? > > I suspect something else is biting you, but I have to see a repro to > find out what it is. > > -- > Hugo Kornelis, SQL Server MVP > My SQL Server blog:http://sqlblog.com/blogs/hugo_kornelis |
| |||
| lee.richmond wrote: > This was a problem with a group by: > > select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ > group by [kw id], [bulk sd cr] > order by [bulk sd] desc > > Screwed up my averages. For it to be proper, it's > > select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ > group by [kw id] > order by [bulk sd] desc > > However, the reason I had [bulk sd cr] in the group by in the first > place is because I have a case statement for sorting like: > > order by case when @sortvar = 1 then [bulk sd cr] > when @sortvar = 2 then [kw id] > end desc > > My case statement only works if I have all the fields in that case > statement also in the group by. This doesn't make sense to me - why > should the order by work fine when it's not a case statement, but > break when it is a case statement? Presumably the original statement looked like this: select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id] order by case when @sortvar = 1 then [bulk sd cr] when @sortvar = 2 then [kw id] end desc which certainly won't work when @sortvar = 1, so SQL Server probably decides to toss it out entirely. Does this work, instead? Is it what you intended? select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id] order by case when @sortvar = 1 then avg([bulk sd cr]) when @sortvar = 2 then [kw id] end desc |
| |||
| Thanks for the reply Ed. That's definitely what I have been trying to do but it isn't working, which is what led me to get my convoluted incorrect code above. The problem is that the case statement, and no other changes, break this. So: select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id] order by avg([bulk sd cr]) desc end Works perfectly. But change nothing but make the order by a case statement: select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id] order by case when @sortvar = 1 then avg([bulk sd cr]) when @sortvar = 2 then [kw id] end desc end End I get an error saying "ORDER BY items must appear in the select list if SELECT DISTINCT is specified." This doesn't make sense to me because the only change is a case statement. Do you know why this is happening? |
| |||
| lee.richmond (Richmolj@gmail.com) writes: > select distinct [kw id], avg([bulk sd cr]) as [bulk SD] from data$ > group by [kw id] > order by case when @sortvar = 1 then avg([bulk sd cr]) > when @sortvar = 2 then [kw id] end desc > end > > End I get an error saying "ORDER BY items must appear in the select > list if SELECT DISTINCT is specified." This doesn't make sense to me > because the only change is a case statement. Do you know why this is > happening? But that is a vital change. Read the error message again: do you see the ORDER BY item in the SELECT list? Then again, what is that DISTINCT doing there in the first place? I can't see that it fills any function. Since you have a GROUP BY there cannot be any duplicates anyway. However, I think a nicer way to write it is: SELECT [kw id], {bulk SD] FROM (SELECT [kw id], avg([bulk sd cr]) as [bulk SD] from data$ group by [kw id]) AS x ORDER BY case when @sortvar = 1 then [bulk SD] when @sortvar = 2 then [kw id] end desc That thing in parentheses is a derived table. Logically, it is a temp table table within the query, but it's not computed that way; the optimizer will optimize the query as a unit, so there is no overhead. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx |
| ||||
| Erland Sommarskog wrote: > SELECT [kw id], {bulk SD] > FROM (SELECT [kw id], avg([bulk sd cr]) as [bulk SD] > from data$ > group by [kw id]) AS x > ORDER BY case when @sortvar = 1 then [bulk SD] > when @sortvar = 2 then [kw id] > end desc Lee, what are the data types for [kw id] and [bulk SD]? If they're different, then IIRC the CASE will run into weird type-conversion issues as well. |
| Thread Tools | |
| Display Modes | |
|
|