This is a discussion on BINARY_CHECKSUM algorithm within the SQL Server forums, part of the Microsoft SQL Server category; --> Hello, Do you know if the algorithm for the BINARY_CHECKSUM function in documented somewhere? I would like to use ...
| |||||||
| Register | FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hello, Do you know if the algorithm for the BINARY_CHECKSUM function in documented somewhere? I would like to use it to avoid returning some string fields from the server. By returning only the checksum I could lookup the string in a hashtable and I think this could make the code more efficient on slow connections. Thanks in advanced and kind regards, Orly Junior |
| |||
| I don't know where the algorithm is documented but I don't think it will help you. There are many more possible strings than checksums so there isn't a one-to-one correspondence between them. Unless your set of possible strings is constrained to a small set you couldn't guarantee to translate a checksum back into a string and you'd still have to maintain a lookup table of strings on the client side. So if your set of strings is small and constrained then you may as well invent your own codes. Alternatively, take a look at Huffman Coding or one of the zip compression algorithms. -- David Portas SQL Server MVP -- |
| |||
| Orly Junior (nomail@nomail.com) writes: > Do you know if the algorithm for the BINARY_CHECKSUM function in > documented somewhere? I would like to use it to avoid returning some > string fields from the server. By returning only the checksum I could > lookup the string in a hashtable and I think this could make the code > more efficient on slow connections. Risky business. The checksum algorithm is fairly simple-minded. I beleive it uses some xor mechanism. I don't have the references around right now, but I recall that SQL Server MVP Steve Kass demonstrated how some quite small changes could result in the same checksum. Better in such case, to augment the table with a timestamp column. Such a column is automatically updated every time SQL Server updates the row. So you could store the timestamp client side, and pass that value, if the table has the same value, there is no need for a refresh. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinf...2000/books.asp |
| ||||
| Orly, Here is what I believe it does for a single varchar column. It's not a particularly good hash function at all. create function binary_checksum_varchar ( @t varchar(1000) ) returns int as begin declare @b bigint set @b = 0 declare @c tinyint declare @s bit set @s = 0 declare @i int set @i = 1 while @i <= len(@t) begin set @c = ascii(substring(@t,@i,1)) set @b = @b / 16 * 16 + @b % 16 ^ @c / 16 set @b = @b * 16 + @c % 16 if @c >= 128 begin set @b = @b ^ 0xFF set @s = 1 - @s end set @b = @b % 0x0100000000 ^ @b / 0x0100000000 set @i = @i + 1 end if @s = 1 set @b = @b ^ 0xFFFFFFFF if @b >= 0x80000000 set @b = @b | 0xFFFFFFFF00000000 return @b end go You'll find more information if some of the threads here: http://groups.google.co.uk/groups?q=...kass+sqlserver Steve Kass Drew University Orly Junior wrote: > Hello, > > Do you know if the algorithm for the BINARY_CHECKSUM function in documented > somewhere? > I would like to use it to avoid returning some string fields from the > server. > By returning only the checksum I could lookup the string in a hashtable and > I think this could make the code more efficient on slow connections. > > Thanks in advanced and kind regards, > > Orly Junior > > > |
| Thread Tools | |
| Display Modes | |
|
|