BASICS OF WORKING WITH SQL SERVER CHECKSUM
This Article talks about what SQL Server CHECKSUM and SQL Server BINARY CHECKSUM are and how they can be used effectively in SQL Server.
In simple terms, we can describe CHECKSUM as “a calculation performed on a sequence of characters or on a string to yield a single integer value, which will differ for every different sequence“. This would greatly help us in our day to day activities of comparing data in different systems or different set of tables, and also to retrieve data more quickly for a matched or unmatched set of records.
With respect to SQL Server, CHECKSUM would return integer value computed over a row in a table, or for a particular cell, or for an expression, or over a list of expressions. Or in other words CHECKSUM computes a hash value, called the checksum, over its list of arguments. Below is an example where we are taking a string as an argument for the checksum function.
select CHECKSUM('abcdef') as ChecksumValue
Create a temporary Table
Insert data for test
CREATE TABLE #ChecksumTest( col1 INT, col2 VARCHAR(10)) INSERT INTO #ChecksumTest VALUES(1,'abc') INSERT INTO #ChecksumTest VALUES (2,'bcd')
– Check the check sum value
SELECT CHECKSUM(*) AS ChecksumValue FROM #ChecksumTest
We can make use of check sum for equality searches. We have to keep one thing in mind over here while using this; the check sum value that we get is purely dependent on the order of columns mentioned. The point which is also to be noted over here is, If we are comparing two columns or two expressions, the NULL values of the same data type are considered as same!!!
Now, taking an example for the NULL value comparison, let us take the above temp table and insert two records with NULL values and then see what would be the results, frankly, this makes life easier as we have a good standard method to compare NULL’s!!!
We can compare NULLS Also
insert into #ChecksumTest values(3,NULL) select CHECKSUM(col2)as ChecksumValue from #ChecksumTest where col1 = 3
And the results for the above would be
insert into #ChecksumTest values(4,NULL) select CHECKSUM(col2)as ChecksumValue from #ChecksumTest where col1 = 4
And the results for the above would be
Extreme Fast Indexes with CHECKSUM
One of the major uses of check sum is for building Hash Indexes. Actually, the checksum computes a hash value depending upon the parameters/arguments we pass (In the example 2, we passed “*” all columns). If we build columns in a table purely with these hash values and if we index these columns, they are called Hash Indexes.
Hash Indexes can be effectively used for equality searches, so for example, whenever we have strings with length of more than 50 and we have to equality search on those columns (generally for speeding up queries we build Indexes, but Indexes are not recommended for length of >50), then we can build these Hash columns and can perform search based on this column, as Indexes on Integer columns are more effective than Indexes on character/ string columns.
Now, let us try the above by trying it out with an example, continuing with the above temporary table created…
ALTER TABLE #ChecksumTest ADD cs_col2 AS CHECKSUM(col2); GO CREATE INDEX NIX_Col2 ON #ChecksumTest (col2); GO
It returns a binary checksum value computed over a row of a table or over a list of expressions. This is generally used to check if the data in a particular row of a table has changed.
For example, continuing with the above created temporary table, run the script below and observe the difference before updating the records and after updating the records
SELECT BINARY_CHECKSUM(*)as BinaryChecksum from #ChecksumTest where col1=1; GO UPDATE #ChecksumTest set col2 = 'TEST'where col1 = '1'; GO SELECT BINARY_CHECKSUM(*)as BinaryChecksum from #ChecksumTest where col1=1; GO
CHECKSUM and BINARY_CHECKSUM are similar functions, but, return different values for the string and character data types. SQL Server settings can cause strings with different representation to compare equal.
Where we cannot Use Checksum
While using, one important point that has to be kept in mind would be that it is not compatible with text, ntext, image, cursor, xml. When used, the system may either ignore these columns or throws an error if these are the only columns given as arguments.