Why should I use Checksum on my database?



Why should I use Checksum on my database? Or rather should I switch it off and gain better performance? This is a very common ambiguity while some one tries to do a trade off between performance and a very useful feature of SQL Server.

First step is to understand CHECKSUM itself to answer this question. CHECKSUM is new feature in SQL2005 that provides a stronger mechanism than torn-page to detect any corruptions in IO/Disc subsystem which may eventually cause loss of data/corrupt a database.

Between the time a database page is written to the disk and subsequently read by SQL Server, it may get corrupted for reasons that are outside the control of SQL Server. Checksum mechanism allows SQL Server to detect the database page corruption under such situations.
 
 
So the advantage of having this mechanism on is to detect any potential issue with I/O (like page corruption where data file is residing) beforehand and then take reactive steps to minimize the hazards of possible data loss.  However checksum does not offer any repair mechanism directly. So once a corruption is detected, administrator is expected to take reactive steps to prevent database corruption/minimize data loss.

When enabled, SQL Server computes the checksum of the page both when a page is written and when a page is read, assuming it has checksum on it. SQL Server computes the checksum based on the bit pattern on the page, stores it in the page header and then issues an I/O to write the page. 
 When the SQL Server reads the page, it re-computes the checksum using the same logic and then compares it with the value available in the page header. If the checksum value matches then it is assumes the page did not get corrupted during the write-read cycle. The algorithm to compute the checksum is complex and therefore expensive to compute. Since the cost of computing the checksum is incurred on each page read and write, it can add to the CPU overhead and can possibly impact the throughput of the server. Another thing to keep in mind is that the checksum is not unique for a specific bit pattern on the page. Two pages can possibly map to the same checksum value. So there is remote possibility that page corruption may go undetected.

From my real life experience, enabling checksum is not really that much over head for a database. However if you are really worried on the possible performance hit, you may want to refer http://sqlblog.com/blogs/linchi_shea/archive/2007/01/16/performance-impact-of-enabling-page-checksum-and-default-trace.aspx for a case study details.

To conclude, I always recommend switching on Checksum option on the databases (as it is on by default for SQL Server, rather I recommend not switching off this option) because it is not really not a very serious overhead but have a larger benefit potential.

Comments