Computerworld - Whenever we send data -- whether it's audio signals over a phone line, a data stream or a legal document -- to someone else, we need to know that what arrives on the other end is identical to what we sent. Similarly, whenever we store data on disk or tape, we need assurance when we retrieve it that it hasn't been altered. Accurate data is absolutely essential for computations, record keeping, transaction processing and online commerce.
Unfortunately, storing and transmitting data both involve the actions of physical entities in the real world: electrons, photons, atoms, molecules, wires, contacts and more. This means there's always some degree of uncertainty because background noise is ever present in our physical universe and might alter or corrupt any given data bit.
Error Detection
Early in the computer revolution, some powerful techniques were developed first to detect and later to correct errors in data. The most obvious, and perhaps least efficient, way to find data changes is to repeat each unit of data multiple times and then compare the copies. This method is so inefficient that it's not used for error detection -- though the same idea is used in RAID-1 (disk mirroring) for fault tolerance.
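As a rough illustration of the repeat-and-compare idea, here is a minimal Python sketch. The function names and the choice of three copies are assumptions made for this example and do not describe any particular system.

```python
def send_with_repetition(data: bytes, copies: int = 3) -> list[bytes]:
    """Return several identical copies of the data, standing in for repeated transmission."""
    return [bytes(data) for _ in range(copies)]


def copies_agree(received: list[bytes]) -> bool:
    """Report whether every received copy matches the first one."""
    return all(copy == received[0] for copy in received)


received = send_with_repetition(b"hello")
print(copies_agree(received))  # True: nothing was altered

# Simulate noise by flipping one bit in the second copy.
damaged = bytearray(received[1])
damaged[0] ^= 0x01
received[1] = bytes(damaged)
print(copies_agree(received))  # False: the copies no longer match
```

The cost is plain to see: three copies triple the amount of data that must be sent or stored just to detect a single flipped bit.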
The best-known error-detection method is called parity, where a single extra bit is added to each byte of data and assigned a value of 1 or 0, typically according to whether there is an even or odd number of "1" bits. The receiving system calculates what the parity bit should be and, if the result doesn't match, then we know that an odd number of bits (most likely just one) has been changed, but we don't know which bit is wrong. It's also possible that the data is entirely correct and the parity bit itself is garbled. If two bits, or any even number of bits, have been altered, however, the changes cancel out: the data will be wrong but the parity bit won't signal an error. (See Finding a 2-bit error for more details.)
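The parity check described above can be sketched in a few lines of Python. This assumes even parity (the parity bit makes the total count of 1 bits even), and the function names and sample byte are chosen only for illustration.

```python
def even_parity_bit(byte: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(byte & 0xFF).count("1") % 2


def parity_ok(byte: int, parity: int) -> bool:
    """Recompute the parity on the receiving side and compare it with the stored bit."""
    return even_parity_bit(byte) == parity


original = 0b01101001               # four 1 bits, so the parity bit is 0
parity = even_parity_bit(original)

one_bit_error = original ^ 0b00000100   # flip a single bit
two_bit_error = original ^ 0b00010100   # flip two bits

print(parity_ok(original, parity))       # True
print(parity_ok(one_bit_error, parity))  # False: the error is detected
print(parity_ok(two_bit_error, parity))  # True: the two changes cancel out
```

The last line shows the weakness noted in the text: the byte is wrong, yet the parity check passes.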
Two other established error-detection techniques are the checksum (add up the values of all the bytes or words in the message, document or program to produce a single sum) and the cyclic redundancy check (CRC), which treats groups of bits as a polynomial and uses division, not addition, keeping the remainder as the check value. Checksums and CRCs are calculated before and after transmission or duplication and then compared. However, checksums and CRCs alone can't protect data against deliberate tampering, since the algorithms are public and it's possible to introduce intentional changes that these methods won't detect. A more secure approach uses cryptographic hash functions, one-way mathematical operations for which it is computationally infeasible to find a different message that produces the same result. Because anyone can recompute a plain hash, guarding against an attacker who can also replace the hash requires combining it with a secret key, as in a keyed hash (message authentication code) or a digital signature.
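To make the differences concrete, the following Python sketch compares a simple additive checksum, a CRC and a keyed cryptographic hash on the same message. The 16-bit byte-sum checksum, the sample messages and the key are assumptions made for illustration; `zlib.crc32`, `hmac` and `hashlib` are standard-library modules.

```python
import hashlib
import hmac
import zlib


def additive_checksum(data: bytes) -> int:
    """Sum all byte values and keep the low 16 bits (an illustrative checksum)."""
    return sum(data) & 0xFFFF


message = b"PAY 100 TO ALICE"
swapped = b"PAY 001 TO ALICE"   # same bytes in a different order

# A plain sum ignores byte order, so the change goes unnoticed.
print(additive_checksum(message) == additive_checksum(swapped))  # True

# CRC-32 is based on polynomial division and is sensitive to position.
print(zlib.crc32(message) == zlib.crc32(swapped))                # False

# Keyed hash (HMAC-SHA256): without the key, the tag cannot be recomputed.
key = b"shared-secret-key"      # hypothetical key known only to sender and receiver
tag = hmac.new(key, message, hashlib.sha256).digest()


def verify(data: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, data, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_tag)


print(verify(message, tag))  # True
print(verify(swapped, tag))  # False
```

A CRC catches the accidental reordering here, but anyone can recompute a CRC after altering the data on purpose. The keyed tag is what resists deliberate changes, provided the key stays secret.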

