Error detection and correction

All error-detection and correction methods only work below a certain error rate

If we allow any number of errors in data bits and in check bits, then no error-detection (or correction) method can guarantee to work, since any valid pattern can be transformed into any other valid pattern.

All methods of error-detection and correction only work if we assume the number of bits changed by error is below some threshold. If all bits can be changed, no error detection method can work even in theory. All methods only work below a certain error rate. Above that rate, the line is simply not usable.

General strategy: Coding scheme so that a small no. of errors in this block won't change one legal pattern into another legal pattern:

Frame or codeword length n = m (data) + r (redundant or check bits).
Any data section (length m) is valid (we allow any 0,1 bitstring).
Not every codeword (length n) is valid.
Make it so that:
(no. of valid codewords) is a very small subset of (all possible 0,1 bitstrings of length n)
so very unlikely that even large no. of errors will transform it into a valid codeword.
Price to pay: Lots of extra check bits (high r).
Obviously this works up to some error rate - won't work if no. of errors is large enough (e.g. = n).
Error-check says "I will work if less than p errors in this block"

If errors still getting through: Reduce block size, so will get less errors per block.

Reduce m. Reduce the amount of info you transmit before doing error-checking. This can vastly reduce the probability of multiple errors per block.
e.g. Say we have average 1 error per 1000. Transmit blocks of 10000. Average 10 errors each.
Transmit blocks of 10. Average 1 error per 100 blocks. Almost never 2 errors in a block.

3.2.1 Error-correcting codes

Frame or codeword length n = m (data) + r (redundant or check bits).
All 2^m patterns are legal.
Not all 2ⁿ patterns are legal.

Basic idea: If illegal pattern, find the legal pattern closest to it. That might be the original data (before errors corrupted it).

Given two bitstrings, XOR gives you the number of bits that are different. This is the Hamming distance.
If two codewords are Hamming distance d apart, it will take d one-bit errors to convert one into the other.

To detect (but not correct) up to d errors per length n, you need a coding scheme where codewords are at least (d+1) apart in Hamming distance. Then d errors can't change into another legal code, so we know there's been an error.
To correct d errors, need codewords (2d+1) apart. Then even with d errors, bitstring will be d away from original and (d+1) away from nearest legal code. Still closest to original. Original can be reconstructed.

Error-detection example: Parity bit

Recall modems.
Appended to data so that no. of 1 bits is even (or odd).
Codeword distance 2.
Can detect (but not correct) 1 error.
Can't detect 2 errors.

Error-correction example: Sparse codewords

Let's say only 4 valid codewords, 10 bits:

Minimum distance 5.
Can detect and correct 1,2 errors.
Can detect but not correct 3,4 errors.
Can't detect 5 errors.

Note: Can detect some (even most) 5 bit errors.
Just can't guarantee to detect all 5 bit errors.

e.g. Encode every 2 bits this way.
Need 5 times the bandwidth to send same amount of data.
But can communicate even if almost every other bit is an error (4 out of 10).

Theoretical limit of 1-bit error-correction

Detect and correct all 1 errors.
Need distance 3.

We need to have 2^m legal messages.
If change 1 bit, must get illegal (and an illegal which is 1 bit away from this message, but not 1 bit away from any other legal message).
So each of the 2^m must have n illegal codewords at distance 1 from it (systematically change each bit). These are my illegals, can't overlap with the illegals that are 1 bit away from other patterns.
Each message needs (n+1) patterns reserved for it.
(n+1) 2^m <= 2ⁿ
(n+1) <= 2^n-m
(m+r+1) <= 2^r

For large r, this is always true.
i.e. This is putting a limit on how small r can be.
e.g. Let m=64. Then we need:
r+65 <= 2^r
Remember powers of 2.
i.e. r >= 7

What block size?

It scales well. Let m=1000. then r=10.
Question is, will 1000 bits only have 1 error?

e.g. Could send 1 M bits, need only 20 check bits to error-correct 1 bit error! But if there are 2 errors the system fails.

If errors getting through: Reduce m until almost never get more than 1 error per block.

Hamming code

Error-detection (and re-transmit) v. Error-correction

Example

Errors isolated. 1 in 10⁶.
Block size 1000 bits.
i.e. Error on average 1 bit every 1000 blocks.

To do error-correction on 1000 bit block, need 10 check bits (2¹⁰=1024).
1 M of data needs overhead of 10,000 check bits.
To just error-detect a block with a 1 bit error, need 1 parity bit.
1 M of data needs 1,000 check bits.
Once every 1000 blocks (1 M), 1 block needs to be re-transmitted (extra 1001).
1 M of data needs overhead 2,001 bits.

Q. For what error rate are they equal?

3.2.2 Error-detecting codes

Parity bit for 1 bit error detection

Parity bit can detect 1 error. Won't detect 2. Will detect 3, won't detect 4, etc.
If large block has 1 parity bit and is badly damaged, the odds of detecting this are only 0.5.

Parity bit for n bit burst error detection

Each block is considered as matrix, width n, height k.
Parity bit calculated for each column.
Last row of parity bits appended: (parity bit for col 1, col 2, ..., col n)
Transmit n(k+1) block row by row.

Any burst of length up to n in the data bits will leave at most 1 error in each col.
Can be detected.
If burst is in the parity bits row, they will be wrong and we will detect there has been an error.
Note: Don't know where error was (this is detection not correction). Will ask for re-transmit (don't know that data bits were actually ok).

Q. What about burst of length up to n partly in last row of data and partly in the parity bits row?

For longer bursts, damaging the whole block:

Prob. of any given damaged column having correct parity by chance = 1/2
Prob. of all columns having correct parity by chance = (1/2)ⁿ
Reasonable chance we'll detect it.
(If every parity bit in last line ok, it is prob. because there was no error, rather than by chance.)

Isolated errors

Isolated errors more difficult:
Any 2-bit error, where errors come at time t and t+n, will not be detected.