It uses a simple 32-bit checksum that can be updated from either end
(inspired by Mark Adler's Adler-32 checksum, documented in Tridgell
and Mackerras, "The Rsync Algorithm", see the URL
ftp://samba.anu.edu.au/pub/rsync.)  It is a lot like the Rabin-Karp
string matching algorithm, where every K characters the Rabin-Karp
function (in this case the rsync checksum) is computed and hashed for
later comparison against each offset in the second file.  I changed
their hash function a lot, and instead of using the characters
directly, I index into an array of random numbers to get more evenly
distributed bits.  Another property of the checksum that I make use
of, which they did not mention in their paper, is that given the
checksum for segment J..K and for segment K..L, you can construct the
checksum for segment J..L in O(1) time.
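
The properties described above can be illustrated with a minimal
sketch of an Adler-style checksum.  This is not the code used here:
the table contents, the 16-bit halves, and the names TABLE, checksum,
roll, combine and pack are all assumptions made for the example.  It
shows the table lookup, sliding the window by one byte, and joining
the checksums of two adjacent segments in constant time.

```python
import random

MASK = 0xFFFF  # each half of the 32-bit checksum is kept to 16 bits

# Hypothetical table: one pseudo-random 16-bit value per byte value,
# used instead of the raw byte to spread the bits more evenly.
_rng = random.Random(0)
TABLE = [_rng.randrange(1 << 16) for _ in range(256)]


def checksum(data):
    """Compute the (a, b) pair for a byte string from scratch."""
    a = b = 0
    for byte in data:
        a = (a + TABLE[byte]) & MASK  # plain sum of table values
        b = (b + a) & MASK            # sum of the running values of a
    return a, b


def roll(a, b, length, out_byte, in_byte):
    """Slide a window of `length` bytes forward by one byte in O(1)."""
    a_new = (a - TABLE[out_byte] + TABLE[in_byte]) & MASK
    b_new = (b - length * TABLE[out_byte] + a_new) & MASK
    return a_new, b_new


def combine(left, right, right_length):
    """Checksum of J..L from those of J..K and K..L, in O(1).

    Only the length of the right-hand segment is needed, because every
    term of the left segment gains that many extra counts in b.
    """
    a1, b1 = left
    a2, b2 = right
    a = (a1 + a2) & MASK
    b = (b1 + right_length * a1 + b2) & MASK
    return a, b


def pack(a, b):
    """Pack the two 16-bit halves into one 32-bit value."""
    return (b << 16) | a
```

With these definitions, combine(checksum(s[j:k]), checksum(s[k:l]),
l - k) equals checksum(s[j:l]) for any byte string s, and repeated
calls to roll give the same values as recomputing checksum at each
offset.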
