The avalanche effect

A desirable feature of robust hashing algorithms is known as the avalanche effect. A small change in the input should result in a dramatic change in the output. Take for instance the following three examples using output redirection and the GNU md5sum utility present in most distributions of Linux:

$ echo "Hills Like White Elephants by Ernest Hemingway" | md5sum
86db7865e5b6b8be7557c5f1c3391d7a -

$ echo "Bills Like White Elephants by Ernest Hemingway" | md5sum
ccba501e321315c265fe2fa9ed00495c -
$ echo "Bills Like White Buffalo by Ernest Hemingway"| md5sum
37b7556b27b12b55303743bf8ba3c612 -

Changing a word to an entirely different word has the same result as changing a single letter: each hash is entirely different. This is a very desirable property in the case of, say, password hashing. A malicious hacker cannot get it close enough and then try permutations of that similar password. We will see in the following sections, however, that hashes are not perfect.