Blockchain Semantics Insights
Hash Functions And Its Applications
By Anurag Srivastava | May 26, 2018, 7:07 a.m. GMT
A hash function is a mathematical algorithm that takes an input of any length and returns a value of fixed length. The return value is usually referred to as the ‘hash value’ of the input, and the act of computing it is called hashing. There are many such algorithms, and they differ in output length, speed and security.
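An ideal cryptographic hash function is deterministic and one-way: the same input always produces the same fixed-length hash value, and different inputs produce, for all practical purposes, different hash values. What this means is that if you change even a single bit of an input and feed it to the hash function, the resulting hash value will be completely different. Because the output has a fixed length, two inputs that share a hash value (a collision) must exist, but with a good hash function finding one is computationally infeasible. On top of that, it is computationally infeasible to recover the input from its hash value. That's a pretty powerful feature. Let's explore its real-life uses.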
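To make these properties concrete, here is a minimal sketch using SHA-256 from Python's standard `hashlib` module. The helper name `sha256_hex` and the sample inputs are only illustrative choices, not something the article prescribes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed length: both digests are 64 hex characters (256 bits),
# even though one input is a million times longer than the other.
print(len(short_digest), len(long_digest))

# One-bit change: 'g' (0x67) and 'f' (0x66) differ in a single bit,
# yet the two digests look unrelated.
print(sha256_hex(b"anurag"))
print(sha256_hex(b"anuraf"))

# Deterministic: the same input always gives the same digest.
print(sha256_hex(b"anurag") == sha256_hex(b"anurag"))
```

Notice that nothing in the digest hints at what the input was or how long it was.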
What's the use of a hash function?
Password verification is one of the most common applications of a cryptographic hash function. If you have a bit of programming experience, you will know that storing users' passwords in plain text in your database is considered a cardinal sin. You don't store them because, if your database gets compromised, the attacker would be able to steal these login credentials. Worse, a user who is not careful may have used the same password on other sites too.
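So, how do you solve this? By storing not the actual password but its hash value in the database at the time of signup. The only purpose of a password is authentication. When a user logs in, you take the submitted password, feed it to the same hash function and compare the result with the hash value stored for that user. If the hashes match, you authenticate the user successfully. To make it more secure, you can add a salt to the password before hashing. Let's say a user wants to keep her password as “anurag”. You take “anurag”, add a salt (say, “complex”) and feed the combination to a hash function. You then store this hash value in the database, along with the salt so that you can repeat the computation at login. Ideally the salt is random and different for every user, so that two users with the same password get different hashes and an attacker cannot reuse a precomputed table of common password hashes. A salt does not need to be secret to do this job. What loses its value in an open source project is a single fixed salt hard-coded in the source, since anyone can read it. In practice, passwords are usually hashed with a deliberately slow function such as PBKDF2, bcrypt, scrypt or Argon2 rather than a single pass of a fast hash.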
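The signup and login flow can be sketched as follows. This is an illustration under a few assumptions, not a production recipe: it swaps the plain hash for PBKDF2 (available in Python's standard library as `hashlib.pbkdf2_hmac`), uses a random 16-byte salt per user instead of a fixed string, and the function names and the iteration count are hypothetical choices.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Signup: return the (salt, digest) pair to store for the user."""
    salt = secrets.token_bytes(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Login: re-hash the submitted password with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = hash_password("anurag")
print(verify_password("anurag", salt, stored))   # True
print(verify_password("anurag1", salt, stored))  # False
```

The database only ever holds the salt and the digest, so a leaked table does not directly reveal anyone's password.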
Protection against data tampering is another important application of hashing. Let's say you are reading a rental agreement sent to you by your landlord. These agreements typically run to several pages. After going through the agreement, you realize that your name is misspelt and the rest is fine. You send it back to your landlord asking for the correction. He/she corrects it and sends it back. What do you do now? Do you read the whole document again to ensure that nothing else has changed, or do you just verify your name? No one likes taking chances with agreements.
How do you deal with it smartly? You store the hash value of the first version of the agreement. When you receive the corrected version, you make a copy and change the name in that copy back to the misspelt name from the first version. You then calculate the hash value of this copy. If it matches the hash of the first version exactly, you can rest assured that nothing else has changed in the agreement.
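Here is a small sketch of that check. The agreement text, the names and the helper functions are all made up for illustration, and a real document would be read from a file rather than written inline.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the SHA-256 hash of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical agreement: the first version misspells the tenant's name.
original = "Tenant: Anurga Srivastava\nRent: 20000 per month\nNotice: 2 months\n"
corrected = "Tenant: Anurag Srivastava\nRent: 20000 per month\nNotice: 2 months\n"
tampered = "Tenant: Anurag Srivastava\nRent: 25000 per month\nNotice: 2 months\n"

# Stored when the first version was read in full.
original_hash = fingerprint(original)


def only_name_changed(new_version: str) -> bool:
    """Undo the name correction on a copy, then compare with the stored hash."""
    reverted = new_version.replace("Anurag Srivastava", "Anurga Srivastava")
    return fingerprint(reverted) == original_hash


print(only_name_changed(corrected))  # True: nothing else was touched
print(only_name_changed(tampered))   # False: the rent was changed as well
```

A single altered digit anywhere in the text is enough to make the hashes disagree.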
Blockchain uses the concept of hashing extensively. A block stores the hash value of the block header of the previous block. Using this, the current block links with its previous block. Because a block's header commits to the block's content, any change to the content of the previous block changes its header hash, and the link breaks. This is one of the key aspects of how a Blockchain network attains immutability.
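The linking idea can be sketched with a toy chain. This is a simplified model, not the data layout of any real network: the header fields `prev_hash` and `data_hash`, the JSON serialization and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder, since the first block has no parent


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def header_hash(header: dict) -> str:
    # Sorted keys make the serialization, and so the hash, reproducible.
    return sha256_hex(json.dumps(header, sort_keys=True).encode("utf-8"))


def build_chain(payloads: list[str]) -> list[dict]:
    """Create blocks whose headers store the hash of the previous header."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for payload in payloads:
        header = {
            "prev_hash": prev_hash,
            "data_hash": sha256_hex(payload.encode("utf-8")),
        }
        chain.append({"header": header, "data": payload})
        prev_hash = header_hash(header)
    return chain


def chain_is_valid(chain: list[dict]) -> bool:
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        header = block["header"]
        if header["data_hash"] != sha256_hex(block["data"].encode("utf-8")):
            return False  # content no longer matches its own header
        if header["prev_hash"] != prev_hash:
            return False  # link to the previous block is broken
        prev_hash = header_hash(header)
    return True


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
print(chain_is_valid(chain))  # True

# Rewrite the first block and even fix up its own header to match.
chain[0]["data"] = "alice pays bob 500"
chain[0]["header"]["data_hash"] = sha256_hex(chain[0]["data"].encode("utf-8"))
print(chain_is_valid(chain))  # False: the second block still stores the old hash
```

To hide the change, the attacker would have to rewrite every later block too, which is exactly what the network's consensus rules are designed to make impractical.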
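Blockchain also uses the concept of hashing in account creation. Remember how deriving a public key from its private key is a unidirectional process? You cannot extract a private key from a public key. Hashing adds a second one-way step: on networks such as Bitcoin and Ethereum, the account address is derived by hashing the public key, so the public key cannot be recovered from the address alone.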
Types of hash function
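A few important and popular hash functions are SHA256, KECCAK, RIPEMD and MD5. Many blockchain networks, Bitcoin among them, use the SHA256 algorithm. No practical attack against it is publicly known. It belongs to the SHA-2 family, which was designed by the US National Security Agency and is published by NIST as a US federal standard. KECCAK came into existence after SHA256.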
The National Institute of Standards and Technology (NIST) conducted a public competition to come up with a new hash algorithm. It was conceived after attacks had been found against the MD5, SHA-0 and SHA-1 hash functions. This step by NIST was a precautionary measure in case somebody breaks SHA256. However, no one has yet been successful in breaking it. KECCAK was the winner of this competition and was later standardized as SHA-3. We should not consider KECCAK a replacement for SHA256, as both are currently considered secure. If you still want extra protection, you can do a chained hash: feed your input to SHA256 and then feed its hash value to the KECCAK function. The chained value stays hard to reverse even if one of the two functions is later found to be invertible. It is not a complete safeguard, though: two inputs that collide under SHA256 will also collide under the chain.
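A chained hash takes only a couple of lines. One assumption to flag: this sketch uses `hashlib.sha3_256`, the NIST-standardized SHA-3, for the second step. The original KECCAK submission (the variant Ethereum calls Keccak-256) uses different padding and produces different digests, and it is not part of Python's standard library. The function name `chained_hash` is just an illustrative choice.

```python
import hashlib


def chained_hash(data: bytes) -> str:
    """Hash with SHA-256 first, then hash that digest with SHA3-256."""
    inner = hashlib.sha256(data).digest()      # 32 raw bytes
    return hashlib.sha3_256(inner).hexdigest()  # 64 hex characters


message = b"anurag"
print(hashlib.sha256(message).hexdigest())  # SHA-256 alone
print(chained_hash(message))                # SHA-256 followed by SHA3-256
```

The output is still a fixed 256 bits, so the chained value can be stored and compared exactly like a single hash.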
Hope you had a great time reading about hash functions. Can you think of any other application of a hash function? Let us know.