In a Hash: #1 - Cryptographic Hash Functions

Tom Doll
4 min read · Aug 10, 2018

The In a Hash educational series is designed to deconstruct the inner workings of cryptocurrency and public blockchain technology by offering introductory “bite-sized” system design definitions and explanations.

In order to truly understand public blockchain technology, and cryptocurrencies in general, it is important to have a basic understanding of the underlying concepts and functions upon which these systems are built. Therefore, this first post in the In a Hash educational series will attempt to “lay the foundations” by outlining a type of mathematical function, known as a cryptographic hash function, that is utilized in almost all aspects of public blockchain technology.

Cryptographic Hash Functions

Cryptographic hash functions can be somewhat tricky to understand for non-computer science majors like myself, so I will simplify things as best I can. Firstly, let me define a general hash function as a mathematical algorithm often used in computing to map data of any size to data of a fixed size. Regardless of the amount of data that is “fed” into a hash function, the data string that comes out the other side always has the same predetermined number of alphanumeric characters. For example:

The returned data strings, which in this case consist of 40 alphanumeric characters, are commonly known as hash values, hash sums, hash digests, or hashes. As the output hash is generally much smaller than the input data, it’s easiest to think of a hash as a compact fingerprint of the original data. Unlike a zip file on your computer, however, a hash cannot be expanded back into the data it was made from.
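The fixed-size behaviour can be sketched in a few lines of Python. This post does not name the hash function behind its examples, so SHA-1 is an assumption here, picked only because its hexadecimal digest happens to be 40 characters long; the helper name sha1_hex is likewise made up for illustration.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    # Assumption: SHA-1 is used only because its hex digest is 40 characters.
    return hashlib.sha1(data).hexdigest()


short_input = b"Hi"
long_input = b"A much longer input, such as a school history essay. " * 200

for label, data in (("short", short_input), ("long", long_input)):
    digest = sha1_hex(data)
    # The input sizes differ enormously, yet both digests have the same length.
    print(f"{label}: {len(data)} bytes in -> {len(digest)} characters out: {digest}")
```

Both lines of output end in a 40-character hash, even though one input is two bytes long and the other is several thousand.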

Public blockchain technology uses a specific type of hash function known as a cryptographic hash function. As with the general algorithm, the cryptographic hash function maps data of any size to data of a fixed size, but it is also designed so that reversing the process is computationally infeasible.

For example, it is not feasible to look at 8CjqTt9Tv5YNSVsSsXC976ApTz879z1Q355jUl41 and recreate my old 1000 word school history essay. In fact, recovering the original input data from the output hash is only practical via what is known as a brute-force search, which entails “feeding” all the possible input combinations into the cryptographic hash function until a matching hash is produced. As you can imagine, this is not an easy or quick task.
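A brute-force search is easier to picture on a deliberately tiny input space. The sketch below assumes the secret is a 4-digit PIN and uses SHA-256 from Python's standard library; the PIN scenario and the function names are illustrative assumptions, not something taken from a real system.

```python
import hashlib
from itertools import product
from string import digits
from typing import Optional


def sha256_hex(text: str) -> str:
    """Hash a text string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def brute_force_pin(target_hash: str, length: int = 4) -> Optional[str]:
    """Try every digit string of the given length until one hashes to target_hash."""
    for candidate in product(digits, repeat=length):
        guess = "".join(candidate)
        if sha256_hex(guess) == target_hash:
            return guess
    return None  # No digit string of this length produces the target hash.


secret_hash = sha256_hex("4821")  # Pretend only this hash is known.
recovered = brute_force_pin(secret_hash)
print(recovered)  # prints 4821
```

Only 10,000 candidates exist here, so the search finishes almost instantly. For an input as long as an essay the number of candidates is astronomically large, which is what makes the search impractical.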

A cryptographic hash function has several key characteristics:

1. The same input data always produces the same output hash

2. A small change to the input data alters the output hash comprehensively

3. Hashes are effectively unique, or “collision-resistant” (i.e. although two different data inputs can in principle share the same output hash, it is not feasible to find such a pair)

4. Given any input data string the hash function computation is relatively fast

5. It is not feasible to recreate the input data from its output hash except by a brute-force search through all the possible input combinations until a matching hash is produced, thus identifying the original input data

To illustrate points 1 & 2 please see the diagram below:
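Points 1 and 2 can also be checked directly in code. This is a minimal sketch assuming SHA-256 and an arbitrary sample phrase; the helper differing_bits is a hypothetical name introduced here to count how many output bits change.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash a text string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("In a Hash")
again = sha256_hex("In a Hash")    # Identical input.
changed = sha256_hex("In a hash")  # One letter changed from upper to lower case.

print(first == again)    # prints True: same input, same hash (point 1)
print(first == changed)  # prints False: small change, different hash (point 2)
print(differing_bits(first, changed), "of 256 bits differ")
```

For a well-designed function, roughly half of the output bits are expected to flip after even a one-character change, which is why the two hashes look unrelated.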

Applications

Cryptographic hash functions are useful in a variety of different applications. For example, they are used to verify the integrity of data files or messages. By calculating and comparing the hash digests of a file before transmission and after receipt, any alterations to the original file may be detected.
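The before-and-after comparison might look like the sketch below. It assumes SHA-256 and simulates the transfer with a temporary file; the file name, the file contents and the helper file_sha256 are all invented for the example.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "essay.txt")  # Hypothetical file name.
    with open(path, "wb") as handle:
        handle.write(b"My old school history essay.")
    digest_before = file_sha256(path)  # Computed by the sender.

    # Simulate an alteration in transit by appending a single byte.
    with open(path, "ab") as handle:
        handle.write(b"!")
    digest_after = file_sha256(path)   # Computed by the receiver.

file_was_altered = digest_before != digest_after
print(file_was_altered)  # prints True
```

Note that the check only works if the receiver obtains the original digest through a channel the attacker cannot also modify.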

In a similar fashion, most digital signature systems utilize a cryptographic hash function to verify message integrity. This topic will be discussed at greater length in a later In a Hash educational series post.

In addition, hashes of data files are often used within computer software for data indexing and swift data lookup. A file’s hash serves as a reliable and efficient file identifier saving data storage space and reducing lookup time due to its small size.
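One way to picture a hash acting as a file identifier is a small content-addressed store. The ContentStore class below is a hypothetical illustration built on a plain Python dictionary with SHA-256 keys, not the design of any particular piece of software.

```python
import hashlib


class ContentStore:
    """Keep each distinct piece of data once, keyed by its SHA-256 hash."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return the hash that identifies it."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data  # Identical data always maps to the same key.
        return key

    def get(self, key: str) -> bytes:
        """Look data up by its hash."""
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
key_one = store.put(b"quarterly report")
key_two = store.put(b"quarterly report")  # Same content stored a second time.

print(key_one == key_two)  # prints True
print(len(store))          # prints 1
```

Because identical content always produces the same key, the duplicate is stored only once, and the 64-character key can stand in for the file wherever it needs to be referenced.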

Finally, because cryptographic hashes hide the original input data, they are utilized in the majority of password verification systems (e.g. online banking login) as storing passwords in “clear text” poses a security risk should a password file be compromised.
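A password check along these lines can be sketched with Python's standard library. Two things in the sketch go beyond what this post describes and are assumptions on my part: a random salt is added to each password, and the hash is repeated many times with PBKDF2, both of which are common ways to slow down brute-force guessing. The iteration count and function names are illustrative only.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # Assumption: an illustrative work factor, not a recommendation.


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_hash); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # Fresh random salt for each password.
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(attempt: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the login attempt with the stored salt and compare the results."""
    _, derived = hash_password(attempt, salt)
    return hmac.compare_digest(derived, stored_hash)  # Constant-time comparison.


salt, stored_hash = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored_hash))  # prints True
print(verify_password("wrong guess", salt, stored_hash))                   # prints False
```

The system never needs to keep the password itself: at login it hashes whatever was typed and compares the result with the stored hash.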

Coming full circle, cryptographic hash functions also play a very important role in public blockchain technology, where they are utilized as “hash pointers” to help establish a link between chunks, or blocks, of data. This is in fact where blockchain technology gets the “chain” in its name. Hash pointers will be discussed in detail in the next In a Hash educational series post.
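As a small preview, the linking idea can be sketched as a list of blocks in which each block stores the hash of the block before it. Everything specific here is an assumption made for illustration: the field names data and previous_hash, the JSON encoding, the all-zero starting value and the use of SHA-256. Real blockchains use their own block formats.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(block: Dict[str, str]) -> str:
    """Hash a block's full contents, including its pointer to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: List[str]) -> List[Dict[str, str]]:
    """Link blocks so that each one records the hash of its predecessor."""
    chain = []
    previous_hash = "0" * 64  # Assumption: placeholder pointer for the first block.
    for payload in payloads:
        block = {"data": payload, "previous_hash": previous_hash}
        chain.append(block)
        previous_hash = block_hash(block)
    return chain


def chain_is_valid(chain: List[Dict[str, str]]) -> bool:
    """Check that every hash pointer still matches the block it points to."""
    for earlier, later in zip(chain, chain[1:]):
        if later["previous_hash"] != block_hash(earlier):
            return False
    return True


chain = build_chain(["block A", "block B", "block C"])
print(chain_is_valid(chain))  # prints True

chain[0]["data"] = "tampered"  # Alter an early block after the fact.
print(chain_is_valid(chain))  # prints False
```

Changing an early block changes its hash, so the pointer stored in the next block no longer matches and the tampering becomes visible.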
