October 1st, 2018 at 11:48:17 PM
Finding Consensus via the Blockchain
The Purpose of Blocks
A block on a blockchain is a snapshot the current state of a system. For instance, in Bitcoin, a mined block represents the concrete balance recorded for every transaction created throughout the entire history of its existence. The past states are recorded by each block's parents, grandparents, great grandparents, etc--where each block represents just the changes in state between each snapshot in time. In this way, the current state of the network can be replicated by replaying each block's changes in sequence. Each of these blocks (or "snapshots") are tied together by included an irrefutable identifier of its parent (its "hash"), where any change--no matter how miniscule--would inherently change its identifier. This property is what inherently grants a blockchain its immutability.
However, what if I were to tell you that despite its inherent immutability, a blockchain's history can indeed be rewritten? Well, I wouldn't be lying to you--in fact, despite the immutability of blockchain and the magic of hashes, this is actually a true statement. A block chain, in fact, isn't perfectly immutable only because it can trace back its ancestry. The factors that make a blockchain immutable are more complex than just a trail of blocks.
You might be thinking to yourself, "Well, Josh, if a blockchain's history can be rewritten, what's preventing you from purchasing an awesome Tesla Model S with Bitcoin, then rewriting history tomorrow so your Bitcoin never left your wallet?" This is a great question, because, obviously, I can't do this--in fact, no one can do this--but the answer to why no one can do this is often misunderstood, and I'll try to explain why.
The Purpose of Proof of Work
So we talked about the purpose of blocks: they're snapshots representing the change of state for the entire network. However, what prevents someone from hammering out a new block every second? The mechanism that regulates block creation is called Proof of Work. In order to create a block, I have to find a number that, when appended to the rest of the contents of a block, makes its hash start with a few zeroes. For instance, say I wanted to create a block that only contained the word apple; the hash of apple is 1f3870be274f6c49b3e31a0c6728957f. Well, that hash doesn't start with any zeroes at all... so let me add a space to the end of it, so my block becomes apple . Well, apple hashes to e307c4bc0265c934316dbd59f86336fd which also doesn't have any zeroes at all in the front... okay, well, how many spaces do I have to append so its hash start with just single a zero? Well, the only way for me to find out is to keep appending spaces to the end of the word, apple. Well, 19 spaces later, I finally have a hash that starts with a single zero. apple hashes to 0203b387ab3da8426f3017f3706cb400 ...great, well, that's not exactly a whole lot of zeroes, so let's keep trying. How many spaces to the end of "apple" do I need to add so that its hash starts with 5 zeroes?
Well, just to find 4 zeroes, it only took me 26,711 spaces which corresponds to 26,711 attempts. When trying to find 5 zeroes, it took 1,270,848 spaces, which took my state-of-the-art laptop over 24 minutes to do. That's a lot of work. It's this "work" that limits the ability for a computer to mine a block. Currently, Bitcoin Cash requires 18 zeroes to find a block, which roughly corresponds to 2,421,181,701,544,813,000,000 attempts. The number of zeroes required to find a block scales with each person attempting to mine a block. Therefore, this difficulty scales to correspond to the number of computers trying to create a new block. (Side note: the hashes I've used here are actually non-cryptographic hashes... which means finding a hash is very very fast, whereas Bitcoin Cash uses cryptographic hashes, which require much more processing power to find... so not only does it require an obscene number of attempts, each attempt is substantially more time consuming than a simple "MD5" hash.)
Public vs Private Blockchain
With a public blockchain, many, many computers are competing to find the next block. This is why Bitcoin Cash requires so many attempts to find a block--the sheer number of people (and specialized hardware) attempting to find a block is massive. The fact that so many people are attempting to mine a block actually secures the network; since finding a block is an expensive process, the costs have to be subsidized by the reward contained within the block--and since invalid blocks do not award any prize, attempting to fraud the network becomes cost-prohibitive.
However, private blockchains are not mined by millions of computers. In fact, it can be as little as 3 (but usually more). When few computers are mining blocks, the difficulty to find a new block is much less difficult, so instead of 2.4x1021 attempts to find a block, it's only about 4,294,967,296.
I promised to explain how a blockchain could have its history rewritten. Let's take the following example.
Let's say we have a private blockchain with 3 miners, collectively performing about 7,158,278 hashes per second. Working together, these three miners find a new block about once every 10 minutes.
Now let's assume the entity mining this chain wants to undo some history; let's assume they want to lie about the past. This entity wants to change the total revenue they claimed from January, and it is now October. How would they attempt to do this?
To accomplish this, they have to change one of their blocks from January so that it appears they've recorded a lesser amount of revenue. But changing a block from January also means they need to modify the rest of the year's blocks so that they reference the "new" January block.
Changing a block's parent, or grandparent, or great grandparent, etc, would change their hash. Recall our original example: we had to add 1,270,848 spaces to the word "apple" so that its hash had five zeroes. However, when recreating each block, we're no longer adding spaces to the word "apple", instead, it's the word "acre". While adding 1,270,848 to the word "apple" created a hash with five zeroes, adding 1,270,848 spaces to "acre" hashes to 85555edf72e732b66d4b252d5be60030 ...which doesn't have any zeroes at all until the last 4 digits.
Because of this, every block must be recomputed. This is an arduous task, but it's doable. In fact, at 1 block per 10 minutes, changing a value from 10 months ago would require rewriting about 2,160,000 blocks. With their 3 miners, it would take them 10 months to rewrite that history. However, what if they bought 3 more computers and had all 6 computers attempting to rewrite history? Well, then it would only take them 6 months. Still a long time, but it's twice as fast! Let's assume that rewriting their revenue saves them millions of dollars, so it's cost effective to spend a lot of money on rewriting their history--so they buy a lot of computers. If they bought 1,000 computers, it would only take them less than a day to rewrite history! This whole private blockchain suddenly doesn't feel so immutable.
However, it can be if the entity publicly releases their block hashes.
In the case that the entity rewrote history, the hash of their newest block won't be the same as the hash of the original chain--it may have the same number of leading zeroes, but the rest of it would be vastly different. Therefore, if the entity publishes the tip of their block chain periodically, no matter how many computers they bought, they would not be able to produce a block with the same hash as the original chain--therefore, anyone possessing their original chain's most-recent block hash would be able to detect they've deviated from their originally claimed past.
The process that we recommend for publishing hashes involves signing the hashes with a key, and broadcasting them to a public blockchain network that has more hashing power than any single entity can feasibly amass. For instance, if the entity publishes their tip block's hash once per hour to the Bitcoin Cash network, the entity would have to accumulate more hashing power than 50% of the entire Bitcoin Cash (which is slightly more than 845,587,940,960 "computers" from the previous example) network just to be able to un-publish their hashes, which simply isn't possible.
Therefore, by periodically publishing the hashes of their private blockchain, any third party is able to trust the private blockchain's history by validating the checkpoints published to a public chain. This mechanism allows the entity the ability to retain ownership and control of their data while still being able to cryptographically prove immutability.