Anatomizing a blockchain consensus mechanism
A fundamental problem in large-scale distributed systems is how to achieve overall system reliability in the presence of failures. Systems need to be fault-tolerant. This requires a process by which distributed, often heterogeneous systems reach a consensus and agree on the network state, whether it is a database commit or an action to take. In this section, we will discuss two types of consensus algorithms: proof-of-work (PoW) and proof-of-stake (PoS).
What is consensus?
Consensus in a blockchain is the process by which a network of mutually distrusting nodes reaches an agreement on the global state of the chain of blocks. In a blockchain, transactions or data are shared and distributed across the network, and every node holds the same copy of the blockchain data. Consensus allows all of the network nodes to follow the same rules to validate transactions and add new blocks to the chain, and therefore keeps all copies of the blockchain uniform.
Sometimes, it is also called a consensus mechanism or consensus algorithm. A consensus mechanism focuses on the rules and incentives for the network to reach an agreement. A consensus algorithm is a formal procedure or computer program for solving a consensus problem, based on conducting a sequence of specified actions. It is designed to achieve reliability in a network involving multiple nodes. Consensus algorithms ensure that the next block in a blockchain is fully validated and secured. Multiple kinds of consensus algorithms currently exist, each with different fundamental processes. Different blockchain platforms may implement different consensus mechanisms. In this section, we will focus on the following two popular algorithms, show how they work, and discuss the pros and cons of each mechanism:
- PoW: The term was coined and formalized in a 1999 paper by Markus Jakobsson and Ari Juels, and the approach was popularized by Satoshi Nakamoto in the Bitcoin whitepaper. It was adopted by many other blockchains, including Ethereum 1.0. PoW is a mining process whose purpose is to find an answer to a cryptographic hashing problem. To do so, the miner follows the block selection rules to locate the previous block and uses the hash of the previous block header, together with the Merkle root of the transactions in the new block, to solve the hashing problem. This requires considerable computation and hashing power. In Bitcoin, block selection rules specify that the longest chain wins.
- PoS: This consensus algorithm selects network nodes to propose new blocks using random selection weighted by their wealth or coin age (the stake). Instead of miners competing to solve energy-consuming cryptographic puzzles, the network uses a pool of validators. Validators are network nodes that are willing to stake their cryptocurrency on the new block that they claim should be added to the public blockchain.
Let us get into the details of how PoW and PoS actually work in the following subsections.
Proof-of-work
Proof-of-work, also referred to as PoW, is the most widely known consensus algorithm, used by blockchains and cryptocurrencies such as Bitcoin and Ethereum 1.0, each with its own variations. We will talk about the specific implementation of PoW in Bitcoin and Ethereum in later sections.
How PoW works
In terms of protocol design, PoW is a computation-intensive game among all miners in the network. The problem to be solved is a cryptographic puzzle. The game is driven by an incentive system that rewards the winners with cryptocurrency, such as bitcoins, for contributing new blocks to the blockchain. As shown in the following figure, miners collect pending transactions from the transaction pool and race against each other to solve the cryptographic puzzle. The miner that solves the puzzle creates the new block and publishes it to the network for verification by other nodes. Once it is verified, all nodes add the new block to their own copy of the blockchain:
Figure 1.14 – How PoW works
The cryptographic puzzle that miners race to solve is identifying the value of the nonce. A nonce is an attribute in the block header structure. Each miner starts by guessing a number and checks whether the resulting hash value is less than the blockchain-specific target. Bitcoin uses the SHA-256 algorithm for this, which outputs a fixed-length 256-bit number. In Bitcoin, the nonce is a 32-bit field, and every number between 0 and 2^32 has the same chance of solving the puzzle, so a practical approach is to loop from 0 upward until a number meets the criteria, as shown in the following diagram:
Figure 1.15 – PoW mining process
Once a miner finds the nonce, the results, including the previous block’s hash value, the collection of transactions, the Merkle root of all transactions in the block, and the nonce, are broadcast to the network for verification. Upon being notified, the other nodes in the network automatically check whether the results are valid. If the results are valid, they add the block to their copies of the blockchain, stop the mining work at hand, and move on to the next block.
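The nonce search and the verification step described above can be sketched in a few lines of Python. This is a toy illustration rather than Bitcoin's actual implementation: the header is a plain string instead of the real serialized block header, and the names block_hash, mine, verify, PREV_HASH, MERKLE_ROOT, and TARGET are assumptions made for the example. The target is deliberately easy so that the loop finishes quickly.

```python
import hashlib

def block_hash(prev_hash: str, merkle_root: str, nonce: int) -> int:
    """Hash a toy header and return the digest as an integer."""
    # Assumption: the header is just a concatenated string. Bitcoin hashes
    # a fixed binary header layout, applying SHA-256 twice.
    header = f"{prev_hash}{merkle_root}{nonce}".encode()
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(prev_hash: str, merkle_root: str, target: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the hash is at or below the target."""
    for nonce in range(max_nonce):
        if block_hash(prev_hash, merkle_root, nonce) <= target:
            return nonce
    return None  # nonce space exhausted without a solution

def verify(prev_hash: str, merkle_root: str, nonce: int, target: int) -> bool:
    """Other nodes need only a single hash to check the miner's claim."""
    return block_hash(prev_hash, merkle_root, nonce) <= target

# Hypothetical inputs: placeholder hashes and an easy target that requires
# the top 12 bits of the hash to be zero.
PREV_HASH = "00" * 32
MERKLE_ROOT = "ab" * 32
TARGET = (1 << (256 - 12)) - 1

nonce = mine(PREV_HASH, MERKLE_ROOT, TARGET)
print("nonce found:", nonce)
print("verified:", verify(PREV_HASH, MERKLE_ROOT, nonce, TARGET))
```

The asymmetry is the point of the design: finding the nonce takes many hash attempts, while checking it takes one.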
Targets and difficulty
A target is a blockchain-specific 256-bit number that the network sets up for all miners. The SHA-256 hash of a block’s header — the nonce plus the rest of the block header — must be lower than or equal to the current target for the block to be accepted by the network.
The difficulty of a cryptographic puzzle depends on the number of leading zeros in the target. The lower the target, the more difficult it is to generate a block. Each leading zero bit added to the target halves the chance that a given nonce produces a valid hash, so the difficulty of finding such a nonce grows exponentially with the number of leading zeros. The difficulty is decided by the blockchain network itself. The basic rule of thumb is to set the difficulty proportionally to the total hashing power on the network: if the total hashing power doubles, the difficulty also doubles. The difficulty is periodically adjusted to keep the block time around the target time. In Bitcoin, the target block time is 10 minutes, and the difficulty is adjusted every 2,016 blocks.
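The relationship between leading zero bits, the target, and the chance of success can be checked with a little arithmetic. The sketch below assumes hashes are uniformly distributed 256-bit numbers. The retarget function is a simplified proportional rule written for illustration, not Bitcoin's exact adjustment algorithm, and all function names are hypothetical.

```python
def target_for_zero_bits(zero_bits: int) -> int:
    """Largest 256-bit value whose top `zero_bits` bits are all zero."""
    return (1 << (256 - zero_bits)) - 1

def success_probability(target: int) -> float:
    """Chance that one uniformly random 256-bit hash is <= target."""
    return (target + 1) / 2**256

def expected_hashes(target: int) -> float:
    """Average number of attempts needed to find a valid hash."""
    return 2**256 / (target + 1)

def retarget(old_target: int, actual_seconds: int, expected_seconds: int) -> int:
    """Simplified rule: scale the target by how fast blocks actually arrived.

    If blocks arrived twice as fast as intended (hashing power doubled),
    the target halves, which doubles the difficulty.
    """
    return old_target * actual_seconds // expected_seconds

# Each extra leading zero bit halves the chance of success.
for bits in (8, 9, 10):
    t = target_for_zero_bits(bits)
    print(bits, success_probability(t), expected_hashes(t))

# Blocks arrived in 5 minutes on average instead of 10, so the target halves.
old = target_for_zero_bits(8)
new = retarget(old, actual_seconds=300, expected_seconds=600)
print(success_probability(new) / success_probability(old))
```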
Incentives and rewards
The winner of the cryptographic puzzle usually needs to expend huge amounts of energy and significant computing time to find the nonce and win the chance to create new blocks in the blockchain. The reward for such actions depends on the blockchain itself. In the Bitcoin blockchain, the winner is rewarded with Bitcoin, the cryptocurrency of the Bitcoin blockchain.
The PoW consensus is a simple yet reliable mechanism to maintain the state of the blockchain, and it is simple to implement. It is a lottery-based system in which every node can join the game of mining and compete for the rewards: more hashing power raises a miner's chance of winning a block but does not guarantee it. As of 2022, the winning miner is rewarded with 6.25 BTC for each block created in the Bitcoin blockchain; this reward halves every 210,000 blocks, roughly every four years.
Double-spend issues
Satoshi’s original intention in using a PoW mechanism was to solve double-spend issues and ensure the integrity of the global state of the Bitcoin blockchain network. Let’s say Alice sends 10 BTC to Bob, and at the same time or later on she pays Catherine the same 10 BTC. We could end up in one of the following three situations:
- The first transaction goes through the PoW and is added to the blockchain before the second transaction is submitted. In this case, the second one will be rejected when miners pull it from the transaction pool and validate it against all parent blocks.
- Both transactions are submitted simultaneously and both go into the unconfirmed pool of transactions. In this case, only the first transaction gets a confirmation and will be added in the next block. Her second transaction will not be confirmed as per validation rules.
- Both get confirmed and are added into competing blocks. This happens when miners take both transactions from the pool and put them into competing blocks. The competing blocks form a temporary fork on the blockchain. Whichever transaction gets into the longest chain will be considered valid and spent, and the other one within the block on the short chain will be recycled. When it is reprocessed, it will be rejected since it is already spent. In this case, it may take a few blocks to get the other one recognized as the double-spent one.
Double spending, where the same unit of digital currency is used in more than one transaction, was a fundamental problem for digital currencies before Bitcoin, which generally relied on a trusted central party to prevent it. Bitcoin’s solution to the double-spend problem without such a party paved the way for it to become a true digital currency.
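The validation rule behind the first two situations can be illustrated with a minimal sketch. It assumes a simplified model in which each transaction spends exactly one previously unspent coin, identified by a string. The select_valid function and the transaction fields are hypothetical names for this example, not Bitcoin's actual data structures.

```python
def select_valid(pending, unspent):
    """Accept transactions in order, rejecting any that reuse a spent coin."""
    available = set(unspent)  # coins that have not been spent yet
    accepted, rejected = [], []
    for tx in pending:
        if tx["input"] in available:
            available.remove(tx["input"])  # the coin is now spent
            accepted.append(tx)
        else:
            rejected.append(tx)  # double spend or unknown coin
    return accepted, rejected

# Alice tries to pay both Bob and Catherine with the same coin.
pool = [
    {"input": "alice-coin-1", "to": "Bob"},
    {"input": "alice-coin-1", "to": "Catherine"},
]
accepted, rejected = select_valid(pool, unspent={"alice-coin-1"})
print("accepted:", [tx["to"] for tx in accepted])
print("rejected:", [tx["to"] for tx in rejected])
```

Only the first spend of a coin survives validation. The third situation, where both spends land in competing blocks, is settled by the fork resolution rather than by this check alone.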
Advantages and disadvantages
The PoW algorithm has a few drawbacks, which stem from the economic cost of keeping the blockchain network safe:
- Energy consumption: PoW consensus, which uses a network of powerful computers to secure the network, is extremely expensive and energy-intensive. Miners need to use specialized hardware with high computing capacity in order to perform mining and get rewards. A large amount of electricity is required to run these mining nodes continuously. Some people also claim these cryptographic hash calculations are useless as they can’t produce any business value. At the end of 2018, the Bitcoin network across the globe used more power than Denmark.
- Vulnerability: PoW consensus is vulnerable to 51% attacks, which means, in theory, dishonest miners could gain a majority of hashing power and manipulate the blockchain to their advantage.
- Centralization: Winning a mining game requires specialized and expensive hardware, typically an ASIC type of machine. As expenses grow unmanageable, mining becomes possible only for a small number of sophisticated miners. The consequence is a gradual increase in the centralization of the system, as mining becomes a game for the rich.
On the flip side, taking over a PoW-based blockchain requires huge computing power and electricity. Therefore, PoW is perceived as an effective way to prevent Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS) attacks on the blockchain.
Proof-of-stake
As opposed to PoW consensus, where miners are rewarded for solving cryptographic puzzles, in the PoS consensus algorithm, a pool of selected validators take turns proposing new blocks. The validator is chosen pseudo-randomly, with a probability that depends on its wealth, also defined as its stake. Anyone who deposits their coins as a stake can become a validator. The chance of being selected may be proportional to the stake they put in. For example, if Alice, Bob, Catherine, and David stake 40 ether, 30 ether, 20 ether, and 10 ether respectively, they get a 40%, 30%, 20%, and 10% chance of being selected as the block creator.
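Stake-weighted selection can be simulated with Python's standard library. The sketch below uses the four stakes from the example and repeats the draw many times to show that the selection frequencies approach the stake proportions. The pick_proposer function is a hypothetical name, and real PoS protocols derive their randomness from the chain itself rather than from a local random number generator.

```python
import random
from collections import Counter

# Stakes from the example, in ether.
stakes = {"Alice": 40, "Bob": 30, "Catherine": 20, "David": 10}

def pick_proposer(stakes, rng):
    """Pick one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so that the run is repeatable
rounds = 10_000
counts = Counter(pick_proposer(stakes, rng) for _ in range(rounds))

for name in stakes:
    print(f"{name}: selected in {counts[name] / rounds:.1%} of rounds")
```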
The following is how it works in the PoS consensus mechanism:
Figure 1.16 – How PoS works
As shown in the preceding diagram, the blockchain keeps track of a set of validators. Depending on their roles in creating new blocks, sometimes the validator is also called block creator, builder, or proposer. At any time, whenever new blocks need to be created, the blockchain randomly selects a validator. The selected validator verifies the transactions and proposes new blocks for all validators to agree on. New blocks are then voted on by all current validators. Voting power is based on the stake the validator puts in. Whoever proposes invalid transactions, blocks, or votes maliciously, which means they intentionally compromise the integrity of the chain, may lose their stakes. Upon the new blocks being accepted, the block creator can collect the transaction fee as the reward for the work of creating new blocks.
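The voting and penalty steps can be sketched as follows. The text does not specify an acceptance threshold or a penalty size, so both are assumptions here: the block is accepted when validators holding at least two-thirds of the total stake vote for it, and a misbehaving validator loses half of its stake. The function names are hypothetical.

```python
def block_accepted(stakes, votes, num=2, den=3):
    """Accept a block when approving stake is at least num/den of the total.

    Assumption: a two-thirds threshold, compared with integer arithmetic
    to avoid floating-point rounding.
    """
    total = sum(stakes.values())
    in_favor = sum(stakes[v] for v, approve in votes.items() if approve)
    return in_favor * den >= total * num

def slash(stakes, validator, penalty_percent=50):
    """Return new stakes with a hypothetical penalty applied to one validator."""
    updated = dict(stakes)
    updated[validator] -= updated[validator] * penalty_percent // 100
    return updated

stakes = {"Alice": 40, "Bob": 30, "Catherine": 20, "David": 10}

# Alice and Bob hold 70 of 100 staked ether, which clears two-thirds.
votes = {"Alice": True, "Bob": True, "Catherine": False, "David": False}
print(block_accepted(stakes, votes))

# David voted maliciously and loses half of his stake.
print(slash(stakes, "David"))
```

Because voting power follows stake, a penalty reduces both the offender's funds and its future influence on the vote.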
PoS is considered more energy efficient and environmentally friendly than the PoW mechanism. It is also perceived as more secure. It reduces the threat of a 51% attack, since malicious validators would need to accumulate more than 50% of the total stakes in order to take over the blockchain network.
Similar to PoW, total decentralization may not be fully possible in the PoS-based public blockchain. This is because a few wealthy nodes can monopolize the stakes in the network. Those who put in more stakes can effectively control most of the voting. Both algorithms are subject to the socio-economic issue of making the rich richer.
PoS is becoming more popular, due to socio-economic concerns about PoW and its scalability limitations. Ethereum transitioned to PoS and decommissioned PoW in September 2022, in an upgrade known as the Merge that joined Ethereum 1.0 with Ethereum 2.0. We will discuss Ethereum 1.0 and 2.0 in more detail in the next chapter.
Forking
Earlier, we spoke about the temporary fork that occurs when two competing blocks are added to the blockchain. As shown in the following figure, this can continue until the majority of the nodes see the longest chain. Newer blocks will be appended to the longest chain. Blocks added to the shorter branch of the forked chain will be discarded, and their transactions will go back to the transaction pool to be picked up again for reprocessing. Eventually, the blockchain will comprise all conforming blocks, each chained to its ancestor by a cryptographic hash:
Figure 1.17 – Forking in a blockchain
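The resolution of a temporary fork can be sketched as a small function: keep the longest branch, and return any transactions that appear only in the discarded branches to the transaction pool. Blocks are modeled as dictionaries holding a list of transaction IDs. This structure and the resolve_fork name are assumptions for illustration, and ties between equally long branches are simply resolved in favor of the first one.

```python
def resolve_fork(branches, pool):
    """Keep the longest branch; requeue transactions from the others."""
    winner = max(branches, key=len)  # the longest chain wins
    confirmed = {tx for block in winner for tx in block["txs"]}
    for branch in branches:
        if branch is winner:
            continue
        for block in branch:
            for tx in block["txs"]:
                # Requeue only transactions the winning branch did not include.
                if tx not in confirmed and tx not in pool:
                    pool.append(tx)
    return winner

# Two competing branches built on the same parent block.
branch_a = [{"txs": ["t1"]}, {"txs": ["t3"]}]
branch_b = [{"txs": ["t2", "t1"]}]
pool = []

winner = resolve_fork([branch_a, branch_b], pool)
print("winning branch length:", len(winner))
print("back in the pool:", pool)
```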
Just like software development, forking is a common practice in blockchain. Forking takes place when a blockchain bifurcates into two separate paths. The following events, intentionally or accidentally, can trigger a blockchain fork:
- New features are added, requiring a change in blockchain protocol, such as block size, mining algorithm, and consensus rules
- Hacking or software bugs
- Competing blocks are created at the same block height, which causes a temporary fork
A general forking scenario in a blockchain may look like the following figure:
Figure 1.18 – Competing blocks during forking
Depending on the nature of such events, the actions to fix the issues could be a hard fork or a soft fork or, in the case of a temporary fork, doing nothing and allowing the network to self-heal.
Hard fork
A hard fork happens when radical changes are introduced in the blockchain protocol that make historical blocks non-conformant with the new protocol or rules. Some hard forks are planned: developers and operators agree on the protocol changes and upgrade to the new software. Blocks following the old protocol will be rejected, and blocks following the new protocol will become the longest chain moving forward.
But, in some cases, this is controversial and heavily debated in the blockchain community, as was the case with the Bitcoin fork on 6 August 2010 or the fork between Ethereum and Ethereum Classic. In such contentious hard fork cases, as long as miners continue to maintain both the old and new software, the blocks created by the old and new software will diverge into separate blockchains.
The following figure illustrates both planned and contentious hard forks:
Figure 1.19 – Hard forks
During a contentious hard fork of blockchain, a new cryptocurrency will be created to fuel the new blockchain. The owner of the existing crypto-assets may stay in the current network or move to the new network. When moving to the new network, they will receive a proportional amount of new cryptocurrency in the new network. Hard forks often create pricing volatility. The conversion rate between the old and new fork may be determined by the market. It is important to know the context and details of a hard fork and understand the crypto-economic impacts of such a fork to both cryptocurrencies in order to take advantage of such sudden and drastic changes.
Once forked, nodes will start with separate paths moving forward. Nodes would need to decide which blockchain network they want to stay in. For example, Bitcoin Cash diverged from Bitcoin due to a disagreement within the Bitcoin community as to how to handle the scalability problem. As a result, Bitcoin Cash became its own chain and shares the transaction history from the genesis block up to the forking point. As of May 23, 2022, Bitcoin Cash’s market cap is around $3.67 billion, ranking twenty-fourth, versus Bitcoin’s $556 billion.
Soft fork
A soft fork, by contrast, is any change of rules that is backward-compatible between two versions of the software and the blocks. It goes both ways. In the soft fork case, existing historical blocks are still considered valid by the new software. At the same time, new blocks created by the new software are still recognized as valid by the old software. In a decentralized network, not all nodes upgrade their software at the same time. Nodes staying with an older version of the blockchain software continue creating new blocks using the older software. Nodes upgraded to the newer version create new blocks using the new software. Eventually, when the majority of the network hashing capacity has upgraded to the newer version, in theory more blocks will be created with the newer version, making it the longest chain. Nodes with older software can still create new blocks, but since those blocks are not in the longest chain, as illustrated in the following figure, they will soon be overtaken by the new chain, similar to the temporary fork case:
Figure 1.20 – Soft fork in progress
Where more nodes are stuck on the older version, as illustrated in the following figure, the chain of blocks created by the older version of the blockchain software may keep growing longer, and it will take a while for the new software to take effect:
Figure 1.21 – Soft fork at the end
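The backward-compatibility difference between a soft fork and a hard fork can be illustrated with a single validation rule. The sketch below uses a maximum block size as the rule. The specific limits are hypothetical values chosen for the example, and the function names are assumptions. Tightening the rule stands in for a soft fork, because every block the new software produces still passes the old rule. Loosening it stands in for a hard fork, because the new software can produce blocks that the old software rejects.

```python
OLD_MAX_SIZE = 1_000_000   # hypothetical limit enforced by the old software
SOFT_MAX_SIZE = 500_000    # tightened rule (soft fork)
HARD_MAX_SIZE = 2_000_000  # loosened rule (hard fork)

def is_valid(max_size, block_size):
    """A block is valid when it does not exceed the size limit."""
    return block_size <= max_size

def old_nodes_accept_all(new_max, old_max, sample_sizes):
    """True if every block valid under the new rule also passes the old rule."""
    return all(
        is_valid(old_max, size)
        for size in sample_sizes
        if is_valid(new_max, size)
    )

sample_sizes = [100_000, 400_000, 900_000, 1_500_000]

print("soft fork accepted by old nodes:",
      old_nodes_accept_all(SOFT_MAX_SIZE, OLD_MAX_SIZE, sample_sizes))
print("hard fork accepted by old nodes:",
      old_nodes_accept_all(HARD_MAX_SIZE, OLD_MAX_SIZE, sample_sizes))
```

In the hard fork case, the 1,500,000-byte block is valid for upgraded nodes but rejected by old ones, which is the point at which the two groups begin building separate chains.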
So far, you have learned how PoW and PoS work. We have analyzed the advantages and disadvantages of different consensus mechanisms. In the next section, we will help you understand what Bitcoin and cryptocurrency are and discuss how blockchain technology applies to Bitcoin.