Understanding cryptography
Cryptography is the science of converting plain, readable text into secret, unintelligible text, and vice versa. It helps in transmitting and storing data in a form that cannot be easily deciphered without the right keys.
There are two types of cryptography in computing:
- Symmetric cryptography: This refers to the process of using a single key for both encryption and decryption. It means the same key must be available to everyone who wants to exchange messages using this form of cryptography.
- Asymmetric cryptography: This refers to the process of using a pair of keys, a public key and a private key, for encryption and decryption. Either key of the pair can be used to encrypt, and the other key is then used to decrypt. Messages encrypted with a public key can be decrypted using the matching private key, and messages encrypted with a private key can be decrypted using the matching public key. Let's understand this with the help of an example. Tom uses Alice's public key to encrypt a message and sends it to Alice. Alice can use her private key to decrypt the message and extract its content. Messages encrypted with Alice's public key can only be decrypted by Alice, as only she holds her private key and no one else. This is the general use case of asymmetric keys. There is another use that we will see in the Digital signatures section.
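The exchange between Tom and Alice can be sketched in a few lines of Python. This is a toy version of textbook RSA, chosen only because it needs nothing beyond the standard library; the primes, the exponent, and the helper names make_keypair and apply_key are assumptions made for illustration, and the scheme has no padding and is not secure.

```python
# Toy textbook RSA, for illustration only: no padding, small key, not secure.
P = 2**61 - 1  # a known Mersenne prime
Q = 2**89 - 1  # another known Mersenne prime


def make_keypair(e=65537):
    """Return (public_key, private_key), each an (exponent, modulus) pair."""
    n = P * Q
    phi = (P - 1) * (Q - 1)
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later
    return (e, n), (d, n)


def apply_key(key, number):
    """Raise number to the key's exponent modulo n."""
    exponent, n = key
    return pow(number, exponent, n)


alice_public, alice_private = make_keypair()

# Tom encrypts with Alice's public key.
message = b"Hi Alice"
plaintext = int.from_bytes(message, "big")  # must be smaller than n
ciphertext = apply_key(alice_public, plaintext)

# Only the holder of Alice's private key can recover the message.
recovered = apply_key(alice_private, ciphertext)
recovered_message = recovered.to_bytes(len(message), "big")
print(recovered_message)
```

The same apply_key function serves both directions, which mirrors the point that one key of the pair undoes what the other has done.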
Hashing
Hashing is the process of transforming any input data into fixed-length data that looks random. The same input always produces the same hash, which is why hashes are also known as fingerprints of input data. It is next to impossible to regenerate or derive the input data from its hash value. Hashing also ensures that even a slight change in input data completely changes the output, so no one can infer from the hash what changed in the original data.
Another important property of hashing is that no matter the size of the input data, the length of its output is always fixed. For example, the SHA-256 hashing algorithm always generates 256-bit output, whatever the length of the input. This is especially useful when a large amount of data needs to be represented by a compact 256-bit fingerprint. Ethereum uses the hashing technique quite extensively. It hashes every transaction, hashes the hashes of two transactions at a time, and ultimately generates a single root transaction hash for all the transactions within a block.
A further important property of hashing is that it is not computationally feasible to find two different input strings that produce the same hash. Similarly, it is not feasible to work backward from the hash itself to the input.
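These properties are easy to observe with Python's standard hashlib module. The following is a minimal sketch that uses SHA-256, the algorithm mentioned above; the helper name sha256_hex is an assumption made for this example. Keccak256, which Ethereum uses, is not part of hashlib, and hashlib's sha3_256 is the standardized SHA-3 variant, which produces different digests.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original_digest = sha256_hex("Ritesh Modi")
changed_digest = sha256_hex("Ritesh Modi.")            # one extra character
large_digest = sha256_hex("Ritesh Modi" * 10_000)      # much larger input

for label, digest in [
    ("original", original_digest),
    ("changed", changed_digest),
    ("large", large_digest),
]:
    # Each digest is 64 hex characters (256 bits), whatever the input size.
    print(f"{label:9} {len(digest) * 4} bits  {digest}")
```

Running it prints three unrelated-looking digests of identical length, and hashing the same input again yields the same digest.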
Ethereum uses Keccak256 as its hashing algorithm. The following screenshot shows an example of hashing. The Ritesh Modi input generates a hash, as shown in the following screenshot:
Even a small modification of input generates a completely different hash, as shown in the following screenshot:
Digital signatures
Earlier, we discussed cryptography using asymmetric keys. One of the important uses for asymmetric keys is in the creation and verification of a digital signature. Digital signatures are very similar to a signature done by an individual on a piece of paper. Similar to a paper signature, a digital signature helps in identifying an individual. It also helps in ensuring that messages are not tampered with while in transit. Let's understand digital signatures with the help of an example.
Alice wants to send a message to Tom. How can Tom identify and ensure that the message has come from Alice only and that the message has not been changed or tampered with in transit? Instead of sending only the raw message/transaction, Alice creates a hash of the entire payload and encrypts the hash with her private key. She appends the resultant digital signature to the message and transmits both to Tom. When the transaction reaches Tom, he extracts the digital signature and decrypts it using Alice's public key to find the original hash. He also computes the hash of the rest of the message himself and compares both the hashes. If the hashes match, it means that the message actually originated from Alice and that it has not been tampered with.
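The sign-and-verify flow can be sketched as follows. This is only an illustration under stated assumptions: it uses toy textbook RSA with SHA-256 so that it runs on the standard library alone, and the names sign, verify, and digest_as_int are made up for the example. Ethereum itself signs transactions with an elliptic curve scheme (ECDSA) rather than RSA.

```python
import hashlib

# Toy RSA key pair for Alice (illustrative values, not secure).
P = 2**61 - 1  # a known Mersenne prime
Q = 2**89 - 1  # another known Mersenne prime
N = P * Q
PUBLIC_EXPONENT = 65537
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))  # Python 3.8+


def digest_as_int(payload):
    """Hash the payload and reduce it so that it fits the toy modulus."""
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest, "big") % N


def sign(payload):
    """Alice: encrypt the payload's hash with her private key."""
    return pow(digest_as_int(payload), PRIVATE_EXPONENT, N)


def verify(payload, signature):
    """Tom: decrypt the signature with Alice's public key and compare hashes."""
    return pow(signature, PUBLIC_EXPONENT, N) == digest_as_int(payload)


payload = b"Alice pays Tom 1 Ether"
signature = sign(payload)

print(verify(payload, signature))                     # message as sent
print(verify(b"Alice pays Tom 9 Ether", signature))   # message altered in transit
```

Verification succeeds for the payload Alice signed and fails once a single character of the payload is altered, because Tom's recomputed hash no longer matches the one recovered from the signature.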
Digital signatures are used to sign transaction data by the owner of the asset or cryptocurrency, such as Ether. With a basic understanding of cryptography, it's time to introduce Ethereum and blockchain at a high level.
Reviewing blockchain and Ethereum architecture
Blockchain is an architecture comprising multiple components, and what makes blockchain unique is the way these components function and interact with each other. Ethereum allows you to extend its functionality with the help of smart contracts. (Smart contracts will be addressed in detail throughout this book.)
Some of the important Ethereum components are the Ethereum Virtual Machine (EVM), miner, block, transaction, consensus algorithm, account, smart contract, mining, Ether, and gas. We are going to discuss each of these components in this chapter.
A blockchain network consists of multiple nodes belonging to miners and some nodes that do not mine but help in the execution of smart contracts and transactions. These are known as EVMs. Each node is connected to another node on the network. These nodes use a peer-to-peer protocol to talk to each other. They, by default, use port 30303 to talk among themselves.
Each miner maintains an instance of a ledger. A ledger contains all blocks in the chain. With multiple miners, it is quite possible that one miner's ledger instance holds different blocks from another's. The miners synchronize their blocks on an ongoing basis to ensure that every miner's ledger instance is the same as the others. Ledgers, blocks, and transactions are discussed in detail in subsequent sections in this chapter.
The EVM executes smart contracts and helps bring about changes to the global state. Smart contracts help in extending Ethereum by writing custom business functionality into it. These smart contracts can be executed as part of a transaction, and they follow the process of mining as discussed earlier.
A person with an account on a network can send a message for the transfer of Ether from their account to another or can send a message to invoke a function within a contract. Ethereum does not distinguish between them as far as transactions are concerned. The transaction must be digitally signed with the account holder's private key. This is to ensure that the identity of the sender can be established while verifying the transaction and changing the balances of multiple accounts. Let's take a look at the components of Ethereum in the following diagram:
The previous diagram illustrates some of the important components in Ethereum. The externally owned accounts are responsible for initiating transactions on Ethereum. The transactions that are executed within the Ethereum nodes are finally written as blocks on the blockchain. These blocks have header sections that help in chaining the blocks.
Relationship between blocks
In blockchain and Ethereum, every block is related to another block. There is a parent-child relationship between two blocks. A parent can have only one child, and a child can have only one parent. Blocks will be explained in a later section in this chapter. This relationship helps in forming a chain in blockchain, as shown in the following diagram:
In this diagram, we can see three blocks apart from the Genesis Block – Block 1, Block 2, and Block 3. Block 1 is the parent of Block 2, and Block 2 is the parent of Block 3. The relationship is established by storing the parent block's hash in a child's block header. Block 2 stores the hash of Block 1 in its header and Block 3 stores the hash of Block 2 in its header. So, the question arises – who is the parent of the first block? Ethereum has a concept of the genesis block, also known as the first block. This block is created automatically when the chain is first initiated. You can say that a chain is initiated with the first block, the genesis block, and the formation of this block is driven through the genesis.json file.
The next chapter will show you how to use the genesis.json file to create the first block while initializing the blockchain.
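The parent-child relationship described in this section can be modeled with a small sketch. The Block class, its fields, and the use of SHA-256 over a JSON encoding are simplifying assumptions made for illustration; real Ethereum block headers carry many more fields and are hashed with Keccak256. The all-zero parent hash of the genesis block is likewise a placeholder.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class Block:
    number: int
    parent_hash: str  # the link to the parent, kept in the child's header
    data: str

    def block_hash(self):
        """Hash every field, so the parent hash is part of this block's hash."""
        encoded = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads):
    """Start from a genesis block and append one child per payload."""
    chain = [Block(number=0, parent_hash="0" * 64, data="genesis")]
    for payload in payloads:
        parent = chain[-1]
        chain.append(Block(parent.number + 1, parent.block_hash(), payload))
    return chain


def is_linked(chain):
    """Check that every child stores the current hash of its parent."""
    return all(
        child.parent_hash == parent.block_hash()
        for parent, child in zip(chain, chain[1:])
    )


chain = build_chain(["Block 1", "Block 2", "Block 3"])
print(is_linked(chain))       # links are intact

chain[1].data = "tampered"    # alter Block 1 after the fact
print(is_linked(chain))       # Block 2 no longer points at Block 1's hash
```

Because each block's hash covers the parent hash it stores, altering any earlier block breaks the link held by its child.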
How transactions and blocks are related to each other
Now that we know that blocks are related to each other, you will be interested in knowing how transactions are related to blocks. Ethereum stores transactions within blocks. Each block has an upper gas limit, and each transaction consumes a certain amount of gas as part of its execution. The cumulative gas of the pending transactions (those not yet written to the ledger) that are placed in a block cannot surpass the block gas limit. This ensures that all pending transactions do not get stored within a single block. As soon as the gas limit is reached, the remaining transactions are left out of the block and mining begins thereafter. The gas concept will be covered in a subsequent section in this chapter. This section should be revisited after reading about gas.
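The way a block fills up can be sketched as a simple loop over pending transactions. The PendingTx class, the fill_block function, and the gas figures below are hypothetical values chosen for illustration; a real client also weighs gas prices and other factors when selecting transactions.

```python
from dataclasses import dataclass


@dataclass
class PendingTx:
    sender: str
    gas: int  # gas this transaction needs for its execution


def fill_block(pending, block_gas_limit):
    """Take transactions in order until the next one would exceed the limit.

    Returns (included, left_out, gas_used).
    """
    included = []
    gas_used = 0
    for index, tx in enumerate(pending):
        if gas_used + tx.gas > block_gas_limit:
            # Limit reached: everything from here on waits for a later block.
            return included, pending[index:], gas_used
        included.append(tx)
        gas_used += tx.gas
    return included, [], gas_used


pending = [
    PendingTx("alice", 21_000),
    PendingTx("bob", 50_000),
    PendingTx("carol", 40_000),
    PendingTx("dave", 21_000),
]

included, left_out, gas_used = fill_block(pending, block_gas_limit=100_000)
print([tx.sender for tx in included])   # ['alice', 'bob']
print([tx.sender for tx in left_out])   # ['carol', 'dave']
print(gas_used)                         # 71000
```

Stopping at the first transaction that does not fit is a design choice made to mirror the description above; an alternative would be to skip it and keep looking for smaller transactions that still fit.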
The transactions are hashed and stored in the block. The hashes of two transactions are taken and hashed further to generate another hash. This process eventually provides a single hash from all transactions stored within the block. This hash is known as the transaction Merkle root hash and is stored in a block's header.
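The pairwise hashing just described can be sketched as a plain binary Merkle tree. This is a simplified model under stated assumptions: it uses SHA-256 from the standard library, duplicates the last hash when a level has an odd number of entries, and the name merkle_root is made up for the example. Ethereum's actual transaction root is built with Keccak256 over a more elaborate trie structure.

```python
import hashlib


def sha256(data):
    """Return the raw SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions):
    """Hash each transaction, then hash pairs until one hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd one out
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


transactions = [b"tx-1", b"tx-2", b"tx-3", b"tx-4"]
root = merkle_root(transactions)
print(root)

# Changing any single transaction yields a different root.
altered_root = merkle_root([b"tx-1", b"tx-2", b"tx-3 (altered)", b"tx-4"])
print(altered_root != root)
```

However many transactions go in, the result is a single 256-bit value that can be stored in the block's header.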
A change in any transaction will result in a change in its hash and, eventually, a change in the root transaction hash. It will have a cumulative effect because the hash of the block will change, and the child block has to change its hash because it stores its parent hash. This helps in making transactions immutable. This is also shown in the following diagram:
Blocks and transactions are core to blockchain, but the question remains about the process of adding them to the chain. A generated block should be agreed upon and acceptable to all the nodes within a network. The process of coming to an agreement on a block and subsequently either adding it to the chain or rejecting it is based on the process known as consensus.