An introduction to cryptography
One of the most important things in cryptography is to understand definitions and notations. I have never been a fan of definitions and notations, first of all, because I am the only one to use notations that I’ve invented. But I realize that it is very important, especially when we are talking about something related to mathematics, to agree among ourselves. Thus, in this section, I will introduce basic information and citations relating to cryptography.
We’ll start with a definition of an algorithm.
In mathematics and computer science, an algorithm is a finite sequence of well-defined computer-implementable instructions.
Here is an important question: what is a cipher?
A cipher is a system of any type able to transform plaintext (a message) into unintelligible text (a ciphertext or cryptogram):
Figure 1.1: Encryption process
To get some utility from a cipher, we have to set up two operations: encryption and decryption. In simpler terms, we have to keep the message secret and safe for a certain period of time.
We define M as the set of all the messages and C as the set of all the cryptograms.
Encryption is an operation that transforms a generic message, m, into a cryptogram, c, applying a function, E:
E(m) = c
Decryption is an operation that returns the message in cleartext, m, from the cryptogram, c, applying a function, D:
D(c) = m
Mathematically, D(E(m)) = m.
This means that the E and D functions are the inverse of each other, and the E function has to be injective. Injective means that different messages in M have to correspond to different cryptograms in C; otherwise, decryption would be ambiguous.
Note that it doesn’t matter whether I use capital letters or lowercase, such as (M) or (m); it’s inconsequential at the moment. For the moment, I have used round brackets indiscriminately, but later I will use square brackets to distinguish secret elements of a function from known ones, for which I will use round brackets. So, the secret message M will be written as [M], just like any other secret parameter. Here, just showing how the algorithms work is within our scope; we’ll leave their implementation to engineers.
There is another important notation that is key to encryption/decryption. To encrypt and decrypt a message, it is necessary to set up a key. In cryptography, a key is a parameter that determines the functional output of a cryptographic algorithm or cipher. Without a key, the algorithm would produce no useful results.
We define K as the set of all the keys used to encrypt and decrypt M, and k as the single encryption or decryption key, also called the session key. Both notations (K for a set of keys and k for a single key) will be used throughout, always specifying what kind of key it is (private or public).
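As a toy illustration of this notation (not a secure cipher), the following Python sketch assumes that M and C are strings over the lowercase alphabet and that K is the set of integer shifts from 0 to 25. The names encrypt, decrypt, and ALPHABET are chosen here for illustration only and are not part of the book’s notation.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed toy alphabet for M and C


def encrypt(m: str, k: int) -> str:
    """E: shift every lowercase letter of m forward by k positions."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in m
    )


def decrypt(c: str, k: int) -> str:
    """D: undo E by shifting every lowercase letter back by k positions."""
    return encrypt(c, -k)


k = 3                      # a single key k taken from the key set K
m = "meet you tomorrow"    # a message m taken from M
c = encrypt(m, k)          # the cryptogram c
print(c)                   # phhw brx wrpruurz
print(decrypt(c, k) == m)  # True, that is, D(E(m)) = m

# Injectivity on a small sample: distinct messages give distinct cryptograms.
sample = ["abc", "abd", "xyz"]
print(len({encrypt(x, k) for x in sample}) == len(sample))  # True
```

Because the shift is reversible for every key, E is injective here, and decrypting with the same k always returns the original message.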
Now that we understand the main concepts of cryptographic notation, it is time to explain the difference between private and public keys:
- In cryptography, a private or secret key (Kpr), denoted as [K] or [k], is an encryption/decryption parameter known only to one, both, or multiple parties in order to exchange secret messages.
- In cryptography, a public key (Kpu) or (K) is an encryption key known by everyone who wants to send a secret message or authenticate a user.
So, what is the main difference between private and public keys?
The difference is that a private key can be used to both encrypt and decrypt a message, while a public key is used only to encrypt a message and to verify the identity (digital signatures) of humans and computers. This is a substantial and very important distinction because it determines the difference between symmetric and asymmetric encryption.
Let’s give a generic definition of these two methods of encryption:
- Symmetric encryption uses only one shared key to both encrypt and decrypt the message.
- Asymmetric encryption uses a pair of keys: a public key, generated from several parameters, to encrypt the message, and a single private key to decrypt it.
As we will see later on, private keys are used in symmetric encryption to encrypt/decrypt the message with the same key and in asymmetric encryption in a general way for decryption, whereas public keys are used only in asymmetric encryption to encrypt the message and to perform digital signatures. You will see the function of these two types of keys later, but for now, keep in mind that a private key is used both in symmetric and asymmetric encryption, while a public key is used only for asymmetric encryption. Note that it’s not my intention to discuss academic definitions and notation, so please try to figure out the scope and the use of each element.
One of the main problems in cryptography is the transmission of the key, or the key exchange. This problem sparked heated debates in the community of mathematicians and cryptographers because it was very hard to determine how to transmit a key without physically exchanging it.
For example, if Alice and Bob wanted to exchange a key (before the advent of asymmetric encryption), the only trusted way to do that was to meet physically in one place. This condition caused a lot of problems with the massive adoption of telecommunication systems and the internet. The first problem was that internet communication relies on data exchange over unsafe channels. As you can easily understand, if Alice communicates with Bob through an insecure public communication channel, the private key has a severe possibility of being compromised, which is extremely dangerous for the security and privacy of communications.
For this reason, this question arises: if we use a symmetric cipher to protect our secret information, how can we securely exchange the secret key?
A simple answer is the following: we have to provide a secure channel of communication to exchange the key.
Someone could then reply: how do we provide a secure channel?
We will find the answer, or rather multiple answers, later on in this book. Even in tough military applications, such as the legendary red line between the leaders of the US and USSR during the Cold War, symmetric communication keys were used; nowadays, it is common to use asymmetric encryption to exchange a key. Once the key has been exchanged, the rest of the communication session uses symmetric encryption to protect the messages transmitted.
For many reasons, asymmetric encryption is a good way to exchange a key and is good for authentication and digital signatures. Computationally, symmetric encryption is better because it can work with shorter keys, saving a lot of bandwidth and time. So, in general, its algorithms work efficiently and securely using keys of 256-512 bits compared to the 4,000+ bits of asymmetric RSA encryption, for example. I will explain in detail why and how that is possible later, during the analysis of the algorithms in asymmetric/symmetric encryption.
In this book, I will analyze many kinds of cryptographic techniques but, essentially, we can group all the algorithms into two big families: symmetric and asymmetric encryption.
We need some more definitions to understand cryptography:
- Plaintext: In cryptography, this indicates unencrypted text or everything that could be exposed in public. For example, (meet you tomorrow at 10 am) is plaintext.
- Ciphertext: In cryptography, this indicates the result of the text after having performed the encryption procedure. For example, meet you tomorrow at 10 am could become [x549559*ehebibcm3494] in ciphertext.
- As I mentioned before, I use different brackets to identify plaintext and ciphertext. In particular, these brackets (…) identify plaintext, while square brackets […] identify ciphertext. So, this is the secret message that was mentioned earlier: [x549559*ehebibcm3494].
Binary numbers, ASCII code, and notations
When we manipulate data with computers, it is common to represent data as strings of 0s and 1s, called bits. So, numbers can be written in base 2 (bits) rather than in base 10, as in our everyday numeric system. Let’s have a look at how the conversion mechanism works. For example, the number 123 can be written in base 10 as:
123 = 1·10^2 + 2·10^1 + 3·10^0
Likewise, we can convert a base 10 number to a base 2 number. In this case, we use the example of the number 29:
Figure 1.2: Conversion of the number 29 into base 2 (bits)
The remainder of a division is very popular in cryptography because modular mathematics is based on the concept of remainders. We will go deeper into this topic in the next section, when I explain prime numbers and modular mathematics.
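The conversion of 29 into base 2 can be sketched in Python as a sequence of divisions by 2, collecting the remainders. This is only an illustrative sketch; the function name to_binary is an assumption made here, and Python’s built-in bin function would give the same digits.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to base 2 by repeated division by 2."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder of n / 2
        bits.append(str(remainder))
    # The remainders come out least significant bit first, so reverse them.
    return "".join(reversed(bits))


print(to_binary(29))  # 11101
```

Reading the result back, 11101 in base 2 is 16 + 8 + 4 + 0 + 1 = 29.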
To transform letters into a binary system to be encoded by computers, the American Standards Association invented the ASCII code in 1960.
From the ASCII website, we have the following definition:
“ASCII stands for American Standard Code for Information Interchange. It’s a 7-bit character code where every single bit represents a unique character.”
The following is an example of an ASCII code table with the first 10 characters:
Figure 1.3: The first 10 characters and symbols expressed in ASCII code
In Chapter 4, Hash Functions and Digital Signatures, we will learn about hex and octal. The key detail at this moment is to observe the binary system.
Note that, in my implementations, made with the Wolfram Mathematica research software, I will often use the character 88 as X to denote the message number to encrypt. In ASCII code, the number 88 corresponds to the symbol X, as you can see in the following example:
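A minimal sketch of this correspondence, written in Python rather than Mathematica purely for illustration, uses the built-in ord and chr functions; the variable names code and bits are assumptions made here.

```python
code = ord("X")             # ASCII code of the character X
bits = format(code, "07b")  # the same code written as 7 bits

print(code)      # 88
print(chr(88))   # X
print(bits)      # 1011000
```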
You can go to the Appendix section at the end of the book to find all the notation used in this book, both for the algorithms and their implementation with Mathematica code.
Fermat’s Last Theorem, prime numbers, and modular mathematics
When we talk about cryptography, we have to always keep in mind that this subject is essentially related to mathematics and logic. Before I start explaining Fermat’s Last Theorem, I want to introduce some basic notation that will be used throughout the book to prevent confusion and for a better understanding of the topic. It’s important to know that some symbols, such as =, ≡ (equivalent), and := (this last one you can find in Mathematica to compute =), are just a way to tell you that two elements correspond to each other in equal measure; it doesn’t matter whether it is in a finite field (don’t worry, you will become familiar with this terminology), computer science, or in regular algebra. Mathematicians may be horrified by this, but I trust your intelligence and that you will look for the substance and not for uniformity.
Another symbol, ≈ (approximately equal), denotes elements that are only approximately the same.
When needed, you will also encounter the ^ (exponent) symbol as a classical way to express exponentiation: a^x (a raised to the power x).
The ≠ symbol, as you should remember from high school, means not equal or unequal; its counterpart in modular mathematics is ≢, that is, not equivalent.
However, you will always get an explanation of the equations, so if you are not very familiar with mathematical and logical notation, you can rely on the descriptions. Anyway, I will explain each case as we come across a new notation.
Finally, as you should know, a prime number is an integer greater than 1 that can only be divided by itself and 1, for example, 2, 3, 5, 7, …, 23, …, 67, …, p.
Prime numbers are the cornerstones of mathematics because all composite numbers originate from them as products of primes. You will see that using a composite number instead of a prime number in cryptography could cause a serious lack of security or open the door to an attack (see Chapter 5, Zero-Knowledge Protocols).
Now, let’s see what Fermat’s Last Theorem is, where it is applied, and why it is useful for us.
Fermat’s Last Theorem is one of the best and most beautiful theorems of classical mathematics strictly related to prime numbers, which are foundational to cryptography. According to Wikipedia:
“In number theory, Fermat’s Last Theorem (sometimes called Fermat’s conjecture, especially in older texts) states that no three positive integers a, b, and c satisfy the equation a^n + b^n = c^n for any integer value of n greater than 2. The cases n = 1 and n = 2 have been known since antiquity to have infinitely many solutions.”
In other words, it tells us that, for any exponent n ≥ 3, there are no positive integers a, b, and c that satisfy the following equation:
a^n + b^n = c^n
Why is this theorem so important to us? This is something that you will come to appreciate over the course of this book as we encounter more algorithms. Essentially, it’s because Fermat’s Last Theorem is strictly related to prime numbers. In fact, because every exponent n ≥ 3 is divisible either by 4 or by an odd prime, in order to demonstrate Fermat’s Last Theorem, it’s sufficient to demonstrate the following:
a^p + b^p ≠ c^p
Here, p is any prime number greater than 2 (the remaining case, n = 4, was proved by Fermat himself).
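To get a feeling for the statement, the following Python sketch searches a small range for solutions. It is only a numerical illustration under an assumed bound (the parameter limit) and proves nothing about the theorem; the function name solutions is hypothetical.

```python
from itertools import combinations_with_replacement


def solutions(n: int, limit: int) -> list:
    """Return all (a, b, c) with 1 <= a <= b <= limit and a^n + b^n = c^n."""
    # Since a^n + b^n <= 2 * limit^n, any valid c is smaller than 2 * limit.
    powers = {c**n: c for c in range(1, 2 * limit)}
    found = []
    for a, b in combinations_with_replacement(range(1, limit + 1), 2):
        c = powers.get(a**n + b**n)
        if c is not None:
            found.append((a, b, c))
    return found


print(solutions(2, 20)[:2])  # [(3, 4, 5), (5, 12, 13)]
print(solutions(3, 50))      # []
print(solutions(5, 50))      # []
```

For n = 2, the search immediately finds Pythagorean triples, while for the odd primes 3 and 5 it finds nothing within the chosen bound, in line with the theorem.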
Fermat himself noted in the margin of his copy of Diophantus’ Arithmetica that he had a beautiful demonstration of the theorem that was too large to fit in that margin, but it has never been found.
The theorem was finally proved by Andrew Wiles. Wiles’ proof was more than 200 pages long, reduced to about 130 in the last version, and is immensely difficult to understand. The proof is based on elliptic curves: these curves take a particular form when they are represented in a modular form. Wiles arrived at his conclusion after 7 years and explained his proof at a mathematicians’ congress in 1994. You will discover part of the logic used in Wiles’ proof when you read Chapter 7, Elliptic Curves. Right now, we just assume that, to demonstrate Fermat’s Last Theorem, Wiles needed to rely on the Taniyama-Shimura conjecture, which states that elliptic curves over the field of rational numbers are related to modular forms. Again, don’t worry if this seems too complicated; eventually, as we progress, it will start making sense.
We will deeply analyze Fermat’s Last Theorem in Chapter 6, New Inventions in Cryptography and Logical Attacks, when I introduce the MB09 algorithm based on Fermat’s Last Theorem, among other innovative algorithms in public/private keys. Moreover, we will analyze the elliptic curves applied in cryptography in Chapter 7, Elliptic Curves.
Fermat was obsessed with prime numbers, just like many other mathematicians; he searched for prime numbers and their properties throughout his life. He tried to find a general formula to represent all the primes in the universe, but unluckily, Fermat, just like many other mathematicians, only managed to construct a formula for some of them. The following is Fermat’s prime numbers formula, where n is some positive integer:
p = 2^(2^n) + 1
If we substitute n with integers, we can obtain some prime numbers:
- n = 1, p = 5
- n = 2, p = 17
- n = 3, p = 257
- n = 4, p = 65,537
- n = 5, p = 4,294,967,297 = 641 × 6,700,417 (not prime)
Probably more famous but very similar is the Mersenne prime numbers formula, again where n is some positive integer:
p = 2^n - 1
This yields the following results:
- n = 1, p = 1 (not prime)
- n = 2, p = 3
- n = 3, p = 7
- n = 4, p = 15 (not prime)
- n = 5, p = 31
Despite countless attempts to find a formula that exclusively represents all prime numbers, nobody has reached this goal as of yet.
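The Mersenne values listed above can be reproduced with a short Python sketch. It assumes the Mersenne form 2^n - 1 and uses a naive trial-division check, which is fine for tiny numbers but useless at the sizes relevant to cryptography; the names is_prime and mersenne are chosen here for illustration.

```python
def is_prime(n: int) -> bool:
    """Naive primality check by trial division (small numbers only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne(n: int) -> int:
    """Return the n-th number of the assumed Mersenne form 2^n - 1."""
    return 2**n - 1


for n in range(1, 6):
    p = mersenne(n)
    print(n, p, "prime" if is_prime(p) else "not prime")
# 1 1 not prime
# 2 3 prime
# 3 7 prime
# 4 15 not prime
# 5 31 prime
```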
The Great Internet Mersenne Prime Search (GIMPS) is a research project that aims to discover the newest and biggest prime numbers with Mersenne’s formula.
If you explore the GIMPS website, you can discover the following:
- All exponents below 53,423,543 have been tested and verified.
- All exponents below 92,111,363 have been tested at least once.
- The 51st Mersenne prime has been found!
- December 21, 2018: The Great Internet Mersenne Prime Search (GIMPS) discovered the largest known prime number, 2^82,589,933 - 1, having 24,862,048 digits. A computer volunteered by Patrick Laroche from Ocala, Florida made the find on December 7, 2018. The new prime number, also known as M82589933, is calculated by multiplying together 82,589,933 twos and then subtracting one. It is more than one and a half million digits larger than the previous record prime number.
Besides that, GIMPS is probably the first decentralized example of how to split CPU and computer power to reach a common goal. But why all this interest in finding big primes?
There are at least three answers to this question: the passion for pure research, the money, because there are several prizes for those who find big primes, and finally, because prime numbers are important for cryptography, just like oxygen is for humans. This is also the reason why there is prize money for discovering big prime numbers.
You will understand that most algorithms of the next generation work with prime numbers. But how do you discover whether a number is prime?
In mathematics, there is a substantial computational difference between the operations of multiplication and division. Division is a lot more computationally expensive than multiplication. This means, for instance, that if I compute 2^x, where x is a huge number, it is easy to perform the exponentiation, whereas finding the divisors of a huge number is extremely difficult.
Because of this, mathematicians such as Fermat struggled to find algorithms to make this computation easier.
In the field of prime numbers, Fermat produced another very interesting theorem, known as Fermat’s Little Theorem. Before explaining this theorem, it is time to understand what modular arithmetic is and how to compute with it.
The simplest way to learn modular arithmetic is to think of a clock. When we say: “Hey, we can meet at 1 p.m.,” actually, we calculate that 1 is the first hour after 12 (the clock finishes its circular wrap).
So, we can say that we are unconsciously calculating in modulus 12, written with the notation (mod 12), where integers wrap around when reaching a certain value (in this case, 12), called the modulus.
Technically, the result of a calculation with a modulus consists of the remainder of the division between the number and the modulus.
For example, in our clock, we have the following:
13 ≡ 1 (mod 12)
This means that 13 is congruent to 1 in modulus 12. You can consider congruent to mean equal. In other words, we can say that the remainder of the division of 13 by 12 is 1:
Figure 1.4: Example of modular arithmetic with a clock
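The clock example can be sketched in Python with the % (remainder) operator. The helper clock_hour is a hypothetical name introduced here; it only adjusts the result so that a remainder of 0 is shown as 12, as on a clock face.

```python
def clock_hour(hours: int) -> int:
    """Map a count of hours onto a 12-hour clock face (12 instead of 0)."""
    remainder = hours % 12
    return 12 if remainder == 0 else remainder


print(13 % 12)         # 1, the remainder of 13 divided by 12
print(divmod(13, 12))  # (1, 1): quotient 1, remainder 1
print(clock_hour(13))  # 1
print(clock_hour(24))  # 12
```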
Fermat’s Little Theorem states that if (p) is a prime number, then any integer (a) raised to the power (p) is congruent to (a) itself modulo (p):
a^p ≡ a (mod p)
For example, if a = 2 and p = 3, then 2^3 ≡ 2 (mod 3). In other terms, dividing 8 by 3 gives a quotient of 2 with a remainder of 2.
Fermat’s Little Theorem is the basis of the Fermat primality test and is one of the fundamental parts of elementary number theory.
Based on Fermat’s Little Theorem, a number, p, is probably prime if, for an integer a with 1 < a < p, the following holds:
a^(p-1) ≡ 1 (mod p)
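A minimal sketch of a Fermat-style primality check in Python follows. It assumes the test condition a^(p-1) ≡ 1 (mod p) and a small, arbitrarily chosen set of bases; the function name fermat_test and its bases parameter are illustrative assumptions. The built-in three-argument pow performs modular exponentiation.

```python
def fermat_test(p: int, bases=(2, 3, 5, 7)) -> bool:
    """Return True if p passes a^(p-1) ≡ 1 (mod p) for every base tried."""
    if p < 2:
        return False
    for a in bases:
        if a % p == 0:
            continue  # a is a multiple of p, so this base tells us nothing
        if pow(a, p - 1, p) != 1:
            return False  # p is certainly composite
    return True  # p is only "probably" prime


print(fermat_test(31))               # True
print(fermat_test(15))               # False
print(fermat_test(341, bases=(2,)))  # True, although 341 = 11 × 31
print(fermat_test(341))              # False, base 3 exposes it
```

The third line shows why the answer is only "probably prime": a composite number can pass the check for some bases, so trying several bases reduces, but does not eliminate, the chance of a wrong answer.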
Now that we have refreshed our knowledge on the operations of bit conversion, we have seen what ASCII code looks like, and we have explored the basic notation of mathematics and logic, we can start our journey into cryptography.