Defining "safe"
An important step when dealing with cryptography is to define what "safe" means in this context.
As a rule of thumb, virtually every cryptographic algorithm can be broken, given enough computing power and time.
For simplicity, we'll focus the next examples on data encryption, although the same would apply to all other classes of cryptographic operations.
The goal of using cryptography is to make sure that the effort (computing power multiplied by time) is too big for an attacker to attempt to break your encryption. Assuming that the algorithms you use do not have flaws in their design or backdoors that can be exploited, the only way attackers can crack your encryption is by doing a brute-force attack.
A brute-force attack works by trying every single possible combination until you find the right one. For example, if you were trying to break a padlock with 3 digits, each from 0 to 9, you'd have 1,000 different combinations (000, 001, 002… up to 999), and the correct one would be one of those. On average, it would take you 500 attempts before you found the right one (the number of possible combinations divided by 2). If every attempt takes you 3 seconds, then you can expect to be done in 1,500 seconds on average, or 25 minutes.
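To make that arithmetic concrete, here's a minimal Python sketch of a brute-force attack against such a padlock; the secret combination, the check_combination function, and the 3-second-per-attempt figure are all illustrative assumptions standing in for a physical lock.

```python
import itertools

SECRET = "042"  # hypothetical combination the attacker is trying to find

def check_combination(guess: str) -> bool:
    """Stand-in for physically trying a combination on the padlock."""
    return guess == SECRET

def brute_force_padlock() -> tuple[str, int]:
    """Try every 3-digit combination (000 through 999) until one works."""
    for attempt, digits in enumerate(itertools.product("0123456789", repeat=3), start=1):
        guess = "".join(digits)
        if check_combination(guess):
            return guess, attempt
    raise RuntimeError("exhausted all combinations")

if __name__ == "__main__":
    combination, attempts = brute_force_padlock()
    seconds_per_attempt = 3
    print(f"Found {combination} after {attempts} attempts")
    print(f"At {seconds_per_attempt}s per attempt: {attempts * seconds_per_attempt}s")
    # Averaged over all possible combinations, that's ~500 attempts, i.e. ~1,500 s (25 minutes)
```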
In theory, brute force can break any encryption. The goal is to use encryption that is strong enough to make it impractical to break it using a brute-force attack.
For example, AES-128 (a symmetric cipher) uses a 128-bit key, which means that there are 2^128 possible combinations, or 340,282,366,920,938,463,463,374,607,431,768,211,456; that is, approximately 3.4 x 10^38. That is a very large number that ought to be put into perspective.
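If you'd like to verify those figures yourself, Python's arbitrary-precision integers make it a quick sketch:

```python
keyspace = 2 ** 128       # number of possible 128-bit keys
print(keyspace)           # 340282366920938463463374607431768211456
print(f"{keyspace:.1e}")  # approximately 3.4e+38
```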
Getting Perspective
One of the largest computing grids in the world today, if not the largest, is the network of all the Bitcoin mining rigs. In 2021, the Bitcoin network reached an estimated peak hash rate of 180 million terahashes per second; the hash rate is a metric commonly used to indicate how much computing power the network is using, and that figure translates to being able to compute about 2^92 hashes per year.
Imagine a hypothetical scenario in which all Bitcoin miners agreed to get together and convert the entire grid to brute-force a 128-bit key. If they managed to attempt 2^92 keys per year, it would still take them an average of 2^35 years, or over 10^10 years, to complete the attack (2^128 possible keys, divided by 2^92 keys attempted per year, divided by 2 to get the average time). That's on the same order of magnitude as the age of the universe, which is estimated to be around 14 billion years, or about 1.4 x 10^10.
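You can reproduce that estimate with a few lines of integer arithmetic; the 2^92 keys-per-year rate is the assumption from the thought experiment above, not a measured figure.

```python
keyspace = 2 ** 128      # possible AES-128 keys
keys_per_year = 2 ** 92  # assumed rate: the entire Bitcoin network repurposed for key search
average_years = keyspace // keys_per_year // 2  # on average, half the keyspace must be searched
print(average_years)     # 34359738368, i.e. 2**35, roughly 3.4 x 10^10 years
```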
Of course, computing power increases constantly, so the time needed to break today's cryptography will only decrease in the future.
Quantum computing will also make it easier to break certain kinds of cryptography, although those systems are still experimental today and not yet powerful enough for most practical applications (nevertheless, cryptographers are already preparing for that future today, by designing stronger "post-quantum" algorithms).
That said, our goal should always be to choose algorithms that can be expected to protect our data for at least as long as necessary. For example, let's say you encrypt a document containing the password of your online banking account today; if technological progress allowed an attacker to crack it "only" 100 years from now, that might not matter to you, as you'll most likely be dead by then.
Given this context, an algorithm is considered "broken" when it's possible to crack it significantly faster than with a brute-force attack.
Most cryptographic algorithms that we use today are safe in this sense because the only possible attacks against them are brute-force ones, which as we've seen are just not practical.
How Cryptography Is Broken – the Example of Frequency Analysis
At the beginning of this chapter, we looked at some historical ciphers and explained that they have been broken. Substitution ciphers are among the oldest and simplest ones, and they rely on replacing every letter of the alphabet with another one. For example, in the Caesar cipher, letters are shifted by three positions, so A is replaced with D, B with E, and so on. While this may have been effective at protecting messages from the illiterate enemy soldiers of the era, it looks trivial today.
Substitution ciphers that rely on the shifting of letters can be broken in a fairly simple way with brute-force attacks, even manually (assuming, of course, that the attacker knows that the cipher works by shifting letters). With only 26 letters in the English alphabet, it takes at most 25 attempts to find the number by which characters are shifted.
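As an illustration, here's a short Python sketch that tries all 25 possible shifts against a Caesar-enciphered message; the ciphertext is a made-up example, and in practice a human (or a dictionary check) picks out the shift that yields readable English.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar-style shift of `shift` positions."""
    result = []
    for char in ciphertext:
        if char in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(char) - shift) % 26])
        else:
            result.append(char)  # leave spaces and punctuation untouched
    return "".join(result)

ciphertext = "DWWDFN DW GDZQ"  # "ATTACK AT DAWN" shifted by 3 (the Caesar cipher)
for shift in range(1, 26):
    print(f"shift {shift:2}: {shift_decrypt(ciphertext, shift)}")
```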
A more complex substitution cipher could instead create a new alphabet in which letters are randomly shuffled. So, the letter A may be replaced with X, B with C, and D with P. With the English alphabet, there are 26! (26 factorial) possible combinations, or about 10^26, offering a decent amount of protection against brute-force attacks, even against an adversary that can use modern computers.
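For reference, that keyspace is easy to compute with the standard library:

```python
import math

keyspace = math.factorial(26)  # number of ways to shuffle the 26-letter alphabet
print(keyspace)                # 403291461126605635584000000, roughly 4 x 10^26
```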
However, through cryptanalysis (the science of studying cryptographic systems looking for weaknesses), even those more "advanced" substitution ciphers have been broken, through a technique called frequency analysis. This works by looking at the frequency of letters (and pairs of letters) in the encrypted text and comparing that with the average frequency of letters in English writing. For example, "E" is the most frequently used letter in English text, so whatever letter is the most used in the encrypted text will likely correspond to "E."
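To give a flavor of the technique, here is a minimal sketch that counts letter frequencies in a ciphertext and pairs them with the typical frequency order of English letters; the ENGLISH_ORDER string and the sample ciphertext are illustrative, and a real attack would also use pair frequencies and plenty of trial and error.

```python
from collections import Counter

# Approximate order of letters in English text, most frequent first
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def frequency_guess(ciphertext: str) -> dict[str, str]:
    """Guess a substitution mapping by pairing ciphertext letter frequencies
    with typical English letter frequencies."""
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    ranked = [letter for letter, _ in Counter(letters).most_common()]
    # Assume the most frequent ciphertext letter stands for 'E', the next for 'T', and so on
    return dict(zip(ranked, ENGLISH_ORDER))

# Frequency analysis only works well on long ciphertexts: with just a sentence or two,
# the statistics are too noisy and the guessed mapping needs manual refinement.
ciphertext = "XLI UYMGO FVSAR JSB NYQTW SZIV XLI PEDC HSK"  # short, illustrative ciphertext
mapping = frequency_guess(ciphertext)
print("".join(mapping.get(c, c) for c in ciphertext))
```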
While cryptanalysis is certainly not something we'll focus on in this book, this mention will hopefully provide an insight into one of the (many) ways algorithms can be broken.
With an understanding of how encryption helps protect data and a contextualization of what it means for data to be "safe," we now need to look at the multiple ways that data can be encrypted.