1.1 Data
If you think of all the digital data and information stored in the cloud, on the web, on your phone, on your laptop, in hard and solid-state drives, and in every computer transaction, it all comes down to just two things: 0 and 1.
These are bits, and with bits we can represent everything else in what we call “classical computers.” These systems date back to ideas from the 1940s. There is an additional concept for quantum computers, that of a qubit or “quantum bit.” The qubit extends the bit and is manipulated in quantum circuits and gates. Before we get to qubits, however, let’s consider the bit more closely.
First, we can interpret 0 as false and 1 as true. Thinking in terms of data, what would you say if I asked you the question, “Do you like Punk Rock?” We can store your response as 0 or 1 and then use it in further processing, such as making music recommendations to you. When we talk about true and false, we call them Booleans instead of bits.
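As a minimal sketch of this idea in Python, the stored answer can be held in a Boolean and used in a later decision. The variable names and the recommendation strings are hypothetical, chosen only for illustration.

```python
# Hypothetical stored answer to the question "Do you like Punk Rock?"
likes_punk_rock = True

# In Python, True and False behave like the numbers 1 and 0.
print(int(likes_punk_rock))   # 1
print(bool(0), bool(1))       # False True

# Use the stored answer in further processing (illustrative strings only).
recommendation = "punk playlist" if likes_punk_rock else "something else"
print(recommendation)         # punk playlist
```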
Second, we can treat 0 and 1 as the numbers 0 and 1. While that’s nice, if the maximum number we can talk about is 1, we can’t do any significant computation. So, we string together more bits to make larger numbers. The binary numbers 00, 01, 10, and 11 are the same as 0, 1, 2, and 3 when we use a decimal representation. Using even more bits, we represent 72 decimal as 1001000 binary and 83,694 decimal as 10100011011101110 binary.
decimal: 72 = 7 × 10¹ + 2 × 10⁰
binary: 1001000 = 1 × 2⁶ + 0 × 2⁵ + 0 × 2⁴ + 1 × 2³ + 0 × 2² + 0 × 2¹ + 0 × 2⁰
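The positional expansion can be checked with a short Python sketch. The helper name to_decimal is an assumption introduced here for illustration; int, bin, and format are Python built-ins that perform the same conversions directly.

```python
def to_decimal(bits: str) -> int:
    """Add up each binary digit multiplied by its power of 2."""
    total = 0
    # The rightmost digit has position 0, the next one position 1, and so on.
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(to_decimal("1001000"))   # 72
print(int("1001000", 2))       # 72, using the built-in conversion
print(bin(72))                 # 0b1001000
print(format(83694, "b"))      # 10100011011101110
```

The "0b" prefix in the output of bin is how Python marks a binary literal; format with "b" gives the digits alone.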
Exercise 1.1
How would you represent 245 decimal in binary? What is the decimal representation of 1111 binary?
With slightly more sophisticated representations, we can store and use negative numbers. We can also create numbers with decimal points, also known as floating-point numbers. [FPA] We use floating-point numbers to represent or approximate real numbers. Whatever programming language we use, we must have a convenient way to work with all these kinds of numbers.
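To see what "approximate" means in practice, here is a small Python sketch. It assumes the usual 64-bit floating-point numbers that Python uses for its float type; the variable names are illustrative.

```python
import math

negative = -72
print(negative + 72)                     # 0

# 0.1 and 0.2 have no exact binary representation, so the sum is only close to 0.3.
approximation = 0.1 + 0.2
print(approximation)                     # 0.30000000000000004
print(approximation == 0.3)              # False

# Compare floating-point numbers with a tolerance instead of exact equality.
print(math.isclose(approximation, 0.3))  # True
```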
When we think of information more generally, we don’t just consider numbers. There are words, sentences, names, and other textual data. In the same way that we can encode numbers using bits, we can encode characters for text. Using the Extended ASCII standard, for example, we can create my nickname, “Bob”: [EAS]
01000010 → B
01101111 → o
01100010 → b
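The table above can be reproduced with a few lines of Python. This is a sketch that relies on the built-in ord, which returns a character's Unicode code point; for these three letters that value coincides with the ASCII code. The variable names are illustrative.

```python
nickname = "Bob"

# Print each character's code as an 8-bit binary string.
for character in nickname:
    print(format(ord(character), "08b"), "→", character)

# Going the other way: turn the bit strings back into text.
bits = ["01000010", "01101111", "01100010"]
decoded = "".join(chr(int(b, 2)) for b in bits)
print(decoded)   # Bob
```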
Each run of zeros and ones on the left-hand side has 8 bits. This is a byte. One thousand (10³) bytes is a kilobyte, one million (10⁶) is a megabyte, and one billion (10⁹) is a gigabyte.
If we limit ourselves to a byte to represent characters, we can only represent 256 = 2⁸ of them. If you look at the symbols on a keyboard, then imagine other alphabets, letters with accents and umlauts and other marks, mathematical symbols, and emojis, you will count well more than 256. In programming, we use the Unicode standard to represent many sets of characters and ways of encoding them using multiple bytes. [UNI]
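As an illustration of multi-byte encodings, the following Python sketch counts the bytes needed for a few characters. It assumes UTF-8, which is one common Unicode encoding and is my choice for this example; the sample characters are also arbitrary.

```python
# Sample characters written as escapes: b, e with acute accent, euro sign, grinning face.
samples = ["b", "\u00e9", "\u20ac", "\U0001F600"]

byte_counts = {}
for character in samples:
    encoded = character.encode("utf-8")
    byte_counts[character] = len(encoded)
    # Show the character, how many bytes it needs, and the bytes in hexadecimal.
    print(character, len(encoded), encoded.hex())

print(list(byte_counts.values()))   # [1, 2, 3, 4]
```

A plain letter still fits in one byte, while the emoji needs four.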
Exercise 1.2
How can you create 256 different characters if you use 8 bits? How many could you form if you used 7 or 10?
When we put characters next to each other, we get strings. Thus "abcd" is a string of length four. I’ve used the double quotes to delimit the beginning and end of the string. They are not part of the string itself. Some languages treat characters as special objects unto themselves, while others consider them to be strings of length one.
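Python is an example of the second kind of language: it has no separate character type. A short sketch, with illustrative variable names:

```python
text = "abcd"
print(len(text))    # 4, the quotes are not counted

# Taking one character out of a string gives another string, of length one.
first = text[0]
print(first, type(first).__name__, len(first))   # a str 1

# Putting strings next to each other with + builds a longer string.
print("ab" + "cd" == text)   # True
```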
This is a good start for our programming needs: we have bits and Booleans, numbers, characters, and strings of multiple characters to form text. Now we need to do something with these kinds of data.