In this section, we focus on preparing data with binary inputs or targets. By binary, we mean values that can be represented as either 0 or 1. Notice the emphasis on the words "represented as": a column may contain values that are not literally 0 or 1 but that can be interpreted as, or represented by, a 0 or a 1.
Consider the following fragment of a dataset:
x1 | x2 | ... | y
0 | 5 | ... | a
1 | 7 | ... | a
1 | 5 | ... | b
0 | 7 | ... | b
In this short example with only four rows, the column x1 has values that are clearly binary: each is either 0 or 1. The column x2 may not look binary at first glance, but on closer inspection the only values it contains are 5 and 7. Because the column takes exactly two distinct values, it can be mapped one-to-one onto the set {0, 1}. We could map 5 to 0 and 7 to 1, or vice versa; the choice does not matter as long as it is applied consistently.
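The mapping described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `rows` list re-creates the four visible rows of the fragment (the elided columns are left out), and the helper name `binary_mapping` is hypothetical, not something defined in the text. The helper assigns 0 to the smaller value and 1 to the larger one, which is an arbitrary choice.

```python
# Hypothetical re-creation of the four-row fragment (elided columns omitted).
rows = [
    {"x1": 0, "x2": 5, "y": "a"},
    {"x1": 1, "x2": 7, "y": "a"},
    {"x1": 1, "x2": 5, "y": "b"},
    {"x1": 0, "x2": 7, "y": "b"},
]


def binary_mapping(values):
    """Map the two distinct values in `values` to 0 and 1 (in sorted order).

    Raises ValueError if the column does not have exactly two distinct values,
    since it could not then be represented as binary.
    """
    distinct = sorted(set(values))
    if len(distinct) != 2:
        raise ValueError(f"expected 2 distinct values, found {len(distinct)}")
    return {distinct[0]: 0, distinct[1]: 1}


# Build the mapping for x2 and apply it to every row.
x2_map = binary_mapping(row["x2"] for row in rows)
x2_binary = [x2_map[row["x2"]] for row in rows]

# The mapping is one-to-one, so it can be inverted to recover the original values.
x2_inverse = {code: value for value, code in x2_map.items()}
x2_restored = [x2_inverse[code] for code in x2_binary]

print(x2_map)
print(x2_binary)
print(x2_restored)
```

Swapping the two codes (5 to 1 and 7 to 0) would work equally well; what matters is that the same mapping is used everywhere the column is encoded or decoded.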
A similar...