Understanding dimensionality reduction (DR)
The second category of unsupervised learning (UL) that we will discuss is dimensionality reduction (DR). As the name states, these are methods used to reduce the number of dimensions in a given dataset. Take, for example, a highly featured dataset with 100 or so columns. DR algorithms can reduce the number of columns to perhaps 5 while preserving most of the information that the original 100 columns contain. You can think of DR as condensing a dataset horizontally, that is, reducing its columns rather than its rows. The resulting columns generally fall into two types: new features, where columns with new numerical values are generated in a process known as feature engineering (FE), and old features, where only the most useful original columns are preserved in a process known as feature selection. Over the course of the following section, and within the confines of UL, we will focus on FE as we create new features representing reduced versions...
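To make the distinction between the two types of resulting columns concrete, here is a minimal sketch that reduces a 100-column dataset to 5 columns in both ways. Everything in it is an assumption made for illustration: the dataset is synthetic, feature selection is approximated by keeping the highest-variance columns, and the new features are built with a principal component projection computed from NumPy's singular value decomposition. It is one possible illustration, not the specific method used later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 200 rows and 100 columns,
# generated from 5 hidden factors plus a small amount of noise.
n_rows, n_cols, n_keep = 200, 100, 5
factors = rng.normal(size=(n_rows, n_keep))
mixing = rng.normal(size=(n_keep, n_cols))
X = factors @ mixing + 0.1 * rng.normal(size=(n_rows, n_cols))

# Feature selection: keep 5 of the ORIGINAL columns unchanged
# (here, simply the 5 columns with the highest variance).
top_idx = np.sort(np.argsort(X.var(axis=0))[-n_keep:])
X_selected = X[:, top_idx]

# Feature engineering: build 5 NEW columns, each a weighted combination
# of all 100 originals (principal components via SVD of the centered data).
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_engineered = X_centered @ Vt[:n_keep].T

# Fraction of the total variance captured by the 5 new columns.
explained = (S[:n_keep] ** 2).sum() / (S ** 2).sum()

print("original:  ", X.shape)
print("selected:  ", X_selected.shape)
print("engineered:", X_engineered.shape)
print(f"variance retained by the new columns: {explained:.3f}")
```

Both results have the same shape, but the selected columns are copies of existing ones, whereas the engineered columns contain values that appear nowhere in the original table. Because the synthetic data was deliberately built from 5 underlying factors, the 5 new columns retain nearly all of the variance; real datasets will generally retain less.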