Data Preparation
The first step in developing any deep learning model, after gathering the data of course, should be preparing the data. This step is crucial for understanding the data at hand and for outlining the scope of the project correctly.
Many data scientists skip this step, which results in models that perform poorly, or even in models that are useless because they do not address the data problem in the first place.
The process of preparing the data can be divided into three main tasks:
- Understanding the data and dealing with any potential issues
- Rescaling the features to make sure no bias is introduced by mistake
- Splitting the data to be able to measure performance accurately
All three tasks will be further explained in the next section.
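As a rough illustration of the three tasks, the following minimal sketch uses only the Python standard library. The toy dataset, the column layout, and the helper names (`count_missing`, `drop_incomplete`, `min_max_rescale`, `train_test_split`) are assumptions made for this example and do not come from the text. Dropping incomplete rows and min-max rescaling are just one possible choice for each task.

```python
import random

# Hypothetical toy dataset: each row is [age, income, label].
# None marks a missing value.
rows = [
    [25, 40000.0, 0],
    [32, None, 1],
    [47, 82000.0, 1],
    [51, 61000.0, 0],
    [38, 55000.0, 1],
]


def count_missing(data):
    """Task 1: understand the data by counting missing values per column."""
    n_cols = len(data[0])
    return [sum(1 for row in data if row[c] is None) for c in range(n_cols)]


def drop_incomplete(data):
    """Task 1: deal with the issue by dropping rows that have a missing value."""
    return [row for row in data if None not in row]


def min_max_rescale(data, feature_cols):
    """Task 2: rescale the chosen feature columns to the range [0, 1]."""
    scaled = [list(row) for row in data]
    for c in feature_cols:
        values = [row[c] for row in data]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid dividing by zero for constant columns
        for row in scaled:
            row[c] = (row[c] - lo) / span
    return scaled


def train_test_split(data, test_fraction=0.25, seed=0):
    """Task 3: shuffle the rows and hold out a fraction for testing."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


missing = count_missing(rows)
clean = drop_incomplete(rows)
scaled = min_max_rescale(clean, feature_cols=[0, 1])
train, test = train_test_split(scaled)

print("missing per column:", missing)
print("train rows:", len(train), "test rows:", len(test))
```

The sketch follows the order of the list above. In practice, the scaling statistics are often computed on the training split only, so that no information from the test rows influences the features.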
Note
The tasks listed above are much the same for any machine learning algorithm, considering that they refer to the techniques that are required to prepare...