Summary
In this chapter, we described how to prepare a dataset for data analytics tasks. The first key point was how to achieve environment virtualization using Anaconda and Docker, along with Python package management using pip.
The data preparation process can be broken down into four steps: data collection, data cleaning, data preprocessing, and feature extraction. First, we introduced various tools for data collection that support different data types. Once the data has been collected, it is cleaned and preprocessed so that it can be transformed into a generic form. Depending on the target task, we then apply feature extraction techniques that are specific to that task. In addition, we introduced several data visualization tools that can help you understand the characteristics of the prepared data.
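As a quick reminder of how the four steps fit together, the following is a minimal sketch that uses only the Python standard library. The sample records and the function names (collect, clean, preprocess, extract_features, prepare) are hypothetical and chosen for illustration. A simple bag-of-words count stands in for feature extraction here; a real project would pick techniques that suit its task and data type.

```python
import re
from collections import Counter


def collect():
    # Step 1, data collection: hypothetical in-memory records standing in
    # for data gathered from files, APIs, or crawlers.
    return ["  Deep Learning is FUN!  ", "", "Data   preparation matters.", None]


def clean(records):
    # Step 2, data cleaning: drop missing or empty records and trim
    # surrounding whitespace.
    return [r.strip() for r in records if r and r.strip()]


def preprocess(records):
    # Step 3, data preprocessing: bring every record into a generic form
    # (lowercase, no punctuation, single spaces between words).
    result = []
    for record in records:
        text = re.sub(r"[^a-z0-9\s]", "", record.lower())
        result.append(re.sub(r"\s+", " ", text).strip())
    return result


def extract_features(records):
    # Step 4, feature extraction: a simple bag-of-words count per record.
    return [Counter(record.split()) for record in records]


def prepare():
    # Chain the four steps in order.
    return extract_features(preprocess(clean(collect())))


features = prepare()
```

Keeping each step in its own function makes it easy to swap one stage, for example a different feature extractor, without touching the others.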
Now that we have learned how to prepare data for analytics tasks, the next chapter explains DL model development. We will introduce the basic...