Toward feature engineering
In this chapter, we explored methods for visualizing data. We learned how to create diagrams and identify dependencies in the data. We also learned how to use dimensionality reduction techniques to plot multidimensional data on a two-dimensional diagram.
In the next few chapters, we’ll dive into engineering features for different types of data. It is easy to confuse feature engineering with data extraction, but in practice, it is not that difficult to tell one from the other.
Extracted data is data that has been collected by applying some sort of measurement instrument. Raw text and images are good examples of this kind of data. Extracted data is close to the domain where the data comes from – or to how it is measured.
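As a small illustration of the distinction, the sketch below starts from raw text, which is extracted data, and derives word counts from it, which is one simple way of turning such data into numbers that an analysis can use. The review strings, the vocabulary, and the function name bag_of_words are assumptions made up for this example; they are not taken from any dataset or library.

```python
from collections import Counter

# Extracted data: raw text as it was collected (hypothetical examples).
raw_reviews = [
    "The battery lasts long and the screen is bright",
    "The screen cracked and the battery died",
]

# Hypothetical vocabulary, chosen with a specific analysis in mind.
vocabulary = ["battery", "screen", "bright", "cracked"]


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Derived representation: one numeric vector per review.
features = [bag_of_words(review, vocabulary) for review in raw_reviews]
print(features)
```

The raw strings stay close to how the data was collected, while the vectors depend entirely on the vocabulary we picked, that is, on what we intend to do with the data.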
Features describe the data based on the analysis that we want to perform – they are closer to what we want to do with the data. They are closer to what we want to achieve and which form of machine learning analysis...