Extracting features from data
Feature extraction (feature engineering) is the process of transforming data into features that express the underlying information in a form suited to the target task. Data preprocessing applies generic techniques that most data analytics tasks need. Feature extraction, by contrast, is specific to the task and requires you to exploit domain knowledge. In this section, we will introduce popular feature extraction techniques: bag-of-words for text data, term frequency-inverse document frequency (TF-IDF), converting color images into grayscale images, ordinal encoding, one-hot encoding, dimensionality reduction, and fuzzy matching for comparing two strings.
Complete implementations of these examples can be found online at https://github.com/PacktPublishing/Production-Ready-Applied-Deep-Learning/tree/main/Chapter_2/data_preproessing.
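To make the idea of turning raw data into task-specific features concrete before the individual techniques are covered, here is a minimal sketch of two of the encodings listed above, written in plain Python. The T-shirt size data and the ordering in size_order are assumptions made up for illustration; they are not taken from the book's repository.

```python
# Hypothetical toy data: a categorical column with a natural order.
sizes = ["small", "large", "medium", "small"]

# Ordinal encoding: map each category to an integer that preserves order.
# The ordering below is an assumption based on domain knowledge.
size_order = {"small": 0, "medium": 1, "large": 2}
ordinal = [size_order[s] for s in sizes]

# One-hot encoding: one binary column per category, with no order implied.
categories = sorted(size_order, key=size_order.get)
one_hot = [[int(s == c) for c in categories] for s in sizes]

print(ordinal)
print(one_hot)
```

Which encoding fits depends on the task: ordinal encoding suits categories where the order carries meaning, while one-hot encoding avoids suggesting an order that does not exist.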
We will start with the bag-of-words technique.
Converting text using bag-of-words
Bag-of-words...