Feature engineering – the basics
Feature engineering is the process of transforming raw data into vectors of numbers that can be used in machine learning algorithms. The process is structured: we first select a feature extraction mechanism, which depends on the type of task, and then configure it. Once the chosen mechanism is configured, we use it to transform the raw input data into a matrix of features; this step is called feature extraction. Sometimes the data needs to be processed before (or after) feature extraction, for example, by merging fields or removing noise. This processing is called data wrangling.
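To make these steps concrete, here is a minimal sketch in plain Python. It assumes a simple word-count (bag-of-words) mechanism purely for illustration; the records, the function names (wrangle, configure_extractor, extract_features), and the max_features parameter are all hypothetical and not part of any specific library.

```python
import re
from collections import Counter

# Hypothetical raw records with two text fields each (illustration only).
raw_records = [
    {"title": "Fix crash", "body": "App crash on start!!!"},
    {"title": "Add login", "body": "Login page for the app"},
]


def wrangle(record):
    """Data wrangling: merge the two fields, lowercase, and strip noise."""
    merged = f"{record['title']} {record['body']}".lower()
    # Replace anything that is not a letter or whitespace with a space.
    return re.sub(r"[^a-z\s]", " ", merged)


def configure_extractor(texts, max_features):
    """Configuration: keep only the max_features most frequent words."""
    counts = Counter(word for text in texts for word in text.split())
    # Sort by descending frequency, then alphabetically, so the result
    # is deterministic when several words have the same count.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:max_features]]


def extract_features(texts, vocabulary):
    """Feature extraction: one row per text, one column per vocabulary word."""
    return [[text.split().count(word) for word in vocabulary] for text in texts]


clean_texts = [wrangle(record) for record in raw_records]
vocabulary = configure_extractor(clean_texts, max_features=4)
feature_matrix = extract_features(clean_texts, vocabulary)

print(vocabulary)
for row in feature_matrix:
    print(row)
```

Each row of feature_matrix corresponds to one record and each column to one word in the vocabulary. Changing max_features changes the number of columns, which is a small example of how configuration shapes the resulting features.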
The number of feature extraction mechanisms is large, and we cannot cover all of them, nor do we need to. What we do need to understand is how the choice of feature extraction mechanism influences the properties of the data. We’ll dive...