Improved factorization machines
Many predictive tasks in web applications involve categorical variables, such as user IDs and demographic attributes like gender and occupation. To apply standard ML techniques, these categorical predictors must first be converted to a set of binary features via one-hot encoding (or a similar encoding technique). The resulting feature vector is highly sparse. To learn effectively from such sparse data, it is important to account for the interactions between features.
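To make the sparsity concrete, here is a minimal sketch of one-hot encoding in plain Python. The helper `one_hot_encode` and the toy records are hypothetical and serve only as an illustration. In practice you would most likely use a library encoder and a sparse matrix format.

```python
def one_hot_encode(rows, columns):
    """Map each distinct (column, value) pair to its own binary feature."""
    # Build the vocabulary: one feature index per distinct (column, value) pair.
    vocab = {}
    for row in rows:
        for col in columns:
            key = (col, row[col])
            if key not in vocab:
                vocab[key] = len(vocab)

    # Encode every row as a dense 0/1 vector over the vocabulary.
    encoded = []
    for row in rows:
        vec = [0] * len(vocab)
        for col in columns:
            vec[vocab[(col, row[col])]] = 1
        encoded.append(vec)
    return vocab, encoded


# Hypothetical toy records (assumed for illustration, not from a real dataset).
rows = [
    {"user_id": "u1", "gender": "F", "occupation": "engineer"},
    {"user_id": "u2", "gender": "M", "occupation": "artist"},
    {"user_id": "u3", "gender": "F", "occupation": "artist"},
]
columns = ["user_id", "gender", "occupation"]

vocab, encoded = one_hot_encode(rows, columns)
for vec in encoded:
    print(vec)
```

Every row has exactly one active feature per categorical column, no matter how large the vocabulary becomes. With thousands of user IDs, almost all entries of each vector are zero, which is the sparsity described above.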
In the previous section, we saw that FM can model second-order feature interactions effectively. However, FM combines these interactions linearly, which is insufficient for capturing the non-linear and inherently complex structure of real-world data.
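The linearity is easiest to see in the prediction rule itself. The following sketch assumes the standard second-order FM formulation: a global bias `w0`, per-feature weights `w`, and one latent vector per feature in `V`. These names and the toy numbers are assumptions made for illustration. It uses the naive double loop for readability rather than the faster reformulation normally used in practice.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score for a single feature vector x.

    w0: global bias (float)
    w:  per-feature linear weights (list of floats)
    V:  one k-dimensional latent vector per feature (list of lists)
    """
    n = len(x)
    # First-order part: bias plus a weighted sum of the features.
    linear = w0 + sum(w[i] * x[i] for i in range(n))

    # Second-order part: each pair (i, j) is weighted by the dot
    # product of the two latent vectors, then simply summed.
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            pairwise += dot * x[i] * x[j]

    return linear + pairwise


# Hypothetical toy parameters (assumed values, not learned from data).
x = [1, 0, 1]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]

score = fm_predict(x, w0, w, V)
print(score)
```

The pairwise terms are only added together. No non-linear function is applied to the interactions, so the model cannot represent more complex dependencies among them.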
Xiangnan He, Jun Xiao, and their co-authors have proposed several extensions of FM, such as the Neural Factorization Machine (NFM) and the Attentional Factorization Machine (AFM), in an attempt to overcome this limitation.
For...