Exploring Text Encoding Techniques for Neural Networks
In Chapter 4, Building and Training a Feedforward Neural Network, you learned that feedforward networks, like all other neural networks, are trained on numbers and cannot process nominal values directly. In this chapter, we want to feed words and characters into neural networks, so we need techniques that encode sequences of words or characters – that is, sequences of nominal values – into sequences of numbers or numerical vectors. In addition, in NLP applications with RNNs, the order of the words or characters in the sequence must be retained throughout the text encoding procedure.
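To make the order requirement concrete, here is a minimal Python sketch of an order-preserving encoding. It is an illustration only, not the workflow used in this book: the whitespace tokenizer, the example sentence, and the helper names build_vocabulary and encode are all assumptions made for this example.

```python
def build_vocabulary(tokens):
    """Assign each distinct token an integer, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # indices start at 0 (illustrative choice)
    return vocab


def encode(tokens, vocab):
    """Replace each token with its integer, keeping the original order."""
    return [vocab[tok] for tok in tokens]


# Hypothetical input; a naive whitespace split stands in for a real tokenizer.
sentence = "the cat sat on the mat"
tokens = sentence.split()

vocab = build_vocabulary(tokens)
encoded = encode(tokens, vocab)

print(vocab)    # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4}
print(encoded)  # [0, 1, 2, 3, 0, 4]
```

The encoded list has one number per token, in the same positions as the original words, and the repeated word "the" maps to the same number both times. This is the property an RNN relies on when it reads the sequence step by step.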
Let's have a look at some text encoding techniques before we dive into the NLP case studies.
Index Encoding
In Chapter 4, Building and Training a Feedforward Neural Network, you learned about index encoding for nominal values. The idea was to represent each nominal...