NLP for PyTorch
Now that we have learned how to build neural networks, we will see how to build models for NLP using PyTorch. In this example, we will create a basic bag-of-words classifier that identifies the language of a given sentence.
Setting up the classifier
For this example, we'll take a selection of sentences in Spanish and English:
- First, we split each sentence into a list of words and take the language of each sentence as a label. We train our model on one portion of the sentences and keep a small portion to one side as our test set, so that we can evaluate the model's performance after it has been trained:
("This is my favourite chapter".lower().split(), "English"),
("Estoy en la biblioteca".lower().split(), "Spanish")
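As a rough illustration of this step, the sketch below tokenizes a handful of labelled sentences and sets part of them aside for testing. Only the first two sentences come from the example above. The other four, the names `raw_data`, `tokenize`, `training_data` and `test_data`, and the choice to hold out the last two examples are assumptions made purely for illustration.

```python
# Illustrative data: the first two pairs come from the text; the remaining
# four are hypothetical placeholders so that there is something to split.
raw_data = [
    ("This is my favourite chapter", "English"),
    ("Estoy en la biblioteca", "Spanish"),
    ("I like reading books", "English"),
    ("Me gusta leer libros", "Spanish"),
    ("The cat sleeps at home", "English"),
    ("El gato duerme en casa", "Spanish"),
]


def tokenize(sentence):
    """Lowercase a sentence and split it on whitespace."""
    return sentence.lower().split()


# Each example becomes a (list_of_words, language_label) pair.
labelled = [(tokenize(sentence), language) for sentence, language in raw_data]

# Hold out the last two examples as the test set.
# The size and position of the hold-out are arbitrary choices for this sketch.
n_test = 2
training_data = labelled[:-n_test]
test_data = labelled[-n_test:]
```

With a larger dataset, it would usually be better to shuffle the examples before splitting so that both sets contain a similar mix of languages.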
Note that we also transform each word into lowercase, which stops words being double counted in our bag-of-words. If we have the word "book" and the word "Book"
...