Now that we understand the mathematical details of how skip-gram models work, we are going to implement skip-gram, which encodes words into real-valued vectors that have certain properties (hence the name Word2Vec). Implementing this architecture will give you a sense of how the process of learning another representation works.
Text is the main input for many natural language processing applications, such as machine translation, sentiment analysis, and text-to-speech systems. Learning a real-valued representation of text therefore lets us apply different deep learning techniques to these tasks.
In the early chapters of this book, we introduced one-hot encoding, which represents a word as a vector of zeros with a single 1 at the index of that word. So, you may wonder why we are not using it here. This method is very...