Introduction to CBOW
The neural network of CBOW is shown in Figure 7.7. It looks like the mirror image of SG: the input layer consists of the words adjacent to the target word, and the output layer is the target word itself. Again, we are interested in the weights leading into the hidden layer. They will be the word embeddings.
Figure 7.7 – The structure of a CBOW model
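To make the forward pass concrete, here is a minimal NumPy sketch of a CBOW-style network with the same dimensions as the figure (a 10,000-word vocabulary and a 300-unit hidden layer). The names W_in, W_out, and cbow_forward are illustrative assumptions, not code from this chapter:

```python
import numpy as np

VOCAB_SIZE = 10_000   # number of unique words
EMBED_DIM = 300       # size of the hidden layer

# Input-to-hidden (10,000 x 300) and hidden-to-output (300 x 10,000) weights
W_in = np.random.randn(VOCAB_SIZE, EMBED_DIM) * 0.01
W_out = np.random.randn(EMBED_DIM, VOCAB_SIZE) * 0.01

def cbow_forward(context_ids):
    """Predict the target word from the indices of its context words."""
    # Hidden layer: average of the context words' rows of W_in
    h = W_in[context_ids].mean(axis=0)   # shape (300,)
    # Output layer: one score per vocabulary word
    scores = h @ W_out                   # shape (10,000,)
    # Softmax turns the scores into a probability distribution
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

# Example: four context word indices predict a single target word
probs = cbow_forward([5, 12, 87, 240])
print(probs.argmax())  # index of the most likely target word
```

After training, each row of W_in serves as the learned embedding of one vocabulary word.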
The word pairs of the input words and the output words become the pairs shown in Figure 7.8. They are just the reverse of the word pairs in Figure 7.5, and the structure of the neural network is reversed too. Between the input layer and the hidden layer is a 10,000 x 300 weight matrix, and between the hidden layer and the output layer is a 300 x 10,000 weight matrix. The input-to-hidden weight matrix is the one we are interested in because it holds the vector encodings of all the unique words. If we inspect the input layer carefully, we will see that most of the input nodes are zeros; only the weights coming from the non-zero input nodes contribute to the hidden layer. The ith row in the weight matrix is therefore the vector for the ith word.
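To see why only the non-zero input nodes matter, the following illustrative check (again with assumed names, not chapter code) multiplies a one-hot input vector by the input-to-hidden weight matrix and confirms that the result is exactly the ith row of that matrix:

```python
import numpy as np

VOCAB_SIZE = 10_000
EMBED_DIM = 300

W_in = np.random.randn(VOCAB_SIZE, EMBED_DIM)  # 10,000 x 300

# One-hot encode word i: all zeros except position i
i = 42
one_hot = np.zeros(VOCAB_SIZE)
one_hot[i] = 1.0

# The matrix product keeps only the weights from the single
# non-zero input node, i.e., the ith row of W_in
hidden = one_hot @ W_in                 # shape (300,)
assert np.allclose(hidden, W_in[i])     # identical to a row lookup
```

This is why implementations skip the full matrix multiplication and simply look up rows by word index.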