Representing words with context-dependent vectors
Word2Vec’s word vectors are context-independent: a word always has the same vector, no matter what context it occurs in. In practice, however, the meaning of a word is strongly affected by nearby words. For example, the word "film" means quite different things in "We enjoyed the film" and "the table was covered with a thin film of dust". To capture these differences, we would like a word to have different vector representations in different contexts, each reflecting the meaning the word takes on there. This research direction has been extensively explored in the last few years, starting with the BERT (Bidirectional Encoder Representations from Transformers) system (Devlin et al., NAACL 2019, https://aclanthology.org/N19-1423/).
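To make the distinction concrete, here is a minimal toy sketch in Python. It is not how Word2Vec or BERT compute their vectors. The random vectors stand in for trained embeddings, and the helper names (static_vector, toy_contextual_vector) and the window-averaging scheme are assumptions made purely for illustration.

```python
import random

random.seed(0)
DIM = 4  # tiny dimensionality, chosen only to keep the example readable

s1 = "we enjoyed the film"
s2 = "the table was covered with a thin film of dust"


def tokenize(sentence):
    # Naive whitespace tokenizer, sufficient for these two sentences.
    return sentence.lower().split()


# One fixed vector per word. Random values stand in for trained vectors.
vocab = sorted(set(tokenize(s1) + tokenize(s2)))
static_vectors = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in vocab}


def static_vector(word, sentence):
    # Context-independent lookup: the sentence argument is ignored.
    return static_vectors[word]


def toy_contextual_vector(word, sentence, window=2):
    # Hypothetical stand-in for a contextual model: average the static
    # vectors of the word and its neighbors within a fixed window.
    tokens = tokenize(sentence)
    i = tokens.index(word)
    neighbors = tokens[max(0, i - window): i + window + 1]
    return [
        sum(static_vectors[t][d] for t in neighbors) / len(neighbors)
        for d in range(DIM)
    ]


same_static = static_vector("film", s1) == static_vector("film", s2)
same_contextual = toy_contextual_vector("film", s1) == toy_contextual_vector("film", s2)
print("static vectors identical:", same_static)
print("contextual vectors identical:", same_contextual)
```

The static lookup returns the same vector for "film" in both sentences, while the toy contextual function returns different vectors because the neighboring words differ. Models such as BERT learn how context should shape each vector rather than averaging over a fixed window.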
This approach has resulted in great improvements in NLP technology, which we will want to discuss in depth. For that reason, we will postpone...