Until now, we have used a bag-of-words vector, optionally with some weighting scheme such as tf-idf, to represent the text in a document. Another class of models that has recently become popular represents individual words as vectors.
These models are generally based on the co-occurrence statistics of words in a corpus. Once the vector representations are computed, we can use them in much the same way as tf-idf vectors (for example, as features for other machine learning models). One common use case is computing the similarity between two words with respect to their meanings, based on their vector representations, as in the sketch below.
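As a rough illustration, word similarity is often measured with the cosine of the angle between two vectors. The helper below is a minimal sketch, assuming the word vectors are available as plain arrays of doubles of equal length; the `vectorFor` lookup in the usage comment is hypothetical.

```scala
// A minimal sketch of cosine similarity between two word vectors,
// assuming both are Arrays of Doubles of the same dimension.
def cosineSimilarity(a: Array[Double], b: Array[Double]): Double = {
  require(a.length == b.length, "vectors must have the same dimension")
  val dot = a.zip(b).map { case (x, y) => x * y }.sum
  val normA = math.sqrt(a.map(x => x * x).sum)
  val normB = math.sqrt(b.map(x => x * x).sum)
  dot / (normA * normB)
}

// Hypothetical usage: compare two word vectors looked up from a trained model.
// val sim = cosineSimilarity(vectorFor("hockey"), vectorFor("soccer"))
```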
Word2Vec refers to a specific implementation of one such model; the resulting vectors are often called distributed vector representations. The MLlib implementation uses a skip-gram model, which seeks to learn vector representations that take into account the contexts in which words appear.
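To make this concrete, the following is a minimal sketch of training a Word2Vec model with the RDD-based MLlib API. The corpus path, vector size, seed, and query word are placeholder values chosen for illustration, not taken from the original text.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.Word2Vec

// A minimal sketch, assuming a local Spark setup and a plain-text corpus
// with one document per line at the (hypothetical) path below.
val conf = new SparkConf().setMaster("local[2]").setAppName("word2vec-sketch")
val sc = new SparkContext(conf)
val corpusPath = "/path/to/corpus.txt" // placeholder path

// Tokenize each line into a sequence of lowercased words.
val tokenized = sc.textFile(corpusPath).map(_.toLowerCase.split("""\W+""").toSeq)

// Train the skip-gram Word2Vec model with an illustrative vector size and seed.
val word2vec = new Word2Vec()
word2vec.setVectorSize(100)
word2vec.setSeed(42)
val model = word2vec.fit(tokenized)

// Query the model for the words whose vectors are most similar to "hockey".
model.findSynonyms("hockey", 10).foreach { case (word, score) =>
  println(s"$word: $score")
}
```

Once trained, the model's `findSynonyms` method ranks words by the similarity of their vectors to the query word's vector, which is the same kind of meaning-based similarity discussed above.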