There is no point in running Word2Vec on an iOS device: the app needs only the vectors it generates, and we don't want to load large text corpora onto a mobile phone or train the model there. Instead, we will learn the vector representation ahead of time with gensim, a Python NLP package that is popular for topic modeling and contains a fast Word2Vec implementation with a nice API. Then, we will do some preprocessing (remove everything except nouns) and plug the resulting database into our iOS application:
In [39]: import gensim

In [40]: def trim_rule(word, count, min_count):
             if word not in words_to_keep or word in stop_words:
                 return gensim.utils.RULE_DISCARD
             else:
                 return gensim.utils.RULE_DEFAULT

In [41]: model = gensim.models.Word2Vec(sentences_to_train_on...
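To see what the trim rule above does to a vocabulary, here is a minimal sketch that applies it to a toy word list. The contents of words_to_keep and stop_words, the counts, and the helper survives are hypothetical and exist only for this illustration; survives is our approximation of how gensim combines a trim rule with min_count, not gensim's own code. The fallback constants are stand-ins so that the sketch also runs where gensim is not installed.

```python
try:
    from gensim.utils import RULE_DEFAULT, RULE_DISCARD
except ImportError:
    # Stand-in values so the sketch runs without gensim installed.
    RULE_DEFAULT, RULE_DISCARD = 0, 1

# Hypothetical data: in the chapter these sets come from earlier preprocessing.
words_to_keep = {"cat", "dog", "thing"}
stop_words = {"the", "thing"}


def trim_rule(word, count, min_count):
    """Discard stop words and anything outside the whitelist."""
    if word not in words_to_keep or word in stop_words:
        return RULE_DISCARD
    return RULE_DEFAULT


def survives(word, count, min_count):
    """Approximate the vocabulary decision: a discarded word is dropped,
    and RULE_DEFAULT falls back to the usual min_count threshold."""
    if trim_rule(word, count, min_count) == RULE_DISCARD:
        return False
    return count >= min_count


# Toy word frequencies (made up for illustration).
vocab_counts = {"cat": 12, "the": 90, "run": 7, "dog": 2, "thing": 30}
kept = [w for w, c in vocab_counts.items() if survives(w, c, min_count=5)]
print(kept)
```

Note that returning RULE_DEFAULT does not guarantee that a word is kept: "dog" is on the whitelist but is still dropped here because it occurs fewer than min_count times.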