An introductory guide to spaCy
spaCy is a library for advanced natural language processing (NLP). It runs fast and comes with a range of useful tools and pretrained models that make NLP easier and more reliable. If you are working in a Kaggle kernel, you won't need to install spaCy, as it comes preinstalled together with the models.
To use spaCy locally, you will need to install the library and download its pretrained models separately.
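To install the library and download the English model, we need to run the following two commands:
$ pip install -U spacy
$ python -m spacy download en
The en shortcut is accepted by spaCy 2.x. In spaCy 3.0 and later, shortcuts were removed and the model has to be requested by its full name, for example en_core_web_sm.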
Note: This chapter makes use of the English language models, but more are available. Most features are available in English, German, Spanish, Portuguese, French, Italian, and Dutch. Entity recognition is available for many more languages through the multi-language model.
The core of spaCy is made up of the Doc and Vocab classes. A Doc instance contains one document, including its text, tokenized version, and recognized entities. The Vocab class, meanwhile...
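To make the Doc description concrete, here is a minimal sketch of how a Doc can be inspected. It assumes a blank English pipeline created with spacy.blank("en"), which is not what the chapter installs: it needs no model download, but it only tokenizes, so no entities are recognized. The sample sentence is made up for illustration.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only, no download needed).
# With a pretrained model loaded via spacy.load(...), doc.ents would be populated.
nlp = spacy.blank("en")

text = "spaCy makes NLP easier."  # illustrative sample sentence
doc = nlp(text)  # calling the pipeline on a string returns a Doc

# The Doc keeps the original text and its tokenized version.
tokens = [token.text for token in doc]
print(doc.text)
print(tokens)

# Recognized entities live in doc.ents; this is empty for a blank pipeline.
print(doc.ents)

# Every Doc holds a reference to the pipeline's Vocab object.
print(doc.vocab is nlp.vocab)
```

Swapping spacy.blank("en") for a loaded pretrained model should leave the rest of the sketch unchanged, with doc.ents then holding whatever entities the model finds.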