Having discussed some of the basics of text analysis, let's dive head first into the first Python package we'll be learning to use: spaCy [1].
spaCy describes itself as "Industrial-Strength Natural Language Processing", and it certainly does its best to live up to this promise. It focuses on getting things done rather than taking a more academic approach: spaCy ships with only one part-of-speech tagging algorithm and only one named-entity recognizer (per language). This also means that the package is not bloated with unnecessary features.
We just mentioned a more academic approach; what does this mean? A large number of the open-source packages in natural language processing and machine learning are created or maintained by researchers and others working in academia. While these packages do end up working, the aim of the projects is not...