Performing named entity recognition using spaCy
Named entity recognition (NER) is the task of identifying and extracting the names of places, people, organizations, and so on from text. This is useful in many downstream tasks. For example, when researching a particular person, you might want to sort a set of articles by the people mentioned in them.
In this recipe, we will use NER to parse out named entities from article texts in the BBC dataset. We will load the package and the parsing engine and loop through the NER results.
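As a rough preview of that loop, the following is a minimal sketch rather than the recipe's actual code. The helper names load_pipeline and extract_entities are hypothetical, and the sketch assumes that spaCy is installed and that the en_core_web_sm model has been downloaded before load_pipeline is called:

```python
import spacy


def load_pipeline(model_name="en_core_web_sm"):
    # Hypothetical helper: loads a spaCy pipeline by package name.
    # Assumes the model has already been downloaded.
    return spacy.load(model_name)


def extract_entities(nlp, text):
    # Run the pipeline on the text and collect each named entity
    # together with its label (for example, PERSON, ORG, or GPE).
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]
```

With the small model installed, calling extract_entities(load_pipeline(), article_text) on an article string returns a list of (entity text, label) pairs that you can loop over. The labels that come back depend on the model you load.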
Getting ready
In this recipe, we will use spaCy. To run it correctly, you will need to download a language model. We will download the small and large models. These models take up significant disk space:
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_lg
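spaCy models are installed as regular Python packages, so one way to confirm that both downloads succeeded is to check whether the packages can be found. The following is a small sketch using only the standard library. The function name installed_models is hypothetical and is not part of spaCy:

```python
from importlib.util import find_spec

# Package names of the two models downloaded above.
MODEL_NAMES = ("en_core_web_sm", "en_core_web_lg")


def installed_models(names=MODEL_NAMES):
    # Hypothetical helper: maps each package name to True if Python
    # can locate it in the current environment, and False otherwise.
    return {name: find_spec(name) is not None for name in names}


for name, found in installed_models().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

This only checks that the packages are importable in the current environment. It does not verify that a model is compatible with the installed spaCy version.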
The notebook is located at https://github.com/PacktPublishing/Python-Natural-Language-Processing-Cookbook-Second...