Training your own NER model with spaCy
In the previous recipe, we used a pretrained spaCy model to extract named entities. That model is sufficient in many cases, but sometimes we need to create a new one from scratch. In this recipe, we will train a new NER model to extract the names of musicians and their works of art.
Getting ready
We will use the spaCy package to train a new NER model; no other packages are needed. The data we are going to use comes from https://github.com/deezer/music-ner-eacl2023. A copy of the data file is available in the data folder of the book's GitHub repository (https://github.com/PacktPublishing/Python-Natural-Language-Processing-Cookbook-Second-Edition/blob/main/data/music_ner.csv); download it from there into your local data directory.
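Before starting, it can help to confirm that the setup described above is in place. The following is a minimal sketch, not part of the recipe itself: it checks that the spacy package can be found and previews the first few rows of the data file. The helper names spacy_installed and preview_rows are hypothetical, and the relative path data/music_ner.csv assumes the code runs from the directory that contains the data folder. No column layout is assumed; rows are printed exactly as the csv module parses them.

```python
import csv
import importlib.util
from itertools import islice
from pathlib import Path

# Assumed location, relative to the directory the code is run from.
DATA_PATH = Path("data") / "music_ner.csv"


def spacy_installed() -> bool:
    """Return True if the spacy package can be found, without importing it."""
    return importlib.util.find_spec("spacy") is not None


def preview_rows(path, n=3):
    """Return up to the first n rows of a CSV file as lists of strings."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; download it into the data directory first"
        )
    with path.open(newline="", encoding="utf-8") as handle:
        # islice stops after n rows, so large files are not read in full.
        return list(islice(csv.reader(handle), n))


if __name__ == "__main__":
    print("spaCy installed:", spacy_installed())
    if DATA_PATH.is_file():
        for row in preview_rows(DATA_PATH):
            print(row)
    else:
        print(f"Missing {DATA_PATH}")
```

If the script reports that the file is missing, download music_ner.csv from the book's repository into the data directory and run it again.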
The notebook is located at https://github.com/PacktPublishing/Python-Natural-Language-Processing-Cookbook-Second-Edition/blob/main/Chapter05/5.5_training_own_spacy_model...