Combining spaCy models and matchers
In this section, we'll work through recipes for the kinds of entity extraction you're likely to encounter in your NLP career. Each recipe is drawn from a real-world use case and is ready to use. Let's start with number-formatted entities.
Extracting IBAN and account numbers
IBANs and account numbers are two important entity types that occur frequently in finance and banking. We'll learn how to extract them.
An IBAN is an international format for bank account numbers. It consists of a two-letter country code followed by numbers. Here are some IBANs from different countries:
How can we create a pattern for an IBAN? In all cases, an IBAN starts with two capital letters followed by two digits. Any number of digits can then follow. We can express the country code and the next two digits as follows...
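As a minimal sketch of this idea (not necessarily the exact pattern used in the rest of this section), the two capital letters and two digits can be matched with a token shape of `XXdd`, and the remaining digit groups with one or more all-digit tokens. The sketch assumes the IBAN is written in space-separated groups, so that spaCy's tokenizer yields one token per group, and it follows the simplification above that only digits come after the first group. The `find_ibans` helper, the blank English pipeline, and the sample sentence are illustrative assumptions.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline is enough here: the pattern relies only on
# tokenization and lexical attributes, not on a trained model.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

iban_pattern = [
    {"SHAPE": "XXdd"},              # two capital letters + two digits, e.g. "DE89"
    {"IS_DIGIT": True, "OP": "+"},  # one or more all-digit groups
]

# greedy="LONGEST" keeps only the longest match instead of every
# shorter prefix that the "+" operator would otherwise also return.
matcher.add("IBAN", [iban_pattern], greedy="LONGEST")


def find_ibans(text):
    """Return the text of every IBAN-like span found in `text` (hypothetical helper)."""
    doc = nlp(text)
    return [doc[start:end].text for _, start, end in matcher(doc)]


# Illustrative sample sentence, assuming a space-separated IBAN.
sample = "Please wire the amount to DE89 3704 0044 0532 0130 00 by Friday."
print(find_ibans(sample))
```

Real IBANs from some countries also contain letters after the first four characters, and they are often written without spaces, so a production pattern would likely need to relax the all-digit assumption and handle the unspaced form as well.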