The subtraction of character classes
Suppose we have to match characters that belong to one class but not to another in a composite character class pattern. There is no separate operator for the subtraction operation. Subtraction is performed by using the intersection operator, &&, and a negated inner character class.
Note
A regular expression is usually more readable if we write the larger set in front and the one we want to subtract from it after the && operator.
For example, consider the following composite character class:
[0-9&&[^3-6]]
It will match the digits 0 to 9, except the digits 3 to 6. This character class can also be written as a union of two character classes:
[[0-2][7-9]]
We can also just use a simple character class, as follows:
[0-27-9]
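The equivalence of these three forms can be checked with a short Java sketch. The class name DigitSubtractionDemo and the helper matchedDigits are hypothetical names introduced only for this illustration; the sketch assumes nothing beyond the standard java.util.regex package.

```java
import java.util.regex.Pattern;

class DigitSubtractionDemo {
    // Three ways of writing "the digits 0 to 9, except 3 to 6"
    static final Pattern SUBTRACTION = Pattern.compile("[0-9&&[^3-6]]");
    static final Pattern UNION = Pattern.compile("[[0-2][7-9]]");
    static final Pattern SIMPLE = Pattern.compile("[0-27-9]");

    // Hypothetical helper: collects every digit from 0 to 9 that the pattern matches
    static String matchedDigits(Pattern pattern) {
        StringBuilder matched = new StringBuilder();
        for (char c = '0'; c <= '9'; c++) {
            if (pattern.matcher(String.valueOf(c)).matches()) {
                matched.append(c);
            }
        }
        return matched.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchedDigits(SUBTRACTION)); // prints 012789
        System.out.println(matchedDigits(UNION));       // prints 012789
        System.out.println(matchedDigits(SIMPLE));      // prints 012789
    }
}
```

All three patterns accept the same six digits, so the choice between them is mostly a matter of readability.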
To match all the uppercase English consonants, we can subtract the five vowels from the uppercase letters, as in the following regex:
[A-Z&&[^AEIOU]]
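As a minimal sketch, the consonant class above can be applied to a string in Java as follows. The class name ConsonantDemo, the helper consonantsIn, and the sample input are assumptions made for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsonantDemo {
    // Uppercase letters minus the five uppercase vowels
    static final Pattern UPPER_CONSONANT = Pattern.compile("[A-Z&&[^AEIOU]]");

    // Hypothetical helper: returns the uppercase consonants found in the input, in order
    static String consonantsIn(String input) {
        StringBuilder found = new StringBuilder();
        Matcher matcher = UPPER_CONSONANT.matcher(input);
        while (matcher.find()) {
            found.append(matcher.group());
        }
        return found.toString();
    }

    public static void main(String[] args) {
        // Lowercase letters, vowels, and the space are all skipped
        System.out.println(consonantsIn("JAVA Regex")); // prints JVR
    }
}
```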
We can also reverse the order of the two sets...