Regular expressions
Regular expressions are the most widely used notation for describing patterns of symbols within files. They are built from a few simple rules that are easy to understand. The set of symbols over which a regular expression is written is called the alphabet. For simplicity, in this book, the alphabet for reading source code will be the values 0-255 that can be held in one byte.
Given a set of input symbols, regular expressions are patterns that describe sets of strings using the members of that symbol set and a few regular expression operators. Because they are a notation for sets, terms such as member, union, and intersection apply when talking about the sets of strings that regular expressions can match. In this section, we will look at the rules for building regular expressions, followed by examples.
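To make the set view concrete, here is a minimal sketch using Python's standard re module, chosen only for illustration and not necessarily the tooling this book uses. The helper language() and the candidate strings are hypothetical names introduced for this example. The patterns are byte strings, so every symbol is a value in the range 0-255, and the | operator in the third pattern denotes the union of the two sets described by the first two.

```python
import re

def language(pattern: bytes, candidates: list[bytes]) -> set[bytes]:
    """Hypothetical helper: the members of `candidates` that `pattern` matches in full."""
    compiled = re.compile(pattern)
    return {s for s in candidates if compiled.fullmatch(s)}

# A small, finite pool of strings to test for membership (an assumption for the demo).
candidates = [b"", b"a", b"b", b"ab", b"ba", b"abc"]

# Iterating over a bytes object yields integers, so this checks the alphabet directly.
assert all(0 <= symbol <= 255 for s in candidates for symbol in s)

starts_with_a = language(b"a.*", candidates)   # strings beginning with 'a'
ends_with_b = language(b".*b", candidates)     # strings ending with 'b'
either = language(b"a.*|.*b", candidates)      # alternation of the two patterns

# Alternation matches exactly the union of the two sets.
assert either == starts_with_a | ends_with_b

# Set terminology applies to the results: b"ab" is a member of both sets.
print(sorted(starts_with_a & ends_with_b))
```

Restricting the check to a finite list of candidates is a simplification: the sets a regular expression describes are often infinite, so membership is tested string by string instead of by enumerating the whole set.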
Regular expression rules
This book will show only those operators that are needed for examples. This will be a practical superset...