Regular expressions
Regular expressions are the most widely used notation for describing patterns of symbols within files. They are built from a few simple rules that are easy to understand. The set of symbols over which a set of regular expressions is written is called the alphabet. For simplicity, in this book, the values 0-255 that can be held in one byte will be our alphabet for reading source code. Over a given set of input symbols, regular expressions are patterns that describe sets of strings using the members of the input symbol set and a few regular expression operators. Since they are a notation for sets, terms such as member, union, or intersection apply when talking about the sets of strings that regular expressions can match. We will look at the rules for building regular expressions in this section, followed by examples.
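To make the "notation for sets" idea concrete, here is a minimal sketch in Python. It is not from the book: it assumes Python's standard re module as the regular expression engine, uses a tiny two-symbol alphabet instead of the 0-255 byte alphabet, and caps string length so that the sets stay finite. The helper name language is hypothetical, chosen only for this illustration.

```python
import re
from itertools import product


def language(pattern, alphabet, max_len):
    """Collect every string over `alphabet`, up to `max_len` symbols long,
    that `pattern` matches in full. (Hypothetical helper for illustration.)"""
    regex = re.compile(pattern)
    matched = set()
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            candidate = "".join(symbols)
            # fullmatch succeeds only if the whole string fits the pattern
            if regex.fullmatch(candidate):
                matched.add(candidate)
    return matched


# Two small languages over the alphabet {a, b}, limited to 3 symbols.
a_then_bs = language(r"ab*", "ab", 3)      # an 'a' followed by zero or more 'b'
ends_in_b = language(r"[ab]*b", "ab", 3)   # any string that ends in 'b'

# Membership: is a given string in the set a pattern describes?
print("abb" in a_then_bs)

# Intersection: strings that both patterns match.
print(sorted(a_then_bs & ends_in_b))

# Union: strings matched by either pattern, which is the same set that
# the alternation operator | describes.
print((a_then_bs | ends_in_b) == language(r"ab*|[ab]*b", "ab", 3))
```

Enumerating strings this way is only practical for very small alphabets and short lengths; it is meant to show that a pattern stands for a set of strings, not to suggest how a scanner would actually be implemented.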
Regular expression rules
Over the years many different tools have used regular expressions, featuring many non-standard extensions to the notation. This...