Using UFlex and JFlex
Writing a scanner by hand is an interesting task for a programmer who wants to know exactly how everything works, but it will slow down the development of your language and make it more difficult to maintain the code afterward.
Good news, everyone! A family of tools descended from UNIX, known as lex, takes regular expressions and generates a scanner function for you. Lex-compatible tools are available for most popular programming languages. For C/C++, the most widely used lex-compatible tool is Flex, hosted at https://github.com/westes/flex/. For Unicon, we use UFlex, while for Java, you can use JFlex. These tools may have various custom extensions, but to the extent that they are compatible with UNIX lex, we can present them together as one language for writing scanners. This book's examples have been crafted carefully so that we can even use the same lex input for both the Unicon and Java implementations!
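To make the idea of "regular expressions in, scanner function out" concrete, here is a minimal hand-written Java sketch of the kind of matching loop that a lex-style tool automates. It is not output from JFlex or UFlex, and it does not use their APIs. The class name TinyScanner, the token category names, and the four patterns are hypothetical, chosen only for illustration. The sketch assumes the usual lex convention of preferring the longest match and, on a tie, the rule listed first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TinyScanner {
    // Hypothetical rules: a category name paired with a regular expression.
    private static final String[][] RULES = {
        {"WS",     "[ \\t\\r\\n]+"},
        {"NUMBER", "[0-9]+"},
        {"IDENT",  "[A-Za-z_][A-Za-z0-9_]*"},
        {"OP",     "[-+*/=]"},
    };

    // Returns the tokens found in input as "CATEGORY:text" strings.
    public static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            // Try every rule at the current position and keep the longest match.
            // The strict ">" means an earlier rule wins when two matches tie.
            for (String[] rule : RULES) {
                Matcher m = Pattern.compile(rule[1]).matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule[0];
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException(
                    "unexpected character at offset " + pos);
            }
            // Whitespace is matched but not reported as a token.
            if (!bestName.equals("WS")) {
                tokens.add(bestName + ":" + input.substring(pos, bestEnd));
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [IDENT:x, OP:=, NUMBER:42, OP:+, IDENT:y1]
        System.out.println(scan("x = 42 + y1"));
    }
}
```

Even this small loop has to handle rule ordering, longest-match selection, and error reporting by hand, and it recompiles each pattern on every step. A lex-compatible tool generates an efficient equivalent from the list of regular expressions alone.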
The input files for lex are often called (lex...