lex/flex Tutorial
warning: work in progress.
Structure of a Lex file
Lex files are divided into three sections, separated by lines that contain only two percent signs, as follows:
    Definition section
    %%
    Rules section
    %%
    C code section
- The definition section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
- The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
- The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file linked in at compile time.
Example of a Lex file
The following is an example Lex file for the flex version of Lex. It recognizes strings of digits (non-negative integers) in the input and prints each one out.
/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

/* This tells flex to read only one input file */
%option noyywrap

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}
If this input is given to flex, it will be converted into a C file, “lex.yy.c”. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:
abc123z.!&*2gj6
the program will print:
Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
Using Lex with other programming tools
Using Lex with parser generators
Lex and parser generators, such as Yacc or Bison, are commonly used together. Parser generators use a formal grammar to parse an input stream, something which Lex cannot do using simple regular expressions.
It is typically preferable to have a (Yacc-generated, say) parser be fed a token-stream as input, rather than having it consume the input character-stream directly. Lex is often used to produce such a token-stream.
Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.
Lex and make
make is a utility that can be used to maintain programs involving Lex. It assumes that a file with a .l extension is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options that make passes to Lex automatically.[6]
The above is from or based on Lex (software).
Lex Manual and Tutorial
In Emacs, press Ctrl+h i to open the Info reader and browse the full, detailed manual, which also serves as a tutorial.