Lexical filters

ANTLR has a lexical filter mode that lets you sift through an input file looking for certain grammatical structures. When an input construct matches more than one rule, the rules are prioritized in the order they are specified, with the first rule having the highest priority. The filter proceeds character by character, looking for a match among the rules. If no rule matches at the current position, the lexer consumes that character and tries again at the next one. The following example prints "found var foo" for every field foo in the input:

lexer grammar FuzzyJava;
options {filter=true;}

FIELD
    :   TYPE WS name=ID '[]'? WS? (';'|'=')
        {System.out.println("found var "+$name.text);}
    ;

fragment
TYPE :   ID ('.' ID)*
        ;

fragment
ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
    ;

WS  :   (' '|'\t'|'\n')+
    ;

Text inside comments can look like a field declaration, so don't forget to add another rule that matches each comment as a whole and keeps FIELD from firing inside it:

COMMENT
    :   '/*' (options {greedy=false;} : . )* '*/'
        {System.out.println("found comment "+getText());}
    ;
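
To make the matching strategy concrete, here is a minimal plain-Java sketch of the loop described above: try the rules in priority order at the current position, and skip one character if none of them matches. It is not the code ANTLR generates and does not use the ANTLR runtime. The class name FuzzyFilterSketch, the scan method, and the regular expressions standing in for FIELD, COMMENT, and WS are assumptions made for illustration, and regex matching only approximates the lexer's behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of filter mode; NOT generated by ANTLR.
class FuzzyFilterSketch {
    // Regex stand-ins for the grammar's fragment rules ID and TYPE, and for WS.
    private static final String ID = "[a-zA-Z_][a-zA-Z_0-9]*";
    private static final String TYPE = ID + "(?:\\." + ID + ")*";
    private static final String WS = "[ \\t\\n]+";

    // FIELD : TYPE WS name=ID '[]'? WS? (';'|'=')   (group 1 captures the name)
    private static final Pattern FIELD = Pattern.compile(
        TYPE + WS + "(" + ID + ")(?:\\[\\])?(?:" + WS + ")?[;=]");
    // COMMENT : '/*' (non-greedy . )* '*/'
    private static final Pattern COMMENT =
        Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);
    private static final Pattern WHITESPACE = Pattern.compile(WS);

    // Scans the input and returns the messages the grammar's actions would print.
    static List<String> scan(String input) {
        List<String> found = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Matcher m;
            // Rules are tried in priority order; the first match wins.
            if ((m = matchAt(FIELD, input, pos)) != null) {
                found.add("found var " + m.group(1));
                pos = m.end();
            } else if ((m = matchAt(COMMENT, input, pos)) != null) {
                found.add("found comment " + m.group());
                pos = m.end();
            } else if ((m = matchAt(WHITESPACE, input, pos)) != null) {
                pos = m.end(); // WS has no action; just consume it
            } else {
                pos++; // no rule matched here: skip one character and retry
            }
        }
        return found;
    }

    // Returns a matcher positioned on a match that starts exactly at pos, or null.
    private static Matcher matchAt(Pattern p, String input, int pos) {
        Matcher m = p.matcher(input);
        m.region(pos, input.length());
        return m.lookingAt() ? m : null;
    }

    public static void main(String[] args) {
        String src = "/* int hidden; */\nint count = 0;\nString names[];";
        scan(src).forEach(System.out::println);
    }
}
```

Because the comment pattern consumes the whole comment in one step, the field-like text "int hidden;" inside it is never offered to the FIELD pattern, which is the reason for adding the COMMENT rule.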
