Ok, Kay Roepke is in town and we've been discussing faster expression parsing, among other things. Look for another entry on default StringTemplate generation for parsing and tree parsing.
ANTLR v3.2 will allow special rules for specifying expressions; these rules are particularly efficient in both speed and space. A special rule defines unary suffix operators, binary operators, or ternary operators by virtue of how its alternatives recurse. Operator precedence is specified by the order in which the alternatives appear in the special rule. Left associativity is assumed; right associativity is specified with a token-level option, TOKEN<associativity=right>. The alternatives that do not specify operators, or that specify unary operators, are grouped into a new rule. A special rule is rewritten into one that does not have left recursion. Semantic predicates are used to recurse more or less depending on the precedence of the next operator (the first symbol of lookahead). Here is what the special rule is rewritten into:
    e : parse_expr[0] ; // rewrite e to this rule, which invokes the precedence engine

    /** "Precedence climbing" engine */
    parse_expr[int p]
        : (primary->primary)
          ( {prec[input.LA(1)]>=p}?=> bop r=e[nextp(p)] -> ^(bop $e $r)
          | {postprec[input.LA(1)]>=p}?=> suffix[$e.tree] -> {$suffix.tree}
          )*
        ;
where rule suffix is described below (basically, we remove the e references from the left edge).
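To make the loop in parse_expr concrete, here is a minimal Python sketch of precedence climbing. It is not what ANTLR generates: the PREC and POSTPREC tables, their numeric levels, the tuple-shaped trees, and the Parser class are all assumptions made for illustration, and prefix operators and casts are left out to keep it short. The two `if` tests in parse_expr play the role of the gated semantic predicates, and the choice of `q` corresponds to nextp(p).

```python
import re

# Hypothetical precedence tables (assumed values); a higher number binds tighter.
PREC = {"=": 1, "+": 2, "-": 2, "*": 3}        # binary operators
RIGHT_ASSOC = {"="}                            # everything else is left associative
POSTPREC = {"[": 6, "(": 6, ".": 7, "++": 8}   # suffix operators

TOKEN_RE = re.compile(r"\s*(\+\+|\d+|\w+|.)")


class Parser:
    def __init__(self, text):
        self.toks = TOKEN_RE.findall(text.strip())
        self.pos = 0

    def la(self):
        """Return the next token without consuming it (None at end of input)."""
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def match(self, expected=None):
        tok = self.la()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        self.pos += 1
        return tok

    def primary(self):
        if self.la() == "(":
            self.match("(")
            tree = self.parse_expr(0)
            self.match(")")
            return tree
        tok = self.match()
        if not re.fullmatch(r"\w+", tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok                              # INT or ID

    def suffix(self, left):
        """Match a suffix operator; the left operand has already been parsed."""
        op = self.match()
        if op == "++":
            return ("post++", left)
        if op == ".":
            return (".", left, self.match())
        if op == "[":
            index = self.parse_expr(0)
            self.match("]")
            return ("index", left, index)
        args = [self.parse_expr(0)]             # op == "(": one or more arguments
        while self.la() == ",":
            self.match(",")
            args.append(self.parse_expr(0))
        self.match(")")
        return ("call", left, args)

    def parse_expr(self, p):
        left = self.primary()
        while True:
            tok = self.la()
            if tok in PREC and PREC[tok] >= p:
                op = self.match()
                # Right-associative operators recurse at the same level,
                # left-associative ones at the next higher level.
                q = PREC[op] if op in RIGHT_ASSOC else PREC[op] + 1
                left = (op, left, self.parse_expr(q))
            elif tok in POSTPREC and POSTPREC[tok] >= p:
                left = self.suffix(left)
            else:
                return left


def parse(text):
    parser = Parser(text)
    tree = parser.parse_expr(0)
    if parser.la() is not None:
        raise SyntaxError(f"unexpected token {parser.la()!r}")
    return tree
```

For example, `parse("1 - 2 - 3")` groups to the left, while `parse("a = b = c")` groups to the right because '=' recurses at its own precedence level.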
If you specify a rule with option parser=precedence, ANTLR does the new magical thing:
    e
    options {parser=precedence;}
        : primary                        // highest precedence
        | e '++'
        | e '.' ID                       // higher than array/method call
        | suffix
        | ('++'|'--') e
        | '(' type ')' e                 // cast
        | e '*' e
        | e ('+'|'-') e
        | e '='<associativity=right> e   // lowest precedence
        ;

    suffix
        : e '[' e ']'
        | e '(' e (',' e)* ')'
        ;

    primary
        : '(' e ')'
        | INT
        | ID
        ;
Within the special rule, ANTLR looks for binary operators and suffix operators. It also looks one level deep into rules referenced from the special rule. In this case, ANTLR identifies '*' '+' '-' '=' as binary operators and '.' '[' '(' as suffix operators. All other alternatives are lumped together into a new rule:
    parse_expr_alts
        : primary
        | '(' type ')' e // cast
        ;
The suffix rule has left-recursive references to e, so those must be removed. The parse_expr rule invokes suffix after having already matched e:
    suffix
        : '[' e ']'
        | '(' e (',' e)* ')'
        ;