ANTLR v3.3

November 29, 2010

Terence Parr
ANTLR project lead and supreme dictator for life
University of San Francisco
Credits

ANTLR v3.3 is primarily a bug-fix release, but it also adds two new targets.

New targets!

Ruby

Kyle Yetter has built a new and complete Ruby target for 3.3. More info.
Ruby target examples.

Objective-C

Alan Condit has built an Objective-C target for 3.3 following on Kay Roepke's work.

Important things to note

Removed the conversion timeout failsafe, which is no longer needed. In the unlikely event that ANTLR spins forever, kill it and run again with -Xwatchconversion. You'll see output like:
```
building lookahead DFA (d=39) for ()+ loopback of 109:9: ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+
```
If it gets stuck on a decision, look at that decision in the grammar (line 109 here) and constrain it, for example by setting k=1 and turning on backtracking.

Stats and profiling are now correct for -report and -profile, but the output format is very different. Sample -report output:

Java.compilationUnit:181:32 decision 1: k=1
Java.compilationUnit:181:51 decision 2: k=1
Java.compilationUnit:182:41 decision 3: k=1
Java.compilationUnit:181:9 decision 4: k=1
Java.compilationUnit:184:9 decision 5: k=1
Java.compilationUnit:184:29 decision 6: k=1
Java.compilationUnit:184:48 decision 7: k=1
Java.compilationUnit:180:5 decision 8: k=1 backtracks
...

And -profile:

ANTLR Runtime Report; Profile Version 3
parser name JavaParser
Number of rule invocations 1798
Number of unique rules visited 69
Number of decision events 2196
Overall average k per decision event 1.1684881
Number of backtracking occurrences (can be multiple per decision) 78
Overall average k per decision event that backtracks 4.9871793
Number of rule invocations while backtracking 713
num decisions that potentially backtrack 12
num decisions that do backtrack 10
num decisions that potentially backtrack but don't 2
average % of time a potentially backtracking decision backtracks 60.373898
num unique decisions covered 82
max rule invocation nesting depth 97
rule memoization cache size 728
number of rule memoization cache hits 15
number of rule memoization cache misses 713
number of tokens 329
number of hidden tokens 150
number of char 0
number of hidden char 0
number of syntax errors 0

location	n	avgk	maxk	synpred	sempred	canbacktrack
8@Java.g:179:1(compilationUnit)	1	1.00	1	0	0	1
5@Java.g:184:9(compilationUnit)	1	1.00	1	0	0	0
6@Java.g:184:29(compilationUnit)	3	1.00	1	0	0	0
9@Java.g:192:18(importDeclaration)	2	1.00	1	0	0	0
86@Java.g:504:20(qualifiedName)	7	1.71	2	0	0	0
10@Java.g:192:42(importDeclaration)	2	1.00	1	0	0	0
7@Java.g:184:48(compilationUnit)	2	1.00	1	0	0	0
11@Java.g:195:1(typeDeclaration)	1	1.00	1	0	0	0
13@Java.g:205:9(classOrInterfaceModifiers)	1	1.00	1	0	0
...

Profiling no longer writes profile data to a file; it emits decision data to stderr.
Both ANTLR itself and the runtime use Java generics; this shouldn't be a problem because we generate Java 1.4-compatible binaries.
The DebugEventListener interface changed [BREAKS BACKWARD COMPATIBILITY]

Improvements

Added source name to syntax error messages.
Added a new method to buffered token streams to get a subset of tokens:
public List get(int start, int stop);
References to other tokens in a lexer rule didn't get their line/charpos right. Altered Java.stg.
Instead of sharing Token.EOF_TOKEN, I'm now creating EOF tokens so I can set the char position for better error messages.
Added new buffered on-demand streams: BufferedTokenStream. Renamed CommonTokenStream to LegacyCommonTokenStream and made the new CommonTokenStream a subclass of BufferedTokenStream.
Added org.antlr.runtime.UnbufferedTokenStream. Was trivial and works!
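
As a rough illustration of the new get(int start, int stop) method on buffered token streams, the sketch below uses a hypothetical ToyBufferedStream over plain strings rather than the real runtime classes. It assumes that both bounds are inclusive, that stop is clamped to the last buffered token, and that negative bounds yield null; check the runtime source before relying on any of these.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a buffered token stream; not the ANTLR runtime class. */
class ToyBufferedStream {
    private final List<String> tokens;

    ToyBufferedStream(List<String> tokens) {
        this.tokens = new ArrayList<String>(tokens);
    }

    /** Return the tokens at indexes start..stop (assumed inclusive). */
    List<String> get(int start, int stop) {
        if (start < 0 || stop < 0) return null;              // assumed handling of invalid bounds
        if (stop >= tokens.size()) stop = tokens.size() - 1; // assumed clamping to the last token
        List<String> subset = new ArrayList<String>();
        for (int i = start; i <= stop; i++) {
            subset.add(tokens.get(i));
        }
        return subset;
    }
}

class GetSubsetDemo {
    public static void main(String[] args) {
        ToyBufferedStream s =
            new ToyBufferedStream(Arrays.asList("import", "java", ".", "util", ";"));
        System.out.println(s.get(1, 3)); // the tokens at indexes 1, 2 and 3
    }
}
```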

Added range() to TokenStream and its implementors:

    /** How far ahead has the stream been asked to look?  The return
     *  value is a valid index from 0..n-1.
     */
    int range();
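
One way to picture range() is as a high-water mark over lookahead requests. The sketch below is a simplified model, not the runtime code: ToyOnDemandStream, its string tokens, and the initial value of -1 (meaning nothing has been examined yet) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical on-demand stream that records how far lookahead has reached. */
class ToyOnDemandStream {
    private final Iterator<String> source;
    private final List<String> buffer = new ArrayList<String>();
    private int p = 0;       // index of the current token
    private int range = -1;  // highest index examined so far; -1 is an assumed "nothing yet" value

    ToyOnDemandStream(Iterator<String> source) {
        this.source = source;
    }

    /** Look k tokens ahead (k >= 1), pulling from the source only as needed. */
    String LT(int k) {
        int i = p + k - 1;
        while (buffer.size() <= i && source.hasNext()) {
            buffer.add(source.next());
        }
        if (i >= buffer.size()) return null; // past the end of the input
        if (i > range) range = i;            // remember the deepest valid index seen
        return buffer.get(i);
    }

    void consume() { p++; }

    int range() { return range; }
}

class RangeDemo {
    public static void main(String[] args) {
        ToyOnDemandStream s =
            new ToyOnDemandStream(Arrays.asList("a", "b", "c", "d").iterator());
        s.LT(1);
        s.LT(3);      // reaches index 2
        s.consume();
        s.LT(1);      // index 1, which is not deeper than before
        System.out.println(s.range()); // prints 2
    }
}
```

Tracking the deepest index rather than the current position is what lets a tool report how much lookahead a decision actually needed.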

Added MachineProbe class to make it easier to highlight ambiguous paths in the grammar. More accurate than DecisionProbe; retrofitted from v4.
lexerStringRef was missing the elementIndex attribute; i='import' didn't work in the lexer. Altered all target stg files. Set in codegen.g.
Added -Xsavelexer option
greedy=true option shuts off nondeterminism warning.
Tried to make output more deterministic:
- Added toArray to OrderedHashSet so that addAll calls see the same order for DFA edges, and possibly for code generation in some areas.
- Made OrderedHashSet iterate deterministically.
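
The idea behind deterministic iteration can be sketched with a set that remembers insertion order. InsertionOrderedSet below is a hypothetical class written for this note, not ANTLR's OrderedHashSet; it only shows why iterating and calling toArray in insertion order gives the same result on every run, whereas a plain hash set makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical set that iterates in insertion order. */
class InsertionOrderedSet<T> implements Iterable<T> {
    private final Set<T> members = new HashSet<T>();  // fast membership test
    private final List<T> order = new ArrayList<T>(); // remembers insertion order

    /** Add value if absent; return true when the set changed. */
    boolean add(T value) {
        if (!members.add(value)) return false;
        order.add(value);
        return true;
    }

    boolean contains(T value) { return members.contains(value); }

    int size() { return order.size(); }

    /** Read-only iteration in insertion order. */
    public Iterator<T> iterator() {
        return Collections.unmodifiableList(order).iterator();
    }

    /** Elements in insertion order, so bulk copies see a stable order. */
    Object[] toArray() { return order.toArray(); }
}

class OrderedSetDemo {
    public static void main(String[] args) {
        InsertionOrderedSet<String> edges = new InsertionOrderedSet<String>();
        edges.add("z");
        edges.add("a");
        edges.add("m");
        edges.add("a"); // duplicate, ignored
        System.out.println(Arrays.toString(edges.toArray())); // prints [z, a, m]
    }
}
```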

Changes / Fixes

Generated code for AST construction combined with -profile didn't compile. It had a useless line:
```
             proxy.setTreeAdaptor(adap);
```
Missing -trace in help msg
Added boolean decisionCanBacktrack to Parser and to enterDecision in the debug interface. Breaks the ANTLRWorks (AW) interface and other tools! [BREAKS BACKWARD COMPATIBILITY]

Java Target

output=AST, rewrite=true for tree rewriters was broken: nextNode for subtree streams didn't dup the node; it gave the whole tree back.
Creating a token from another token didn't copy the input stream in CommonToken. It makes sense to copy it too; I don't think anybody relies on it being null after a copy, and we might want to know where a token came from.
TreeParser.getMissingSymbol() used CommonTree instead of using adaptor.create()
Fixed bug in TreeVisitor when rewrites altered number of children. Thanks to Chris DiGiano.
Couldn't properly reuse parser state because the constructor reset it. Fixed in:
Parser(TokenStream input, RecognizerSharedState state)
LookaheadStream<T> used some hardcoded Object return types for LT, etc.; it uses T now.
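
The parser-state fix above can be illustrated with a toy model. ToySharedState and ToyRecognizer are hypothetical stand-ins for RecognizerSharedState and a parser, not the runtime classes; the sketch only shows the intended contract, namely that a constructor taking a state object adopts it as is instead of resetting it.

```java
/** Hypothetical stand-in for shared recognizer state. */
class ToySharedState {
    int syntaxErrors = 0;
}

/** Hypothetical recognizer whose constructor keeps the caller's state intact. */
class ToyRecognizer {
    protected final ToySharedState state;

    ToyRecognizer() {
        this(new ToySharedState());
    }

    ToyRecognizer(ToySharedState state) {
        // Adopt the supplied state without resetting it; create one only if none was given.
        this.state = (state != null) ? state : new ToySharedState();
    }

    void reportError() { state.syntaxErrors++; }

    int errorCount() { return state.syntaxErrors; }
}

class StateReuseDemo {
    public static void main(String[] args) {
        ToySharedState shared = new ToySharedState();
        ToyRecognizer first = new ToyRecognizer(shared);
        first.reportError();
        ToyRecognizer second = new ToyRecognizer(shared); // must not wipe the count
        System.out.println(second.errorCount()); // prints 1
    }
}
```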
