ANTLR 3.3 Release Notes

ANTLR v3.3

November 29, 2010

Terence Parr
ANTLR project lead and supreme dictator for life
University of San Francisco
Credits

ANTLR v3.3 is primarily a bug-fix release, but it also adds two new targets.

New targets!

Ruby

Kyle Yetter has built a new and complete Ruby target for 3.3. More info.
Ruby target examples.

Objective-C

Alan Condit has built an Objective-C target for 3.3, building on Kay Roepke's work.

Important things to note

  • Removed the conversion timeout failsafe, which is no longer really needed. In the unlikely event that ANTLR spins forever, kill it and run it again with -Xwatchconversion. You'll see output like:
    building lookahead DFA (d=39) for ()+ loopback of 109:9: ( ' ' | '\t' | ( '\n' | '\r\n' | '\r' ) )+
    
    If it gets stuck, look at the decision (line 109 here) and constrain it, for example by adding k=1 and backtracking.
  • Stats and profiling output is now correct for -report and -profile, but its format is very different. Sample -report output:
    Java.compilationUnit:181:32 decision 1: k=1
    Java.compilationUnit:181:51 decision 2: k=1
    Java.compilationUnit:182:41 decision 3: k=1
    Java.compilationUnit:181:9 decision 4: k=1
    Java.compilationUnit:184:9 decision 5: k=1
    Java.compilationUnit:184:29 decision 6: k=1
    Java.compilationUnit:184:48 decision 7: k=1
    Java.compilationUnit:180:5 decision 8: k=1 backtracks
    ...
    
    And -profile:
    ANTLR Runtime Report; Profile Version 3
    parser name JavaParser
    Number of rule invocations 1798
    Number of unique rules visited 69
    Number of decision events 2196
    Overall average k per decision event 1.1684881
    Number of backtracking occurrences (can be multiple per decision) 78
    Overall average k per decision event that backtracks 4.9871793
    Number of rule invocations while backtracking 713
    num decisions that potentially backtrack 12
    num decisions that do backtrack 10
    num decisions that potentially backtrack but don't 2
    average % of time a potentially backtracking decision backtracks 60.373898
    num unique decisions covered 82
    max rule invocation nesting depth 97
    rule memoization cache size 728
    number of rule memoization cache hits 15
    number of rule memoization cache misses 713
    number of tokens 329
    number of hidden tokens 150
    number of char 0
    number of hidden char 0
    number of syntax errors 0
    
    location	n	avgk	maxk	synpred	sempred	canbacktrack
    8@Java.g:179:1(compilationUnit)	1	1.00	1	0	0	1
    5@Java.g:184:9(compilationUnit)	1	1.00	1	0	0	0
    6@Java.g:184:29(compilationUnit)	3	1.00	1	0	0	0
    9@Java.g:192:18(importDeclaration)	2	1.00	1	0	0	0
    86@Java.g:504:20(qualifiedName)	7	1.71	2	0	0	0
    10@Java.g:192:42(importDeclaration)	2	1.00	1	0	0	0
    7@Java.g:184:48(compilationUnit)	2	1.00	1	0	0	0
    11@Java.g:195:1(typeDeclaration)	1	1.00	1	0	0	0
    13@Java.g:205:9(classOrInterfaceModifiers)	1	1.00	1	0	0
    ...
    
  • ANTLR no longer writes profile data to a file; it emits decision data to stderr instead.
  • ANTLR itself and the runtime use Java generics; this shouldn't be a problem because we generate Java 1.4-compatible binaries.
  • The DebugEventListener interface changed [BREAKS BACKWARD COMPATIBILITY]
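
If you want to post-process the new -report output, the sample lines above are regular enough to pick apart with a regular expression. Here is a minimal Java sketch, assuming each line has the shape grammar.rule:line:column decision N: k=K with an optional trailing "backtracks". That reading of the fields is inferred from the sample only, and ReportLine and ReportParser are hypothetical names, not part of ANTLR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One parsed -report line. Field names are this sketch's own, not ANTLR's. */
final class ReportLine {
    final String grammar;
    final String rule;
    final int line;
    final int column;
    final int decision;
    final int k;
    final boolean backtracks;

    ReportLine(String grammar, String rule, int line, int column,
               int decision, int k, boolean backtracks) {
        this.grammar = grammar;
        this.rule = rule;
        this.line = line;
        this.column = column;
        this.decision = decision;
        this.k = k;
        this.backtracks = backtracks;
    }
}

final class ReportParser {
    // Assumed shape: "Java.compilationUnit:180:5 decision 8: k=1 backtracks"
    private static final Pattern LINE = Pattern.compile(
        "(\\w+)\\.(\\w+):(\\d+):(\\d+) decision (\\d+): k=(\\d+)( backtracks)?");

    /** Returns the parsed line, or null if the text does not have the assumed shape. */
    static ReportLine parse(String text) {
        Matcher m = LINE.matcher(text.trim());
        if (!m.matches()) {
            return null;
        }
        return new ReportLine(m.group(1), m.group(2),
                Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)), Integer.parseInt(m.group(6)),
                m.group(7) != null);
    }

    /** Keeps only the decisions that the report flags as backtracking. */
    static List<ReportLine> backtracking(List<String> lines) {
        List<ReportLine> result = new ArrayList<ReportLine>();
        for (String text : lines) {
            ReportLine parsed = parse(text);
            if (parsed != null && parsed.backtracks) {
                result.add(parsed);
            }
        }
        return result;
    }
}
```

Lines that don't fit the assumed shape (such as the "..." in the sample) come back as null instead of raising an error, so the sketch can be pointed at a whole captured report.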

Improvements

  • Added source name to syntax error msgs
  • Added a new method to buffered token streams that returns a subset of the tokens:
    public List get(int start, int stop);
  • References to other tokens in a lexer rule didn't get their line/charpos right. Altered Java.stg.
  • Instead of sharing Token.EOF_TOKEN, I'm now creating EOF tokens so I can set the char position for better error messages.
  • Added a new buffered on-demand stream, BufferedTokenStream. Renamed the old CommonTokenStream to LegacyCommonTokenStream and made the new CommonTokenStream a subclass of BufferedTokenStream.
  • Added org.antlr.runtime.UnbufferedTokenStream. Was trivial and works!
  • Added range() to TokenStream and its implementors:
        /** How far ahead has the stream been asked to look?  The return
         *  value is a valid index from 0..n-1.
         */
        int range();
    
  • Added MachineProbe class to make it easier to highlight ambiguous paths in
    a grammar. More accurate than DecisionProbe; retrofitted from v4.
  • lexerStringRef was missing the elementIndex attribute, so i='import' didn't work
    in a lexer. Altered all target stg files; the attribute is set in codegen.g.
  • Added -Xsavelexer option
  • greedy=true option shuts off nondeterminism warning.
  • Tried to make output more deterministic:
    • added toArray in OrderedHashSet to make addAll calls get same order for DFA edges and possibly code gen in some areas.
    • Made OrderedHashSet have deterministic iteration
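
The on-demand buffering, get(int start, int stop), and range() items above are easier to picture with a toy model. The Java sketch below is not ANTLR's BufferedTokenStream. OnDemandBuffer is a hypothetical class that assumes get is inclusive on both ends and that range() reports the highest index any lookahead has reached (-1 before the first lookahead, which is this sketch's own convention).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of an on-demand buffered stream; NOT ANTLR's implementation. */
final class OnDemandBuffer<T> {
    private final Iterator<T> source;                 // elements are pulled lazily from here
    private final List<T> buffer = new ArrayList<T>();
    private int p = 0;                                // index of the current element
    private int range = -1;                           // highest index a lookahead has reached

    OnDemandBuffer(Iterator<T> source) {
        this.source = source;
    }

    /** Pull from the source until index i is buffered or the source runs dry. */
    private void sync(int i) {
        while (buffer.size() <= i && source.hasNext()) {
            buffer.add(source.next());
        }
    }

    /** Look ahead k elements (k >= 1); returns null past the end of the source. */
    T LT(int k) {
        int i = p + k - 1;
        sync(i);
        if (i >= buffer.size()) {
            return null;
        }
        if (i > range) {
            range = i;
        }
        return buffer.get(i);
    }

    /** Advance past the current element, if there is one. */
    void consume() {
        sync(p);
        if (p < buffer.size()) {
            p++;
        }
    }

    /** Elements start..stop (assumed inclusive), clamped to what the source supplies. */
    List<T> get(int start, int stop) {
        if (start < 0 || stop < start) {
            return new ArrayList<T>();
        }
        sync(stop);
        int end = Math.min(stop, buffer.size() - 1);
        if (start > end) {
            return new ArrayList<T>();
        }
        return new ArrayList<T>(buffer.subList(start, end + 1));
    }

    /** How far ahead has the stream been asked to look? */
    int range() {
        return range;
    }

    /** Number of elements pulled from the source so far. */
    int buffered() {
        return buffer.size();
    }
}
```

In this sketch get pulls from the source as needed and leaves range() alone; whether the real runtime behaves the same way is not stated in these notes, so check the runtime source before relying on either detail.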

Changes / Fixes

  • Code generated for AST output with -profile didn't compile; it contained a useless line:
                 proxy.setTreeAdaptor(adap);
    
  • Missing -trace in help msg
  • Added boolean decisionCanBacktrack to Parser and to enterDecision in the debug interface. Breaks the ANTLRWorks (AW)
    interface and other tools! [BREAKS BACKWARD COMPATIBILITY]
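
To see why the extra boolean on enterDecision breaks existing listeners, and what it buys you, here is a rough Java sketch. DecisionEvents is a hypothetical two-method stand-in for the decision callbacks of the debug interface (the real interface has many more methods), and the parameter name couldBacktrack is an assumption. Any class written against a one-argument enterDecision would stop compiling against an interface shaped like this.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the decision callbacks of a debug listener. */
interface DecisionEvents {
    void enterDecision(int decisionNumber, boolean couldBacktrack);
    void exitDecision(int decisionNumber);
}

/** Tallies decision events, using the flag to spot decisions that may backtrack. */
final class BacktrackTally implements DecisionEvents {
    private final Set<Integer> seen = new TreeSet<Integer>();
    private final Set<Integer> mayBacktrack = new TreeSet<Integer>();
    private int events = 0;

    public void enterDecision(int decisionNumber, boolean couldBacktrack) {
        events++;
        seen.add(decisionNumber);
        if (couldBacktrack) {
            mayBacktrack.add(decisionNumber);
        }
    }

    public void exitDecision(int decisionNumber) {
        // Nothing to record on exit in this sketch.
    }

    int decisionEvents() {
        return events;
    }

    int uniqueDecisions() {
        return seen.size();
    }

    int decisionsThatMayBacktrack() {
        return mayBacktrack.size();
    }
}
```

A tally like this is roughly the kind of bookkeeping that could sit behind counts such as "num decisions that potentially backtrack", though the notes don't say how the profiler actually computes them.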

Java Target

  • output=AST, rewrite=true was broken for tree rewriters: nextNode for subtree
    streams didn't dup the node; it gave the whole tree back.
  • Creating a token from another token didn't copy the input stream in CommonToken. It makes sense to copy it too; I don't think anybody relies on it being null after a copy, and we might want to know where a token came from.
  • TreeParser.getMissingSymbol() used CommonTree instead of using adaptor.create()
  • Fixed bug in TreeVisitor when rewrites altered number of children. Thanks to Chris DiGiano.
  • Couldn't properly reuse parser state; ctor reset the state; fixed.
    Parser(TokenStream input, RecognizerSharedState state)
  • LookaheadStream<T> used hardcoded Object return types for LT and similar methods;
    it uses T now.
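
The parser-state fix above comes down to one rule: a constructor that is handed existing state must adopt it, not reset it. A small Java sketch of that rule follows. SharedState and ToyRecognizer are hypothetical stand-ins, not the runtime's RecognizerSharedState or Parser, and the single counter field is only there to make the sharing visible.

```java
/** Toy stand-in for state shared between recognizers; the field is illustrative. */
final class SharedState {
    int syntaxErrors = 0;
}

class ToyRecognizer {
    protected final SharedState state;

    /** Fresh recognizer: creates its own state. */
    ToyRecognizer() {
        this(new SharedState());
    }

    /** Reusing constructor: adopts the caller's state as-is and does not reset it. */
    ToyRecognizer(SharedState state) {
        this.state = (state != null) ? state : new SharedState();
    }

    void reportError() {
        state.syntaxErrors++;
    }

    int errorCount() {
        return state.syntaxErrors;
    }
}
```

Two recognizers built from the same SharedState see each other's error counts, while one built with the no-argument constructor starts from zero. A constructor that cleared the incoming state would silently discard the first recognizer's count.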

From bug tracking system