Page Comparison

Update Oct 2012: resolved with correct definition of when to terminate prediction lookahead. Turns out we don't need to push this extra stack element.

Over the Christmas holidays, I've been busy building example grammars for ANTLR v4. The thing I noticed immediately is that grammars just work. There are no error messages from ANTLR when generating code and all we can get are true ambiguity errors at runtime. E.g., if you can recognize T(i) as both a function call and a constructor call. The lexers work extremely well and I'm a happy guy. Nothing scares ANTLR v4, "it's pretty bad-ass" like the The Crazy Nastyass Honey Badger.

...

Imagine we have R expression a(i)<-x that we want to parse starting at rule prog in the following grammar.

Code Block
prog: expr_or_assign* ; expr_or_assign : expr '++' \| expr ; expr : expr_primary ('<-' ID)? ; expr_primary : '(' ID ')' \| ID '(' ID ')' \| ID ;

...

After much experimentation and creation of software to trace prediction through the augmented transition network (ATN) representation of grammar, I discovered the source of the problem. To highlight the issue, all we have to do is use the recursive equivalent of prog:

Code Block
prog: expr_or_assign prog \| ;

Or, for simplicity, the following manual repetition of expr_or_assign exhibits the same behavior for our input "a(i)<-x".

Code Block
prog: expr_or_assign expr_or_assign ;

...

The problem is that ANTLR destroys context information when it converts the recursive looping rule into the equivalent EBNF expr_or_assign* loop. Imagine returning from rule expr_or_assign in this version of prog:

Code Block
prog: expr_or_assign expr_or_assign ;

...

Versions Compared

Old Version 20

New Version 21

Key