Left-Recursion Removal
Introduction
Writing grammars, one will encounter left-recursive rules. As ANTLR is a recursive-descent parser, it cannot cope with them and thus they have to be removed. But what is a left-recursive rule exactly? It is a recursive rule which calls itself on the left-edge of a global alternative. An example would be:
a : a b | b ;
ANTLRworks can resolve this problem and turns rule a into
a : b+;
But ANTLRworks can do this automatically only for the above kind of left-recursive rules, which exhibit the left recursion within the implementation of the recursive rule itself. I would call these single left-recursive (SLR) rules. The other kind are mutually left-recursive (MLR) rules where the left-recursion occurs over several rule invocations. The simplest example would be the following:
a : b | c ; b : a | c ;
Rule a references rule b, while rule b references rules a. A more complex example would show such a 're-reference' taking place after many subsequent rule invocations. As stated above, the resolvement has to be done manually here.
As the strategy to solve this problem isn't that obvious, I decided to write a tutorial about two real grammar cases I have encountered while working on a C# grammar. It should be noted that this tutorial describes what I wished to have happened, as I visited many more dead-ends than noted. Furthermore, the strategy to resolve MLR rules is based on the description by Gavin Lambert.