Some language translation problems can be described with a few rewrite rules that are predicated upon their context and, potentially, an arbitrary Boolean expression. For example, consider a few identity transformations such as:
expr: // in context of expr, match left side, transformed to right side
    "x + 0" -> "x"
    "0 + x" -> "x"
You do not have to specify these rewrite rules inside a grammar, assuming you have a parser grammar that can parse the input (and has rule expr). Such concrete syntax notation is very easy for humans to read. The only problem comes when you have perhaps 100 of these rules: their interactions can lead to bugs that are nearly impossible to track down. Nonetheless, a number of academic systems use a series of rewrite rules that exist separately from the syntax grammar (e.g., Meta-Env (ASF+SDF) and Stratego). At least from the journal papers, they seem very powerful. Something must be preventing the average developer from using them, however.
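As a rough illustration of what the two identity rules above do once the input has been parsed, here is a minimal Python sketch. The tuple encoding of expressions and the name simplify are my own assumptions for the example; none of the systems mentioned here work exactly this way.

```python
# Hypothetical encoding: ("+", left, right) is an addition node; leaves are
# variable names (str) or integer literals (int).

def simplify(node):
    """Apply "x + 0" -> "x" and "0 + x" -> "x" bottom-up over a tree."""
    if not isinstance(node, tuple):
        return node  # a leaf: nothing to rewrite
    op, left, right = node
    # Simplify the children first so that nested identities collapse too.
    left, right = simplify(left), simplify(right)
    if op == "+" and right == 0:
        return left   # x + 0 -> x
    if op == "+" and left == 0:
        return right  # 0 + x -> x
    return (op, left, right)


print(simplify(("+", ("+", 0, "y"), 0)))  # prints y
```

With two rules this is trivial to reason about. The trouble described above starts when there are a hundred such cases and the order in which they fire begins to matter.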
Sometime I think I will spend a couple of weeks and try to build one myself, ANTLR-style. The first thing I realize is that concrete syntax, while beautiful, is not always powerful enough to do what we want (I suppose it could be an option). For example, what if you want to match a series of identifiers and transform them to something else? You need some kind of repetition operator inside the strings of concrete syntax, but why invent new syntax? We already have grammars to specify languages. Grammars both recognize and generate language, as you see in ANTLR's parser grammar to tree grammar fragment rewrite rules. So, I propose something more along the lines of grammar-to-grammar transformations. Elements on the left are available as streams of data to the template on the right-hand side:
expr: type ID (',' ID)* -> (type ID ';')+
In this case, whatever was matched for the type rule would be replicated for each identifier matched on the input stream. Given the input tokens int i,j the output token stream would be int i; int j; and this token stream could then be repeatedly processed by the rules until nothing had changed, signifying the end of translation. Naturally, one could design rewrite rules that flip back and forth between two patterns, causing an infinite loop. This is a well-known problem in rewrite systems. I'm satisfied to simply call that a bug in your rewrite rules, though I'm sure I could come up with something nice to let you know precisely which rules were the problem.
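To make the "process until nothing changes" idea concrete, here is a small Python sketch under some loud assumptions: tokens are plain strings, the declaration rule is hand-coded rather than generated from a grammar, and the names split_declaration and rewrite_until_stable are hypothetical. The driver also remembers every stream it has seen, so flip-flopping rules are reported instead of looping forever.

```python
def split_declaration(tokens, types=("int", "float")):
    """Hand-coded stand-in for: type ID (',' ID)* -> (type ID ';')+
    Returns the rewritten token list, or the input unchanged if no match."""
    if len(tokens) < 2 or tokens[0] not in types:
        return tokens
    type_, rest = tokens[0], tokens[1:]
    ids, commas = rest[0::2], rest[1::2]
    if len(ids) != len(commas) + 1:          # must end on an ID
        return tokens
    if any(c != "," for c in commas) or not all(i.isidentifier() for i in ids):
        return tokens
    out = []
    for name in ids:                         # replicate the type per identifier
        out += [type_, name, ";"]
    return out


def rewrite_until_stable(tokens, rules, max_passes=1000):
    """Apply the first rule that changes the stream, and repeat until none does."""
    seen = {tuple(tokens)}
    for _ in range(max_passes):
        for rule in rules:
            rewritten = rule(tokens)
            if rewritten != tokens:
                break
        else:
            return tokens                    # no rule fired: translation is done
        if tuple(rewritten) in seen:
            # We are back at an earlier stream, so the rules are cycling.
            raise RuntimeError(f"rewrite cycle; last rule applied: {rule.__name__}")
        seen.add(tuple(rewritten))
        tokens = rewritten
    raise RuntimeError("no fixed point reached within max_passes")


print(rewrite_until_stable("int i , j".split(), [split_declaration]))
# prints ['int', 'i', ';', 'int', 'j', ';']
```

Reporting only the last rule applied is the crudest possible diagnostic. A friendlier tool would presumably record the whole sequence of rules that led back to a previously seen stream.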
You could also go to text instead of a token stream. In this case, the output grammar would actually be a StringTemplate (an "unparser"). For example, the following transformation would take the token stream matched on the left-hand side and push all of the elements into the attributes of the template on the right:
expr: type ID (',' ID)* -> "<type> <ID; separator=\", \">"
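The right-hand side above can be imitated in a few lines of Python to show what "pushing elements into attributes" means. This is not StringTemplate's API; the render function is a hypothetical stand-in that understands only the two expression forms used here, <name> and <name; separator="...">.

```python
import re

# Matches <name> or <name; separator="...">, the only forms this sketch supports.
_EXPR = re.compile(r'<(\w+)(?:;\s*separator="([^"]*)")?>')

def render(template, attrs):
    """Fill a template string from a dict of attributes."""
    def expand(match):
        value = attrs[match.group(1)]
        separator = match.group(2) or ""
        if isinstance(value, (list, tuple)):
            # Multi-valued attribute: join the elements with the separator.
            return separator.join(value)
        return str(value)
    return _EXPR.sub(expand, template)


# Attributes as the left-hand side might collect them from the tokens int i,j
attrs = {"type": "int", "ID": ["i", "j"]}
print(render('<type> <ID; separator=", ">', attrs))  # prints int i, j
```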