Antlr3PerlTarget
Perl ANTLR v3 Target
Author
Ron Blaschke, ron at rblasch.org
Status
Early prototyping phase. Simple lexer and parser are working.
Progress
Here's a simple example. Note that everything is still subject to change.
$ cat T.g
lexer grammar T;
options { language = Perl5; }
ZERO: '0';
ONE: '1';
$ cat T.tokens
Tokens=6
ZERO=4
ONE=5
$ cat t.pl
#!/usr/bin/perl
use ANTLR::Runtime::ANTLRStringStream;
use TLexer;
use strict;
use warnings;
my $input = ANTLR::Runtime::ANTLRStringStream->new('010');
my $lexer = TLexer->new($input);
while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $TLexer::EOF;
    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
$ perl t.pl
type: 4
text: 0

type: 5
text: 1

type: 4
text: 0
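The numeric types in this output correspond to the entries in T.tokens. As a minimal sketch in plain Perl (no ANTLR runtime involved), the tokens file could be turned into a lookup table so that a type number can be shown with its token name. The helper name parse_tokens is made up for illustration, and the NAME=NUMBER line format is assumed from the listing above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of ANTLR::Runtime): turn the text of a
# .tokens file, assumed to hold one NAME=NUMBER entry per line, into a
# hash reference mapping type number => token name.
sub parse_tokens {
    my ($text) = @_;
    my %name_for;
    for my $line (split /\n/, $text) {
        # Skip anything that does not look like NAME=NUMBER.
        next unless $line =~ /^\s*(\S+?)\s*=\s*(\d+)\s*$/;
        $name_for{$2} = $1;
    }
    return \%name_for;
}

# Contents of T.tokens as listed above.
my $name_for = parse_tokens("Tokens=6\nZERO=4\nONE=5\n");

# Token types as printed by t.pl for the input '010'.
for my $type (4, 5, 4) {
    print "type: $type ($name_for->{$type})\n";
}
```

For the three tokens of '010' this labels the types as ZERO, ONE and ZERO.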
2007-06-13
+ Escaped characters, like '\n', are now handled properly.
+ Added error handling.
lexer grammar T2;
options { language = Perl5; }
ID : ('a'..'z'|'A'..'Z')+ ;
INT : '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS : (' '|'\t')+ ;
INT=5
WS=7
Tokens=8
ID=4
NEWLINE=6
#!/usr/bin/perl
use ANTLR::Runtime::ANTLRStringStream;
use T2Lexer;
use strict;
use warnings;
my $input = ANTLR::Runtime::ANTLRStringStream->new("Hello World!\n42\n");
my $lexer = T2Lexer->new($input);
while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $T2Lexer::EOF;
    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
type: 4
text: Hello

type: 7
text:  

type: 4
text: World

line 1:12 no viable alternative at character '!'
type: 6
text: 

type: 5
text: 42

type: 6
text: 
Note the "no viable alternative" error message for the unrecognized '!'.
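The behaviour can be pictured as "report the offending character, skip it, and keep lexing". The following is only a sketch of that idea in plain Perl and is not the code the Perl target generates: the tokenize helper is hypothetical, the token types are taken from the T2 tokens listing (ID=4, INT=5, NEWLINE=6, WS=7), and columns are counted from 0 here by assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Token rules of T2, tried in order at the current position.
my @rules = (
    [ 4, qr/[a-zA-Z]+/ ],   # ID
    [ 5, qr/[0-9]+/    ],   # INT
    [ 6, qr/\r?\n/     ],   # NEWLINE
    [ 7, qr/[ \t]+/    ],   # WS
);

# Hypothetical helper, not part of ANTLR::Runtime. Returns two array
# references: the recognized tokens and the collected error messages.
sub tokenize {
    my ($input) = @_;
    my (@tokens, @errors);
    my ($line, $line_start) = (1, 0);
    pos($input) = 0;
    CHAR: while (pos($input) < length $input) {
        my $start = pos($input);
        for my $rule (@rules) {
            my ($type, $re) = @$rule;
            # \G anchors the match at pos(); /c keeps pos() on failure.
            if ($input =~ /\G($re)/gc) {
                push @tokens, { type => $type, text => $1 };
                if ($type == 6) {
                    $line++;
                    $line_start = pos($input);
                }
                next CHAR;
            }
        }
        # No rule matched: record an error, skip one character, carry on.
        my $char = substr($input, $start, 1);
        my $col  = $start - $line_start;
        push @errors,
            "line $line:$col no viable alternative at character '$char'";
        pos($input) = $start + 1;
    }
    return (\@tokens, \@errors);
}

my ($tokens, $errors) = tokenize("Hello World!\n42\n");
print "$_\n" for @$errors;
for my $token (@$tokens) {
    (my $text = $token->{text}) =~ s/\n/\\n/g;   # make newlines visible
    print "type: $token->{type} text: '$text'\n";
}
```

The point of the sketch is that the unrecognized '!' costs a single error message and the tokens after it are still produced.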
2007-06-15
+ Handle lexer actions
Here's another short example, similar to the one above. Note how whitespace is put on the hidden channel (99) and newlines are skipped.
lexer grammar T2;
options { language = Perl5; }
ID : ('a'..'z'|'A'..'Z')+ ;
INT : '0'..'9'+ ;
NEWLINE:'\r'? '\n' { $self->skip(); } ;
WS : (' '|'\t')+ { $channel = HIDDEN; } ;
$ perl t.pl
text: Hello
type: 4
pos: 1:0
channel: 0
token index: -1

text:  
type: 7
pos: 1:5
channel: 99
token index: -1

text: World
type: 4
pos: 1:6
channel: 0
token index: -1

line 1:11 no viable alternative at character '!'
text: 42
type: 5
pos: 2:0
channel: 0
token index: -1
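The difference between the two actions shows in this run: skipped newlines never appear as tokens, while whitespace is still emitted but carries channel 99, so a consumer can ignore it. A minimal sketch of such a consumer in plain Perl follows. The token hashes and the on_channel helper are made up for illustration and are not the runtime's token objects; the channel numbers are the ones printed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant DEFAULT_CHANNEL => 0;    # channel of ordinary tokens in the run above
use constant HIDDEN_CHANNEL  => 99;   # channel assigned by the WS rule

# Tokens as emitted in the run above, reduced to plain hashes.
# NEWLINE tokens are absent because the NEWLINE rule skips them.
my @emitted = (
    { text => 'Hello', type => 4, channel => DEFAULT_CHANNEL },
    { text => ' ',     type => 7, channel => HIDDEN_CHANNEL  },
    { text => 'World', type => 4, channel => DEFAULT_CHANNEL },
    { text => '42',    type => 5, channel => DEFAULT_CHANNEL },
);

# Hypothetical helper: keep only the tokens on the given channel.
sub on_channel {
    my ($channel, @tokens) = @_;
    return grep { $_->{channel} == $channel } @tokens;
}

# A parser-like consumer would look at the default channel only.
my @visible = on_channel(DEFAULT_CHANNEL, @emitted);
print join(' ', map { $_->{text} } @visible), "\n";

# The hidden tokens are still available, for example to reproduce
# the original spacing.
my @hidden = on_channel(HIDDEN_CHANNEL, @emitted);
print scalar(@hidden), " hidden token(s)\n";
```

Keeping whitespace on a separate channel, rather than skipping it, leaves the original text recoverable from the token list.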
2007-06-26
+ Simple Parser is working
Quick, what is 2 + 2? If you can't remember, here's an easy way to find out. First, we need a grammar.
grammar MExpr;
options {
    language = Perl5;
}
prog: stat+ ;
stat: expr NEWLINE { print "$expr.value\n"; }
    | NEWLINE
    ;
expr returns [value]
    : e=atom { $value = $e.value; }
      ( '+' e=atom { $value += $e.value; }
      | '-' e=atom { $value -= $e.value; }
      )*
    ;
atom returns [value]
    : INT { $value = $INT.text; }
    | '(' expr ')' { $value = $expr.value; }
    ;
ID : ('a'..'z'|'A'..'Z')+ ;
INT : '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS : (' '|'\t')+ { $self->skip(); } ;
And here's the test program.
#!/usr/bin/perl
use strict;
use warnings;
use ANTLR::Runtime::ANTLRStringStream;
use ANTLR::Runtime::CommonTokenStream;
use MExprLexer;
use MExprParser;
while (<>) {
    my $input = ANTLR::Runtime::ANTLRStringStream->new($_);
    my $lexer = MExprLexer->new($input);
    my $tokens = ANTLR::Runtime::CommonTokenStream->new({ token_source => $lexer });
    my $parser = MExprParser->new($tokens);
    $parser->prog();
}
Finally, we get to the answer.
$ perl t.pl
2 + 2
4
2007-08-08
+ Simple expression grammar
The grammar
grammar Expr;
options {
    language = Perl5;
}
@header {
}
@members {
    my %memory;
}
prog: stat+ ;
stat: expr NEWLINE { print "$expr.value\n"; }
    | ID '=' expr NEWLINE
      { $memory{$ID.text} = $expr.value; }
    | NEWLINE
    ;
expr returns [value]
    : e=multExpr { $value = $e.value; }
      ( '+' e=multExpr { $value += $e.value; }
      | '-' e=multExpr { $value -= $e.value; }
      )*
    ;
multExpr returns [value]
    : e=atom { $value = $e.value; } ('*' e=atom { $value *= $e.value; })*
    ;
atom returns [value]
    : INT { $value = $INT.text; }
    | ID
      {
          my $v = $memory{$ID.text};
          if (defined $v) {
              $value = $v;
          } else {
              print STDERR "undefined variable $ID.text\n";
          }
      }
    | '(' expr ')' { $value = $expr.value; }
    ;
ID : ('a'..'z'|'A'..'Z')+ ;
INT : '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS : (' '|'\t')+ { $self->skip(); } ;
Test program
#!/usr/bin/perl
use strict;
use warnings;
use blib '../..';
use ANTLR::Runtime::ANTLRStringStream;
use ANTLR::Runtime::CommonTokenStream;
use ExprLexer;
use ExprParser;
my $in;
{
    local $/;    # slurp mode, restored when the block ends
    $in = <>;
}
my $input = ANTLR::Runtime::ANTLRStringStream->new($in);
my $lexer = ExprLexer->new($input);
my $tokens = ANTLR::Runtime::CommonTokenStream->new({ token_source => $lexer });
my $parser = ExprParser->new($tokens);
$parser->prog();
Test run
$ perl t.pl
x=1
y=2
3*(x+y)
^Z
9
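To see why this run prints 9, it can help to read the grammar as a set of mutually recursive subroutines, one per rule. The following hand-written sketch in plain Perl mirrors the rules of Expr. It is not the code ANTLR generates, all subroutine names are made up for illustration, and returning 0 for an undefined variable is an assumption of the sketch (the grammar action only prints a message).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %memory;   # variable storage, like %memory in @members
my @tokens;   # remaining tokens of the current input
my @output;   # values the stat rule would print

# Names, numbers, newlines and single-character operators become tokens;
# blanks and tabs are dropped, like the WS rule does.
sub tokenize {
    my ($text) = @_;
    return $text =~ /[a-zA-Z]+|[0-9]+|\r?\n|[^ \t\r\n]/g;
}

sub peek { return @tokens ? $tokens[0] : '' }
sub take { return shift @tokens }

# atom : INT | ID | '(' expr ')'
sub atom {
    my $t = take();
    die "unexpected end of input\n" unless defined $t;
    return $t if $t =~ /^[0-9]+\z/;
    if ($t =~ /^[a-zA-Z]+\z/) {
        return $memory{$t} if defined $memory{$t};
        print STDERR "undefined variable $t\n";
        return 0;    # assumption of this sketch
    }
    die "unexpected token '$t'\n" unless $t eq '(';
    my $value = expr();
    my $close = take();
    die "expected ')'\n" unless defined $close && $close eq ')';
    return $value;
}

# multExpr : atom ('*' atom)*
sub mult_expr {
    my $value = atom();
    while (peek() eq '*') {
        take();
        $value *= atom();
    }
    return $value;
}

# expr : multExpr (('+' | '-') multExpr)*
sub expr {
    my $value = mult_expr();
    while (peek() =~ /^[-+]\z/) {
        my $op  = take();
        my $rhs = mult_expr();
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

# stat : expr NEWLINE | ID '=' expr NEWLINE | NEWLINE
sub stat_rule {
    if (peek() =~ /\n\z/) { take(); return; }    # empty line
    if (@tokens > 1 && $tokens[0] =~ /^[a-zA-Z]+\z/ && $tokens[1] eq '=') {
        my $id = take();
        take();                                  # consume '='
        $memory{$id} = expr();
    } else {
        push @output, expr();
    }
    my $nl = take();
    die "expected newline\n" unless defined $nl && $nl =~ /\n\z/;
}

# prog : stat+
sub prog {
    my ($text) = @_;
    @tokens = tokenize($text);
    @output = ();
    stat_rule() while @tokens;
    return @output;
}

print "$_\n" for prog("x=1\ny=2\n3*(x+y)\n");
```

Because mult_expr is called from expr, '*' binds tighter than '+' and '-', and the parenthesized x+y is evaluated to 3 before it is multiplied by 3.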
2008-02-23
Started real porting effort. The goal is to port one ANTLR runtime class at a time from Java to Perl, including full API coverage and documentation. First stop of the porting train: ANTLR::Runtime::BitSet.
2008-11-18
Got the first parser working: SimpleCalc, taken from the Five minute introduction to ANTLR 3.
