Child pages
  • Antlr3PerlTarget
Skip to end of metadata
Go to start of metadata

Perl ANTLR v3 Target

Author

Ron Blaschke, ron at rblasch.org

Status

Early prototyping phase.  Simple lexer and parser are working.

Progress

Here's a simple example.  Note that everything is still subject to change.

$ cat T.g
lexer grammar T;
options { language = Perl5; }
ZERO: '0';
ONE: '1';
$ cat T.tokens
Tokens=6
ZERO=4
ONE=5
$ cat t.pl
#!/usr/bin/perl

use ANTLR::Runtime::ANTLRStringStream;
use TLexer;

use strict;
use warnings;

my $input = ANTLR::Runtime::ANTLRStringStream->new('010');
my $lexer = TLexer->new($input);

while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $TLexer::EOF;

    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
$ perl t.pl
type: 4
text: 0

type: 5
text: 1

type: 4
text: 0

2007-06-13

+ Escaped characters, like '\n', are now handled properly.

+ Added  error handling.

lexer grammar T2;
options { language = Perl5; }

ID  :   ('a'..'z'|'A'..'Z')+ ;
INT :   '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS  :   (' '|'\t')+ ;
INT=5
WS=7
Tokens=8
ID=4
NEWLINE=6
#!/usr/bin/perl

use ANTLR::Runtime::ANTLRStringStream;
use T2Lexer;

use strict;
use warnings;

my $input = ANTLR::Runtime::ANTLRStringStream->new("Hello World!\n42\n");
my $lexer = T2Lexer->new($input);

while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $T2Lexer::EOF;

    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
type: 4
text: Hello

type: 7
text:

type: 4
text: World

line 1:12 no viable alternative at character '!'
type: 6
text:


type: 5
text: 42

type: 6
text:

Note the "no viable alternative" error message for the unrecognized '!'.

 2007-06-15

+  Handle lexer actions

Here's another  short example, similar to the one above.  Note how whitespaces are put into the hidden channel (99) and newlines are skipped.

lexer grammar T2;
options { language = Perl5; }

ID  :   ('a'..'z'\|'A'..'Z')\+ ;
INT :   '0'..'9'\+ ;
NEWLINE:'\r'? '\n' { $self->skip(); } ;
WS  :   (' '\|'\t')\+ { $channel = HIDDEN; } ;
$ perl t.pl
text: Hello
type: 4
pos: 1:0
channel: 0
token index: -1

text:
type: 7
pos: 1:5
channel: 99
token index: -1

text: World
type: 4
pos: 1:6
channel: 0
token index: -1

line 1:11 no viable alternative at character '!'
text: 42
type: 5
pos: 2:0
channel: 0
token index: -1

2007-06-26

+ Simple Parser is working

Quick, what is 2 + 2?   If you can't remember here's an easy way to find out.  First we need a grammar.

grammar MExpr;

options {
  language = Perl5;
}

prog:   stat+ ;

stat:   expr NEWLINE { print "$expr.value\n"; }
    |   NEWLINE
    ;

expr returns [value]
    :   e=atom { $value = $e.value; }
        (   '+' e=atom { $value += $e.value; }
        |   '-' e=atom { $value -= $e.value; }
        )*
    ;

atom returns [value]
    :   INT { $value = $INT.text; }
    |   '(' expr ')' { $value = $expr.value; }
    ;

ID  :   ('a'..'z'|'A'..'Z')+ ;
INT :   '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS  :   (' '|'\t')+ { $self->skip(); } ;

 And here's the test program.

#!/usr/bin/perl

use strict;
use warnings;

use ANTLR::Runtime::ANTLRStringStream;
use ANTLR::Runtime::CommonTokenStream;
use MExprLexer;
use MExprParser;

while (<>) {
    my $input = ANTLR::Runtime::ANTLRStringStream->new($_);
    my $lexer = MExprLexer->new($input);

    my $tokens = ANTLR::Runtime::CommonTokenStream->new({ token_source => $lexer });
    my $parser = MExprParser->new($tokens);
    $parser->prog();
}

 Finally we're getting to the answer.

$ perl t.pl
2 + 2
4

2007-08-08

 + Simple expression grammar

The grammar

grammar Expr;

options {
    language = Perl5;
}

@header {
}

@members {
    my %memory;
}

prog:   stat+ ;

stat:   expr NEWLINE { print "$expr.value\n"; }
    |   ID '=' expr NEWLINE
        { $memory{$ID.text} = $expr.value; }
    |   NEWLINE
    ;

expr returns [value]
    :   e=multExpr { $value = $e.value; }
        (   '+' e=multExpr { $value += $e.value; }
        |   '-' e=multExpr { $value -= $e.value; }
        )*
    ;

multExpr returns [value]
    :   e=atom { $value = $e.value; } ('*' e=atom { $value *= $e.value; })*
    ;

atom returns [value]
    :   INT { $value = $INT.text; }
    |   ID
        {
            my $v = $memory{$ID.text};
            if (defined $v) {
                $value = $v;
            } else {
                print STDERR "undefined variable $ID.text\n";
            }
        }
    |   '(' expr ')' { $value = $expr.value; }
    ;

ID  :   ('a'..'z'|'A'..'Z')+ ;
INT :   '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS  :   (' '|'\t')+ { $self->skip(); } ;

 Test program

#!/usr/bin/perl

use strict;
use warnings;

use blib '../..';

use ANTLR::Runtime::ANTLRStringStream;
use ANTLR::Runtime::CommonTokenStream;
use ExprLexer;
use ExprParser;

my $in;
{
    undef $/;
    $in = <>;
}

my $input = ANTLR::Runtime::ANTLRStringStream->new($in);
my $lexer = ExprLexer->new($input);

my $tokens = ANTLR::Runtime::CommonTokenStream->new({ token_source => $lexer });
my $parser = ExprParser->new($tokens);
$parser->prog();

Test run

$ perl t.pl
x=1
y=2
3*(x+y)
^Z
9

2008-02-23

 Started real porting effort.  The goal is to port one ANTLR runtime class at a time from Java to Perl, including full API coverage and documentation.  First stop of the porting train: ANTLR::Runtime::BitSet.

2008-11-18

Got the first parser working: SimpleCalc, taken from the Five minute introduction to ANTLR 3.

Author

Ronald Blaschke (ron at rblasch org)

  • No labels

4 Comments

  1. Unknown User (loopology)

    Where can the ANTLR Perl modules be downloaded from ?

    Thanx

    Bernhard

  2. Unknown User (rblasch)

    There's currently no build for it.  Please grab the latest version from http://fisheye2.cenqua.com/browse/antlr .

    Ron

  3. Unknown User (george godik)

    Hi Ron

    I'm interested in leveraging my current Perl infrastructure to produce a DSL for MySQL dataloading and maintenance related tasks.

    Very interested in this port. Keep up the good work !

    Thank you

    - George 

  4. Unknown User (rblasch)

    Hello George,

    Many thanks for the encouraging feedback! (smile)

     Ron