Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 6 Next »

Perl ANTLR v3 Target

Status

Early prototyping phase.  Simple lexer and parser are working.

Progress

Here's a simple example.  Note that everything is still subject to change.

$ cat T.g
lexer grammar T;
options { language = Perl5; }
ZERO: '0';
ONE: '1';
$ cat T.tokens
Tokens=6
ZERO=4
ONE=5
$ cat t.pl
#!/usr/bin/perl

use ANTLR::Runtime::ANTLRStringStream;
use TLexer;

use strict;
use warnings;

my $input = ANTLR::Runtime::ANTLRStringStream->new('010');
my $lexer = TLexer->new($input);

while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $TLexer::EOF;

    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
$ perl t.pl
type: 4
text: 0

type: 5
text: 1

type: 4
text: 0

2007-06-13

+ Escaped characters, like '\n', are now handled properly.

+ Added  error handling.

lexer grammar T2;
options { language = Perl5; }

ID  :   ('a'..'z'|'A'..'Z')+ ;
INT :   '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS  :   (' '|'\t')+ ;
INT=5
WS=7
Tokens=8
ID=4
NEWLINE=6
#!/usr/bin/perl

use ANTLR::Runtime::ANTLRStringStream;
use T2Lexer;

use strict;
use warnings;

my $input = ANTLR::Runtime::ANTLRStringStream->new("Hello World!\n42\n");
my $lexer = T2Lexer->new($input);

while (1) {
    my $token = $lexer->next_token();
    last if $token->get_type() == $T2Lexer::EOF;

    print "type: ", $token->get_type(), "\n";
    print "text: ", $token->get_text(), "\n";
    print "\n";
}
type: 4
text: Hello

type: 7
text:

type: 4
text: World

line 1:12 no viable alternative at character '!'
type: 6
text:


type: 5
text: 42

type: 6
text:

Note the "no viable alternative" error message for the unrecognized '!'.

 2007-06-15

+  Handle lexer actions

Here's another  short example, similar to the one above.  Note how whitespaces are put into the hidden channel (99) and newlines are skipped.

lexer grammar T2;
options { language = Perl5; }

ID  :   ('a'..'z'\|'A'..'Z')\+ ;
INT :   '0'..'9'\+ ;
NEWLINE:'\r'? '\n' { $self->skip(); } ;
WS  :   (' '\|'\t')\+ { $channel = HIDDEN; } ;
$ perl t.pl
text: Hello
type: 4
pos: 1:0
channel: 0
token index: -1

text:
type: 7
pos: 1:5
channel: 99
token index: -1

text: World
type: 4
pos: 1:6
channel: 0
token index: -1

line 1:11 no viable alternative at character '!'
text: 42
type: 5
pos: 2:0
channel: 0
token index: -1

2007-06-26

+ Simple Parser is working

Quick, what is 2 + 2?   If you can't remember here's an easy way to find out.  First we need a grammar.

grammar MExpr;

options {
  language = Perl5;
}

prog:   stat+ ;

stat:   expr NEWLINE { print "$expr.value\n"; }
    |   NEWLINE
    ;

expr returns [value]
    :   e=atom { $value = $e.value; }
        (   '+' e=atom { $value += $e.value; }
        |   '-' e=atom { $value -= $e.value; }
        )*
    ;

atom returns [value]
    :   INT { $value = $INT.text; }
    |   '(' expr ')' { $value = $expr.value; }
    ;

ID  :   ('a'..'z'|'A'..'Z')+ ;
INT :   '0'..'9'+ ;
NEWLINE:'\r'? '\n' ;
WS  :   (' '|'\t')+ { $self->skip(); } ;

 And here's the test program.

#!/usr/bin/perl

use strict;
use warnings;

use ANTLR::Runtime::ANTLRStringStream;
use ANTLR::Runtime::CommonTokenStream;
use MExprLexer;
use MExprParser;

while (<>) {
    my $input = ANTLR::Runtime::ANTLRStringStream->new($_);
    my $lexer = MExprLexer->new($input);

    my $tokens = ANTLR::Runtime::CommonTokenStream->new({ token_source => $lexer });
    my $parser = MExprParser->new($tokens);
    $parser->prog();
}

 Finally we're getting to the answer.

$ perl t.pl
2 + 2
4



Author

Ronald Blaschke (ron at rblasch org)

  • No labels