Johannes Luber (Maintainer)
jaluber AT gmx.de

Kunle Odutola
kunle UNDERSCORE odutola AT hotmail.com

Micheal Jordan

Contents

  • #Getting Started
  • #Architecture
  • #Status

    Status

    As of September 2007, the C# code generator and runtime are NOT in sync with the latest release and development versions of the ANTLR tool and Java language target. The last release of the C# code generator and runtime was synced to the ANTLR v3.0 release from July 2007. Nevertheless, no major problems have been reported by those using the C# codegen and runtime with ANTLR v3.0.1 since its release in August 2007.

    As of October 2007, the ANTLR source depot contains an early beta of the C# codegen and runtime for the upcoming ANTLR v3.1 release. Starting from the end of October, ANTLR daily builds of v3.1 have been available for those wishing to test the C# support with the new features of ANTLR v3.1. As before, development progress going forwards is likely to be sporadic.

    ...

    Getting Started

    If you want to set up and start using ANTLR with C# as quickly as possible, please see the FAQ page on using ANTLR, ANTLRWorks, and C#.

    However, this page (not the FAQ page) provides architectural and background information useful for understanding the workings and usage of ANTLR with C#.

    Architecture

    The C# target consists of a set of code generation templates and a runtime library (written in C#) for the .NET/CLR platform. The C# code generation templates and the .NET/CLR runtime library are modeled on the Java version. As a consequence, the C# target supports features such as grammar development/prototyping and remote debugging with the superb ANTLRWorks integrated grammar development editor. Given ANTLRWorks' popularity, this is a very important feature for ANTLR users.

    The .NET/CLR runtime library currently consists of two assemblies named Antlr3.Runtime.dll and Antlr3.Utility.dll. All projects that include an ANTLR v3.x Lexer, Parser or TreeParser must include a reference to:

    1. Antlr3.Runtime.dll - the ANTLR v3.x .NET/CLR runtime library
    2. Antlr3.Utility.dll - OPTIONAL - only required if you use the DOTTreeGenerator class (correct as of May 2008)

    No other references are required (except to the mandatory System assembly).

    For projects that use the in-built integration with StringTemplate, the following assemblies must also be referenced:

    1. StringTemplate.dll - the C# StringTemplate v3.x library (a.k.a ST# v3.x)
    2. antlr.runtime.dll - the ANTLR v2.7.x .NET/CLR runtime library (ST# v3.x was developed with ANTLR v2.7.x)

    Status

    In general, development progress on the C# target proceeds sporadically.  With versions of ANTLR prior to 3.1.x, the C# code generator cannot be guaranteed to be in sync with the latest versions of the ANTLR tools and Java target.  When practical to do so, the latest 3.1.x released version is recommended for the C# target.

    Version 3.1.x

    As of August 2008, the C# target is in sync with the Java target of ANTLR v3.1. This new version of the C# target breaks source compatibility with previous versions (including previous beta-v3.1 builds). To a certain extent, regenerating grammars helps, but certain fields have been renamed to follow .NET conventions, which means PascalCase (e.g. .tree is now .Tree). The exception is .st, which is now .ST. Additionally, a new target named CSharp2 has been introduced in addition to the existing CSharp target. The reason for this is three-fold:

    1. Firstly, CSharp can retain its compatibility with C# v1 and the .NET/CLR v1.1 platform. CSharp is restricted to C# v1 features and doesn't take advantage of any C# v2+ features in the code it generates.
    2. Secondly, a certain bug fix requires a C# v2 feature or a change to the templates for each occurrence of the bug (and this has to be done by the user). Further details of this issue can be found in the #Known Issues section below.
    3. Thirdly, because maintaining the backwards compatibility sucks majorly, creating the new CSharp2 target allows the existing CSharp target to be deprecated without forcing the issue by simply abandoning C# v1 .NET/CLR v1.1 compatibility immediately.

    ...

    Introducing the new target allowed changes to be made to its distinct code generation templates without fear of breaking anything else. Furthermore, as further enhancements to CSharp2 will at least break binary compatibility in the .NET/CLR runtime, which is currently shared with CSharp, most of the changes will be done for a future ANTLR v3.2 release. During the lifetime of ANTLR v3.1, the public API of the C# target(s) will be frozen (only necessary bug fixes may break this). If you wish to future-proof your grammars, you should change them to use the new CSharp2 target. The original CSharp target that uses only C# v1 and .NET/CLR v1.1 features is deprecated and the current plan is to remove it for the ANTLR v3.2 release.

    The C# code generation templates and the .NET/CLR runtime library are feature complete for both the CSharp and CSharp2 targets. Both C# targets leverage the existing C# StringTemplate implementations to support the broadest range of the features that ANTLR provides. The long open issue of unit tests has finally been tackled with the adoption of MbUnit and the inclusion (in the v3.1 version) of a wide range of tests for the runtime library. As before, basic sanity checks will be done by ensuring that the sample grammars in the examples-v3 archive function correctly. This is currently a work in progress for the v3.1 release.

    Architecture

    As with all other targets, the C# code generation and runtime are modeled on the Java version. This means the C# target supports features such as grammar development/prototyping and remote debugging with the ANTLRWorks GUI, which is very important for ANTLR users.

    Target Platforms

    ...

    CSharp target (versions 3.0.x and 3.1.x)

    Microsoft .NET v1.1 and later
    Mono v1.0 and later

    CSharp2 target (version 3.1.x and later)

    Microsoft .NET v2.0 and later
    Mono v1.2 and later

    Runtime Location

    The compiled libraries are found in the distribution under the directory "runtime/csharp" (or "runtime/csharp/bin" in source distributions). Both targets use the same .NET/CLR runtime. Intermediary builds may not have the current version and can be compiled by using the build tools.

    ...

    Microsoft Visual Studio 2003, 2005 and 2008
    Nant v0.85

    Performance

    The V3 target generates code that is easily faster than that generated by the V2 target (especially the lexers). We probably won't be able to match the bare-metal performance of the code generated by Jim Idle's C target or Ric Klaren's C++ target, but we expect to be very competitive with the other targets.

    Usage

    ...

    Source Code and Binaries

    For ANTLR 3.0.x there is no C# target source code available. ANTLR 3.1.x has the files under the runtime/CSharp directory. Binaries are currently included in the distro subdirectory, but will later be available on the ANTLR download page as well. The files are available in the official releases, in daily builds and, for the head of the repo, on the FishEye site (not reliable). Alternatively, ask Terence Parr for a Perforce account.

    Bug Reports

    Bugs can be reported over the mailing list or directly to the email account of the current maintainer. The mailing list is preferable, because often the cause lies in the grammar and not in the tool itself. Using the list allows others besides the maintainer to rule out this cause, which has the side effect that bug reports are generally processed more quickly.

    Bug reports have to include a minimal grammar exposing the bug as well as the ANTLR version used. If it is a compile-time or a runtime problem, then the driver program, any required backend files and, if applicable, the input used are also required. Optionally, the reporter can check whether the Java target is affected by the same bug. If it is, the bug should be reported to Terence Parr (over the mailing list or over the support form), as the C# target mimics the Java target behavior. Doing so isn't necessary (as the maintainer will compare the behaviors anyway), but speeds up the process. There may already be a JIRA bug report. In that case, please mention it, too.

    Advanced Debugging

    In case you want to go bug hunting yourself (maybe the next ANTLR release is too far away), the following procedure should be helpful. First get the source code for both ANTLR and the C# runtime. Make sure that you can rebuild ANTLR yourself first. Then open one of the project files in the runtime directory (the VS 2003 project may be outdated) and compile it in debug mode. Use the generated assemblies instead of the pre-compiled ones in your compiler project.

    To find the bug, comparing the C# target output to the Java target output is helpful. Once you have the Java equivalent of the grammar (a minimal grammar that exposes the bug eases the translation) and the behavior of the C# target is different, you can compare the generated files. If you find a difference, then the cause lies within a template. The ANTLR option -XdbgST also writes the names of the templates used into the output, so it's quite easy to find the right template file. Change the template and rebuild ANTLR and your project. Then please report your findings, regardless of whether you fixed the issue or not.

    In case the generated files are basically the same, you have to debug the runtime. Depending on the type of error, it may take you a while to find the cause. Comparing with the Java runtime can help. Once you have fixed the problem or don't know how to proceed, report your findings. This is done via the ANTLR feedback form, as patches have to be sent under the ANTLR Contribution License, or, if you have signed up already, directly to the maintainer. A detailed description of the problem, possibly along with a patch (either created by the patch program or described in English), will make the fixing part easier and faster on the maintainer side.

    Usage

    This section is NOT a tutorial on how to use either C# or ANTLR v3.x. It assumes that you are familiar with the concepts involved in developing ANTLR v3.x grammars and in building and using C# programs and assemblies.

    Specify that C# code should be generated for a grammar

    To specify that the ANTLR tool should generate C# code (rather than the default of generating Java code) for a grammar, set the grammar-level option named language to the value CSharp2 as shown below:

    Code Block
    grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    
    // rest of grammar follows
    ....
    

    For the example grammar named MyGrammar above, the grammar file would typically be named MyGrammar.g. The grammar filename (excluding the extension) must match the grammar name as declared with the grammar directive in the file.

    Note: ANTLR v3.0.x users

    If you are still using ANTLR v3.0.x then you can only specify CSharp as your target. The CSharp2 target is only supported by ANTLR v3.1.x and later.

    List of generated C# source files

    For an example grammar named MyGrammar, the following table lists the files that would be generated by ANTLR 3.1+ using the CSharp2 (and CSharp) target.

    MyGrammar.g
    Code Block
    grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    // rest of grammar follows
    ....
    
    Generated files:

    MyGrammarLexer.cs
    MyGrammarParser.cs

    MyGrammar.g
    Code Block
    lexer grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    // rest of grammar follows
    ....
    
    Generated files:

    MyGrammar.cs

    Under ANTLR 3.0.x:
    MyGrammarLexer.cs

    MyGrammar.g
    Code Block
    parser grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    // rest of grammar follows
    ....
    

    ...

    Generated files:

    MyGrammar.cs

    Under ANTLR 3.0.x:
    MyGrammarParser.cs

    MyGrammar.g
    Code Block
    tree grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    // rest of grammar follows
    ....
    
    Generated files:

    MyGrammar.cs

    Specify a C# namespace for your recognizer

    You can specify that your generated recognizer should be declared within a specific namespace as shown below. By default all recognizers are generated as top-level types with no enclosing namespace.

    Code Block
    
    grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    
    @parser::namespace { My.Custom.NameSpace.For.Parser.In.Combined.Grammar } // Or just @namespace { ... }
    
    @lexer::namespace { My.Custom.NameSpace.For.Lexer.In.Combined.Grammar }
    
    // rest of grammar follows
    ....
    
    Code Block
    
    lexer grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    
    @namespace { My.Custom.NameSpace.For.Lexer }
    
    // rest of grammar follows
    ....
    
    Code Block
    
    parser grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    
    @namespace { My.Custom.NameSpace.For.Parser }
    
    // rest of grammar follows
    ....
    
    Code Block
    
    tree grammar MyGrammar;
    
    options
    {
        language=CSharp2;
    }
    
    @namespace { My.Custom.NameSpace.For.TreeParser }
    
    // rest of grammar follows
    ....
    

    Syntactic differences from the Java target

    The C# target uses language features such as properties and follows the official coding guidelines, so the general documentation differs from what can actually be used with the C# target. As a rule of thumb, the attributes of rules are accessed with a capital first letter. Nonetheless, there are exceptions, so the goal is to have a comprehensive overview. If there are any errors, please fix them or send an email to the mailing list.

    ...

    Preprocessor symbols (since ANTLR 3.1.2)

    When generating grammars in debug mode the following preprocessor symbol is defined:

    Code Block
    #define ANTLR_DEBUG
    

    It can be used like any other preprocessor symbol.
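
A minimal sketch of branching on the symbol with conditional compilation. The #define line at the top stands in for the one ANTLR emits in debug mode, and the class name DebugInfo and its messages are hypothetical. In C#, a #define only applies to the file that contains it, so the symbol is visible to code that ends up in the generated file (for example code written in grammar actions), not to a separately compiled driver file.

```csharp
// Stand-in for the line the generated file contains in debug mode.
#define ANTLR_DEBUG

using System;

// Hypothetical helper; in practice this code would come from a grammar action
// so that it lands in the same file as the #define.
static class DebugInfo
{
    // Returns a message that depends on whether ANTLR_DEBUG is defined in this file.
    public static string Describe()
    {
#if ANTLR_DEBUG
        return "debug recognizer";
#else
        return "release recognizer";
#endif
    }

    static void Main()
    {
        Console.WriteLine(Describe());   // prints "debug recognizer"
    }
}
```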

    Members blocks vs C# v2 Partial Classes

    In addition to the use of the @members block to define class members inline within the grammar file, the CSharp2 target also supports the use of the C# v2 partial classes feature. This has the additional advantage that you can use your favourite C# editor to define class members, since ANTLRWorks doesn't support syntax highlighting for target languages.

    Using a partial class also lets you declare the same using alias directives as the generated code but bind them to different classes, since aliases are confined to the current file. If you use this feature, be sure to point this difference out!
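
The partial class approach can be sketched as follows. Everything here is illustrative: the first declaration is a heavily simplified stand-in for the class ANTLR would generate, and the rule name prog, the helper Declare and the property DeclaredCount are hypothetical names, not part of the runtime.

```csharp
using System;
using System.Collections.Generic;

// Part 1: stand-in for the generated MyGrammarParser.cs (simplified, hypothetical).
public partial class MyGrammarParser
{
    public void prog()
    {
        // A grammar action could call a helper defined in the hand-written part.
        Declare("x");
    }
}

// Part 2: hand-written members, normally kept in a separate .cs file
// that is compiled together with the generated one.
public partial class MyGrammarParser
{
    private readonly List<string> declared = new List<string>();

    public void Declare(string name)
    {
        declared.Add(name);
    }

    public int DeclaredCount
    {
        get { return declared.Count; }
    }
}

static class PartialDemo
{
    static void Main()
    {
        MyGrammarParser parser = new MyGrammarParser();
        parser.prog();
        Console.WriteLine(parser.DeclaredCount);   // prints 1
    }
}
```

The compiler merges both declarations into one class, so the hand-written file can be edited with full C# tooling while the generated file is regenerated freely.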

    Syntactic differences compared to grammars and the Java target

    ANTLR grammars often refer to attributes of tokens and rules. These attributes do not change because of their use in a grammar targeted at C#. But when accessing those attributes from driver programs and other support files, the names used in the ANTLR grammar won't work. The C# target follows the official coding guidelines, which differ from the general documentation. The following table details the status quo for ANTLR v3.1.

    ANTLR        C# Syntax    Notes
    text         Text
    start        Start        Untested
    stop         Stop         Untested
    tree         Tree
    st           ST
    type         Type         Untested
    line         Line         Untested
    pos          Pos          Untested (CharPositionInLine)
    channel      Channel      Untested
    $x.size()    $x.Count     size() is no grammar attribute, but still regularly used

    ...
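
As an illustration of the capitalized names on the driver side, the sketch below reads token attributes through the runtime's IToken interface. It assumes a reference to Antlr3.Runtime.dll and that the v3.1 runtime exposes Text, Type, Line, CharPositionInLine and Channel as properties in the Antlr.Runtime namespace, which is how I recall it. The class name, the token type value 4 and the text are arbitrary.

```csharp
using System;
using Antlr.Runtime;   // provided by Antlr3.Runtime.dll

static class TokenAttributeDemo
{
    // Builds a one-line description from the C#-style attribute names.
    public static string Describe(IToken t)
    {
        return t.Text
            + " type=" + t.Type
            + " line=" + t.Line
            + " pos=" + t.CharPositionInLine   // grammar attribute: pos
            + " channel=" + t.Channel;
    }

    static void Main()
    {
        // 4 is an arbitrary token type chosen for the example.
        CommonToken token = new CommonToken(4, "hello");
        token.Line = 1;
        token.CharPositionInLine = 0;
        Console.WriteLine(Describe(token));
    }
}
```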

    Known Issues

    Wrong Default Initialization

    A problem you may encounter while using the CSharp target is that value types are initialized with null in the generated code (this happens, for example, when using labels). The cause lies in the following definition in CSharp.stg:

    Code Block
    
    csharpTypeInitMap ::= [
        "int":"0",
        "uint":"0",
        "long":"0",
        "ulong":"0",
        "float":"0.0",
        "double":"0.0",
        "bool":"false",
        "byte":"0",
        "sbyte":"0",
        "short":"0",
        "ushort":"0",
        "char":"char.MinValue",
        default:"null" // anything other than an atomic type
    ]
    

    As you can see, only the in-built value types are supported (or can be reasonably supported). Since adding value types to this map is an open-ended task, the maintainer does not make any changes in that structure for all users. Any changes have to be made by the user locally (and repeated for each new version of ANTLR used). It is recommended that users switch to the CSharp2 target (which requires C# v2+ and .NET v2.0 or higher as target platforms), as the problem has been fixed there in an environment-independent manner.
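
The page does not name the C# v2 feature behind the CSharp2 fix. As an assumption, the sketch below uses the default(T) operator, which was introduced in C# v2 and produces a valid initial value for any type without a per-type map. The struct Point and the class DefaultInitDemo are hypothetical.

```csharp
using System;

// Hypothetical user-defined value type, i.e. one that is not listed in csharpTypeInitMap.
struct Point
{
    public int X;
    public int Y;
}

static class DefaultInitDemo
{
    // Returns a zero-initialized Point without needing a map entry for the type.
    public static Point ZeroPoint()
    {
        // Point p = null;   // would not compile: a struct cannot be null
        return default(Point);
    }

    static void Main()
    {
        Point p = ZeroPoint();
        int n = default(int);          // 0
        string s = default(string);    // null, as for any reference type

        Console.WriteLine(p.X + p.Y + n);   // prints 0
        Console.WriteLine(s == null);       // prints True
    }
}
```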

    Exceptions not trapped in Visual Studio

    Shawn Poulson noticed that while debugging in VS, exceptions generated in the ANTLR runtime were not trapped by a catch for all Exceptions. He discovered that this is actually a byproduct of a default profile setting in Visual Studio 2005 and 2008. There is a feature that breaks on exceptions that are thrown and handled, regardless of whether there is a handler to catch the exception, and reports them as unhandled anyway. He found some discussion about this at this page.

    This behavior can be disabled in two ways:

    • Go to Tools|Options|Debugging|General and uncheck "Enable Just My Code"
    • Go to Debugger|Exceptions and uncheck "User-handled" in the CLR row

    He was also missing the Debugger|Exceptions menu item, which has been fixed by adjusting the profile, which had this option disabled by default. This is shown here. Once this was sorted out, ANTLR works perfectly when parsing invalid parser/lexer input.

    Debugging with ANTLRWorks

    The ANTLRWorks tool is written in Java and is only able to debug Java recognizers natively. Nevertheless, you can debug your C# recognizers with ANTLRWorks by using the Remote Debugging feature of ANTLRWorks. In ANTLRWorks, Remote Debugging works by connecting to a running instance of a debug-instrumented recognizer (generated with the -debug switch to ANTLR) over the network.

    ...

    1. Generate a debuggable version of your recognizer by specifying the -debug option to ANTLR
    2. Create a driver program that creates your recognizer and runs some test input through it (see the examples-v3 archive for sample driver programs)
    3. Compile your driver and recognizer to produce your executable file(s)
    4. Execute your driver program (it will launch your recognizer and appear to hang - it's just waiting for ANTLRWorks to connect)
    5. Start ANTLRWorks (or switch to it if it is already running) and click the menu Debugger|Debug Remote...
    6. Click Connect to accept the default host and port values (localhost and 49153 respectively)
    7. ANTLRWorks should now start debugging your recognizer!
      Warning

      ANTLRWorks remote debugging has only been tested for C# Parsers. TreeParsers and Lexers should work but...

      Note: ANTLRWorks and ANTLR version compatibility

      ANTLRWorks v1.1.x is only compatible with recognizers created with ANTLR v3.0.x. For recognizers created with ANTLR v3.1.x, you will need ANTLRWorks v1.2.x.
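
A driver program for step 2 might look like the sketch below. It is not self-contained: it assumes a reference to Antlr3.Runtime.dll and the MyGrammarLexer and MyGrammarParser classes generated from a combined grammar named MyGrammar, and the start rule name prog is hypothetical. The sample driver programs in the examples-v3 archive remain the authoritative reference.

```csharp
using System;
using Antlr.Runtime;   // provided by Antlr3.Runtime.dll

// Minimal driver: feeds a file through the generated lexer and parser.
// MyGrammarLexer and MyGrammarParser come from the generated .cs files.
class Driver
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: Driver <input-file>");
            return;
        }

        ICharStream input = new ANTLRFileStream(args[0]);
        MyGrammarLexer lexer = new MyGrammarLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // For a recognizer generated with -debug, the program appears to hang
        // until ANTLRWorks connects (see step 4).
        MyGrammarParser parser = new MyGrammarParser(tokens);

        // "prog" is a hypothetical start rule; call your grammar's start rule here.
        parser.prog();
    }
}
```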

    ...