Enforcing semantics

Errors we should trap together

Basics:

DONE 2/3/2010

A tree pattern matcher should work for all of these. Record options on the way down. We also need a way to count elements, for things like a repeated options spec.

  • FILE_AND_GRAMMAR_NAME_DIFFER
  • LEXER_RULES_NOT_ALLOWED
  • PARSER_RULES_NOT_ALLOWED
  • CANNOT_ALIAS_TOKENS_IN_LEXER
  • ARGS_ON_TOKEN_REF
  • ILLEGAL_OPTION
  • NO_RULES
  • REWRITE_FOR_MULTI_ELEMENT_ALT
  • HETERO_ILLEGAL_IN_REWRITE_ALT
  • AST_OP_WITH_NON_AST_OUTPUT_OPTION
  • AST_OP_IN_ALT_WITH_REWRITE
  • CONFLICTING_OPTION_IN_TREE_FILTER
  • WILDCARD_AS_ROOT
  • INVALID_IMPORT
  • TOKEN_VOCAB_IN_DELEGATE
  • IMPORT_NAME_CLASH(arg,arg2) ::= "<arg.typeString> grammar <arg.name> and imported <arg2.typeString> grammar <arg2.name> both generate <arg2.recognizerName>". If we import a grammar I into a combined grammar C, then I must not be named CLexer or CParser.
  • REWRITE_OR_OP_WITH_NO_OUTPUT_OPTION
  • new errors:
    • REPEATED_PREQUEL: repeated options or tokens spec (since we now allow them in any order)
    • TOKEN_NAMES_MUST_START_UPPER
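
A minimal sketch of the element counting mentioned above, assuming the prequel constructs have already been collected as a list of kind names in the order they appear. PrequelCounter and the kind strings are hypothetical and are not part of the tool:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the tool: counts prequel constructs
// (options, tokens, ...) and reports each repeated occurrence.
final class PrequelCounter {
    // Returns one REPEATED_PREQUEL message per occurrence after the first.
    static List<String> findRepeats(List<String> prequelKinds) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> errors = new ArrayList<>();
        for (String kind : prequelKinds) {
            // merge() stores and returns the updated count for this kind
            int n = counts.merge(kind, 1, Integer::sum);
            if (n > 1) errors.add("REPEATED_PREQUEL: " + kind);
        }
        return errors;
    }
}
```

Calling findRepeats on the kinds OPTIONS, TOKENS, OPTIONS would flag only the second OPTIONS, whatever order the specs appear in.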

Symbols:

purely symbol related

I think I can do all of these simply by collecting a list of symbols of the various types then doing some checks.

DONE 2/7/2010

  • RULE_REDEFINITION
  • ACTION_REDEFINITION
  • RULE_HAS_NO_ARGS
  • UNDEFINED_RULE_REF
  • MISSING_RULE_ARGS
  • SCOPE_REDEFINITION (new)
  • TOKEN_ALIAS_REASSIGNMENT
  • LABEL_CONFLICTS_WITH_RULE
  • LABEL_CONFLICTS_WITH_TOKEN
  • SYMBOL_CONFLICTS_WITH_GLOBAL_SCOPE (label v scope, token v scope, rule v scope)
  • LABEL_TYPE_CONFLICT
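
A rough sketch of the collect-then-check idea for two of the errors above, assuming rule definitions and rule references have already been gathered as plain name lists. SymbolCheckSketch and its message format are made up for illustration; the real checker works on grammar objects and tokens:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: checks collected symbol names.
final class SymbolCheckSketch {
    static List<String> check(List<String> ruleDefs, List<String> ruleRefs) {
        Set<String> defined = new LinkedHashSet<>();
        List<String> errors = new ArrayList<>();
        for (String name : ruleDefs) {
            // add() returns false when the name was already defined
            if (!defined.add(name)) errors.add("RULE_REDEFINITION: " + name);
        }
        for (String ref : ruleRefs) {
            if (!defined.contains(ref)) errors.add("UNDEFINED_RULE_REF: " + ref);
        }
        return errors;
    }
}
```

The same shape (collect into a set, then test membership) would presumably cover the label, token, and scope conflicts as well, with one set per symbol type.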

syntax related

DONE 2/8/2010

These need tree pattern matching to find subtrees with the right syntax. They need to check what's on the left-hand side of rewrites, so I also need to track a list of references within each alternative.

  • REWRITE_ELEMENT_NOT_PRESENT_ON_LHS (merged to include v3 UNDEFINED_LABEL_REF_IN_REWRITE, UNDEFINED_TOKEN_REF_IN_REWRITE)
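
A sketch of the per-alternative reference tracking, assuming the names referenced on the left-hand side (tokens, rules, labels) and the names used in the rewrite have already been pulled out of the matched subtrees. RewriteLhsCheckSketch is hypothetical, and it deliberately ignores any cases the real checker may need to exempt:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: compares rewrite references against
// the references collected from the same alternative's left-hand side.
final class RewriteLhsCheckSketch {
    static List<String> check(List<String> lhsRefs, List<String> rewriteRefs) {
        Set<String> seen = new HashSet<>(lhsRefs);
        List<String> errors = new ArrayList<>();
        for (String ref : rewriteRefs) {
            if (!seen.contains(ref)) {
                errors.add("REWRITE_ELEMENT_NOT_PRESENT_ON_LHS: " + ref);
            }
        }
        return errors;
    }
}
```

For an alternative whose left-hand side references x, ID, and INT, a rewrite that uses ID, x, and expr would report only expr.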

Argument, return value, scope attribute issues

DONE 2/10/2010

  • LABEL_CONFLICTS_WITH_RULE_SCOPE_ATTRIBUTE
  • LABEL_CONFLICTS_WITH_RULE_ARG_RETVAL
  • ARG_RETVAL_CONFLICT (arg/ret val same name)
  • ATTRIBUTE_CONFLICTS_WITH_RULE, ATTRIBUTE_CONFLICTS_WITH_RULE_ARG_RETVAL
    (collision of a rule-scope dynamic attribute with arg, return value, rule name itself.)

Attribute refs in actions:

DONE 2/16/2010

  • UNKNOWN_SIMPLE_ATTRIBUTE(arg,args2) ::= "attribute is not a token, parameter, or return value: <arg>"
  • ATTRIBUTE_REF_NOT_IN_RULE(arg,arg2) ::= "reference to attribute outside of a rule: <arg><if(arg2)>.<arg2><endif>"
  • UNKNOWN_ATTRIBUTE_IN_SCOPE(arg,arg2) ::= "unknown attribute for <arg>: <arg2>"
  • UNKNOWN_RULE_ATTRIBUTE(arg,arg2) ::= "unknown attribute for rule <arg>: <arg2>"
  • ISOLATED_RULE_SCOPE(arg) ::= "missing attribute access on rule scope: <arg>"
  • INVALID_RULE_PARAMETER_REF(arg,arg2) ::= "cannot access rule <arg>'s parameter: <arg2>"
  • INVALID_RULE_SCOPE_ATTRIBUTE_REF(arg,arg2) ::= "cannot access rule <arg>'s dynamically-scoped attribute: <arg2>"
  • UNKNOWN_DYNAMIC_SCOPE(arg) ::= "unknown dynamic scope: <arg>"
  • UNKNOWN_DYNAMIC_SCOPE_ATTRIBUTE(arg,arg2) ::= "unknown dynamically-scoped attribute for scope <arg>: <arg2>"
  • ISOLATED_RULE_ATTRIBUTE(arg) ::= "reference to locally-defined rule scope attribute without rule name: <arg>"
  • INVALID_TEMPLATE_ACTION(arg) ::= "invalid StringTemplate % shorthand syntax: '<arg>'"

TODO:

  • NONUNIQUE_REF(arg) ::= "<arg> is a non-unique reference"
  • FORWARD_ELEMENT_REF(arg) ::= "illegal forward reference: <arg>"
  • ?? WRITE_TO_READONLY_ATTR(arg,arg2,arg3) ::= "cannot write to read only attribute: $<arg><if(arg2)>.<arg2><endif>"
  • ?? RULE_REF_AMBIG_WITH_RULE_IN_ALT

Imported grammars

Put into symbols checker.

DONE 2/17/2010

  • NO_SUCH_GRAMMAR_SCOPE(arg,arg2) ::= "reference to undefined grammar in rule reference: <arg>.<arg2>"
  • NO_SUCH_RULE_IN_SCOPE(arg,arg2) ::= "rule <arg2> is not defined in grammar <arg>"

Unsure:

  • RULE_INVALID_SET
  • TOKEN_ALIAS_CONFLICT (combined grammars only; need to compare the token definitions in the combined grammar with the lexer rules in the implicit lexer)
  • IMPORTED_TOKENS_RULE_EMPTY(arg,arg2) ::= "no lexer rules contributed to <arg> from imported grammar <arg2>". This covers unreachable alternatives when the DFA is for the Tokens rule, so it is more of an LL(*) decision issue.

Implementation notes

I have to say: tree pattern matchers kick ass! Instead of specifying an entire tree grammar or tree visitor, you specify the patterns of interest and then an action to execute if they match. This is like using an awk script. I'm using two of them so far as I start to build v4. The first one does the basic semantic checking. BasicSemanticTriggers.g has rules like this:

delegateGrammar
    :   (   ^(ASSIGN ID id=ID)
        |   id=ID
        )
        {BasicSemanticChecks.checkImport(g, $id.token);}
    ;

Then in BasicSemanticChecks.checkImport(), I check to see that everything is okay with the import:

protected static void checkImport(Grammar g, Token importID) {
    Grammar delegate = g.getImportedGrammar(importID.getText());
    if ( delegate==null ) return; // nothing to check if the import did not resolve
    // grammar types that are allowed to import a grammar of the delegate's type
    List<Integer> validDelegators = validImportTypes.get(delegate.getType());
    if ( validDelegators!=null && !validDelegators.contains(g.getType()) ) {
        ErrorManager.grammarError(ErrorType.INVALID_IMPORT,
                                  g.fileName,
                                  importID,
                                  g, delegate);
    }
    // a combined grammar C must not import a grammar named CLexer or CParser
    if ( g.getType()==ANTLRParser.GRAMMAR &&
         (delegate.name.equals(g.name+Grammar.getGrammarTypeToFileNameSuffix(ANTLRParser.LEXER))||
          delegate.name.equals(g.name+Grammar.getGrammarTypeToFileNameSuffix(ANTLRParser.PARSER))) )
    {
        ErrorManager.grammarError(ErrorType.IMPORT_NAME_CLASH,
                                  g.fileName,
                                  importID,
                                  g, delegate);
    }
}
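
To make the IMPORT_NAME_CLASH condition concrete, here is a standalone sketch of just the name test, assuming the generated recognizers for a combined grammar C are named CLexer and CParser as described in the Basics list. ImportNameClashSketch is hypothetical and hard-codes the two suffixes rather than calling into Grammar:

```java
// Hypothetical illustration only: the "Lexer" and "Parser" suffixes are
// hard-coded here instead of coming from the Grammar class.
final class ImportNameClashSketch {
    // True when importing importedName into combined grammar combinedName
    // would collide with one of the combined grammar's own recognizers.
    static boolean clashes(String combinedName, String importedName) {
        return importedName.equals(combinedName + "Lexer")
            || importedName.equals(combinedName + "Parser");
    }
}
```

With this test, a combined grammar C importing CLexer or CParser clashes, while importing a grammar with any other name, such as Common, does not.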

I love the fact that all of these checks are done up front instead of being entangled with all of the logic that defines symbols and so on, as they were in the v3 implementation. Speaking of defining symbols, it's pretty easy to define rules and tokens. Here is the set of tree patterns I'm using to get started:

tokenAlias
    :   {inContext("TOKENS")}? ^(ASSIGN ID STRING_LITERAL)
        {System.out.println("token alias "+$ID.text+"="+$STRING_LITERAL.token);}
    ;

rule:   ^( RULE r=ID .*) {System.out.println("rule "+$r.token);}
    ;

terminal
    :   {!inContext("TOKENS ASSIGN")}? STRING_LITERAL
        {System.out.println("terminal "+$STRING_LITERAL.token);}
    |   TOKEN_REF       {System.out.println("terminal "+$TOKEN_REF.token);}
    ;
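
The inContext() predicates above restrict a pattern to nodes sitting under particular ancestors. A simplified sketch of that kind of test, assuming the ancestor node names are available as a list from the root down to the immediate parent. ContextMatchSketch is hypothetical and handles only a plain space-separated sequence of names, with no wildcards:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: a simplified ancestor-context test.
final class ContextMatchSketch {
    // ancestors: node names from the root down to the immediate parent.
    // context: space-separated names that must match the innermost ancestors.
    static boolean inContext(List<String> ancestors, String context) {
        List<String> wanted = Arrays.asList(context.trim().split("\\s+"));
        int offset = ancestors.size() - wanted.size();
        if (offset < 0) return false; // not enough ancestors to match
        return ancestors.subList(offset, ancestors.size()).equals(wanted);
    }
}
```

Under this sketch, a STRING_LITERAL whose ancestors end in TOKENS ASSIGN would be skipped by the terminal pattern and picked up by tokenAlias instead.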