Options
You can specify a number of options at the grammar, rule, and rule element levels (the only rule option is currently caseInsensitive, for lexer rules). These change how antlr-ng generates code from your grammar. The general syntax is:
options { name1 = value1; ... nameN = valueN; }
where a value can be an identifier, a qualified identifier (for example, a.b.c), a string, or an integer.
Grammar Options
All grammars can use the following options. In combined grammars, all options except language pertain only to the generated parser. Options may be set either within the grammar file, using the options syntax described above, or on the command line when invoking antlr-ng, using the -D option (see the Getting Started page). The following examples demonstrate both mechanisms; note that -D overrides options within the grammar.
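The precedence rule can be pictured as a simple merge in which command-line values win. The following TypeScript sketch is illustrative only: resolveOptions and its argument shapes are hypothetical and are not part of antlr-ng, and the grammar value "Base" is made up for the example.

```typescript
// Hypothetical helper (not part of antlr-ng): merges options declared in a
// grammar with -Dname=value command-line arguments; the latter take precedence.
type GrammarOptions = { [name: string]: string };

function resolveOptions(fromGrammar: GrammarOptions, cliArgs: string[]): GrammarOptions {
    const merged: GrammarOptions = {};

    // Start with the values from the grammar's options block.
    for (const key of Object.keys(fromGrammar)) {
        merged[key] = fromGrammar[key];
    }

    // Apply every -Dname=value argument on top, overriding grammar values.
    for (const arg of cliArgs) {
        if (arg.indexOf("-D") !== 0) {
            continue; // Not an option override (for example, the grammar file name).
        }
        const eq = arg.indexOf("=");
        if (eq < 0) {
            continue; // Malformed override; ignored in this sketch.
        }
        merged[arg.substring(2, eq)] = arg.substring(eq + 1);
    }

    return merged;
}

const resolved = resolveOptions({ superClass: "Base" }, ["-DsuperClass=XX", "Hi.g4"]);
console.log(resolved.superClass); // prints "XX"
```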
superClass
Set the superclass of the generated parser or lexer. For combined grammars, it sets the superclass of the parser.
grammar Hi;
a : 'hi' ;
// Run generation: antlr-ng -DsuperClass=XX Hi.g4
// which produces this:
class HiParser extends XX {
language
Generate code in the indicated language, if antlr-ng is able to do so. Otherwise, you will see an error message like this:
$ antlr-ng -Dlanguage=C MyGrammar.g4
error(31): antlr-ng cannot generate C code as of version 1.0
tokenVocab
ANTLR assigns token type numbers to the tokens as it encounters them in a file. To use different token type values, such as with a separate lexer, use this option to have antlr-ng pull in the tokens file. antlr-ng generates a tokens file from each grammar.
lexer grammar SomeLexer;
ID : [a-z]+;
parser grammar R;
options {tokenVocab = SomeLexer;}
tokens {A, B, C} // normally, these would be token types 1, 2, 3
a : ID ;
// Lexer tokens:
ID=1
// Parser tokens:
A=2
B=3
C=4
ID=1
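The numbering in this example can be reproduced with a small model: types from the imported tokens file are kept, and tokens declared in the parser grammar receive the next unused numbers. The sketch below is a simplification and not antlr-ng's implementation; parseTokensFile and assignTokenTypes are hypothetical names, and only the simple NAME=number lines shown above are handled.

```typescript
type TokenTypes = { [name: string]: number };

// Parses lines of the form NAME=number, as in the listing above.
function parseTokensFile(content: string): TokenTypes {
    const types: TokenTypes = {};
    for (const line of content.split("\n")) {
        const eq = line.lastIndexOf("=");
        if (eq <= 0) {
            continue; // Skip blank or malformed lines.
        }
        types[line.substring(0, eq).trim()] = parseInt(line.substring(eq + 1), 10);
    }
    return types;
}

// Keeps the imported types and gives each new token the next unused number.
function assignTokenTypes(imported: TokenTypes, declared: string[]): TokenTypes {
    const assigned: TokenTypes = {};
    let maxType = 0;
    for (const key of Object.keys(imported)) {
        assigned[key] = imported[key];
        maxType = Math.max(maxType, imported[key]);
    }
    for (const key of declared) {
        if (assigned[key] === undefined) {
            maxType += 1;
            assigned[key] = maxType;
        }
    }
    return assigned;
}

const lexerTokens = parseTokensFile("ID=1");
const parserTokens = assignTokenTypes(lexerTokens, ["A", "B", "C"]);
console.log(parserTokens); // ID stays 1; A, B, C become 2, 3, 4
```

Without the imported vocabulary (an empty object as the first argument), the same function would number A, B, C as 1, 2, 3, which matches the comment in the grammar above.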
TokenLabelType
ANTLR normally uses type Token when it generates variables referencing tokens. If you have passed a TokenFactory to your parser and lexer so that they create custom tokens, you should set this option to your specific type. This ensures that the context objects know your type for fields and method return values.
grammar T2;
options {TokenLabelType = MyToken;}
a: x=ID ;
// TokenLabelType is used for x in T2Parser.
public MyToken x;
contextSuperClass
Specify the superclass of parse tree internal nodes. The default is ParserRuleContext. The class you specify must ultimately derive from ParserRuleContext. For convenience, you can use contextSuperClass=RuleContextWithAltNum, which adds a backing field for altNumber, the alternative matched for the associated rule node.
caseInsensitive
antlr-ng supports case-insensitive lexers using a grammar option. For example, the lexer from the following grammar:
lexer grammar L;
options {caseInsensitive = true; }
ENGLISH_TOKEN: [a-z]+;
GERMAN_TOKEN: [äéöüß]+;
FRENCH_TOKEN: [àâæ-ëîïôœùûüÿ]+;
CROATIAN_TOKEN: [ćčđšž]+;
ITALIAN_TOKEN: [àèéìòù]+;
SPANISH_TOKEN: [áéíñóúü¡¿]+;
GREEK_TOKEN: [α-ω]+;
RUSSIAN_TOKEN: [а-я]+;
WS: [ ]+ -> skip;
matches words such as the following:
abcXYZ äéöüßÄÉÖÜß àâæçÙÛÜŸ ćčđĐŠŽ àèéÌÒÙ áéÚÜ¡¿ αβγΧΨΩ абвЭЮЯ
ANTLR considers only single-character case mappings in all cases. For instance, the German lowercase ß is not treated as the two-character uppercase SS, and vice versa.
The mechanism works by automatically transforming grammar references to characters into their upper/lower case equivalents; e.g., a becomes [aA]. This means that you do not need to convert your input characters to uppercase; token text will be as it appears in the input stream.
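To make the transformation concrete, here is a minimal TypeScript sketch of the idea. It is not antlr-ng's code: toCaseInsensitiveSet and expandLiteral are hypothetical helpers, and relying on JavaScript's toLowerCase and toUpperCase for the case mapping is an assumption of the sketch.

```typescript
// Hypothetical helper illustrating the transformation; not antlr-ng's code.
// Returns a lexer character set matching both cases of a single character.
// Escaping of set metacharacters such as ']' or '\' is omitted for brevity.
function toCaseInsensitiveSet(ch: string): string {
    const lower = ch.toLowerCase();
    const upper = ch.toUpperCase();

    // Only one-to-one mappings are expanded; a character without a case
    // distinction, or with a multi-character mapping, is left unchanged.
    if (lower.length !== 1 || upper.length !== 1 || lower === upper) {
        return ch;
    }

    return "[" + lower + upper + "]";
}

// Rewrites a literal such as 'hi' character by character.
function expandLiteral(text: string): string {
    let out = "";
    for (let i = 0; i < text.length; i++) {
        out += toCaseInsensitiveSet(text.charAt(i));
    }
    return out;
}

console.log(expandLiteral("hi")); // prints "[hH][iI]"
console.log(toCaseInsensitiveSet("ß")); // prints "ß" (its uppercase form is "SS")
```

Because only the grammar is rewritten, the input itself is never modified, which is why token text keeps its original casing.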
Rule Options
caseInsensitive
The tool supports the caseInsensitive lexer rule option, which is described on the Lexer Rules page.
Rule Element Options
Token options have the form <name=value>. The only token option is assoc, and it accepts the values left and right. Here's a sample grammar with a left-recursive expression rule that specifies a token option on the ^ exponent operator token:
grammar ExprLR;
expr: expr '^'<assoc=right> expr
| expr '*' expr // match subexpressions joined with '*' operator
| expr '+' expr // match subexpressions joined with '+' operator
| INT // matches simple integer atom
;
INT : '0'..'9'+ ;
WS : [ \n]+ -> skip ;
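The effect of assoc=right is easiest to see by evaluating an input such as 2^3^2. The TypeScript sketch below is a hand-written precedence-climbing evaluator that mirrors the grammar's operator order; it is an illustration under that assumption, not the parser antlr-ng generates, and the names OPS, tokenize, and evaluate are hypothetical.

```typescript
// Operator table mirroring the alternatives of expr: earlier alternatives
// bind tighter, and only '^' is marked right-associative.
type OpInfo = { prec: number; rightAssoc: boolean };
const OPS: { [op: string]: OpInfo } = {
    "^": { prec: 3, rightAssoc: true },
    "*": { prec: 2, rightAssoc: false },
    "+": { prec: 1, rightAssoc: false },
};

// Splits the input into INT and operator tokens; anything else (including
// whitespace) is ignored in this sketch.
function tokenize(input: string): string[] {
    return input.match(/[0-9]+|[\^*+]/g) || [];
}

// Precedence climbing over the token list; assumes well-formed input.
function evaluate(input: string): number {
    const tokens = tokenize(input);
    let pos = 0;

    function parseExpr(minPrec: number): number {
        let left = parseInt(tokens[pos++], 10); // INT atom
        while (pos < tokens.length) {
            const info = OPS[tokens[pos]];
            if (info === undefined || info.prec < minPrec) {
                break;
            }
            const op = tokens[pos++];
            // A right-associative operator lets its right operand absorb
            // further operators of the same precedence; a left-associative
            // operator does not.
            const right = parseExpr(info.rightAssoc ? info.prec : info.prec + 1);
            left = op === "^" ? Math.pow(left, right) : op === "*" ? left * right : left + right;
        }
        return left;
    }

    return parseExpr(1);
}

console.log(evaluate("2^3^2")); // prints 512, i.e. 2^(3^2)
console.log(evaluate("2+3*4")); // prints 14
```

With the default left associativity, the first input would instead group as (2^3)^2, which is 64.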
Semantic predicates also accept an option. The only valid option is the fail option, which takes either a string literal in double quotes or an action that evaluates to a string. The string literal or string result from the action should be the message to emit upon predicate failure.
ints[int max]
locals [int i = 1]
: INT ( ',' {$i++;} {$i<=$max}? <fail={"exceeded max " + $max}> INT )*
;
The action can execute a function as well as compute a string when a predicate fails:
{...}?<fail={doSomethingAndReturnAString()}>
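As a rough picture of how the ints rule and its fail option behave, the following TypeScript sketch models the predicate by hand. It is not generated code: checkPredicate and ints are hypothetical functions, and reporting the failure by throwing a plain Error is an assumption, since the actual reporting depends on the target runtime.

```typescript
// Hypothetical stand-in for a semantic predicate with a fail option: when the
// condition is false, the message action runs and its result is reported.
function checkPredicate(condition: boolean, failMessage: () => string): void {
    if (!condition) {
        throw new Error(failMessage());
    }
}

// Mirrors the ints rule above: accepts at most `max` integers.
function ints(values: string[], max: number): number[] {
    const out: number[] = [];
    let i = 1; // locals [int i = 1]
    for (let k = 0; k < values.length; k++) {
        if (k > 0) {
            i++; // {$i++;} after each ','
            checkPredicate(i <= max, () => "exceeded max " + max);
        }
        out.push(parseInt(values[k], 10));
    }
    return out;
}

console.log(ints(["1", "2"], 2)); // two integers, within the limit
try {
    ints(["1", "2", "3"], 2);
} catch (e) {
    console.log((e as Error).message); // prints "exceeded max 2"
}
```

Passing the message as a function means it is computed only when the predicate fails, which matches the idea of an action that runs on failure.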