ANTLR4: How to override text in lexer subrule/fragment
  • Date: 2023-07-11 02:16:04
  • Tags:
  • antlr
  • antlr4

The syntax I'm trying to parse includes a continuation indicator in column 71. Identifiers, literals, and almost anything else can be continued onto the next line.

Ideally, I would like to drop the characters that make up the continuation token, so that I'm left with only the identifier characters. However, with the following lexer rules, the setText("") in LINE_CONTINUATION is ignored, which pollutes the final IDENTIFIER token.

IDENTIFIER 
    : 
    {getCharPositionInLine() < 71 }? IDENTIFIER_PART
    (
            {getCharPositionInLine() < 71 }? IDENTIFIER_PART  
        |   LINE_CONTINUATION 
    )*
;
fragment IDENTIFIER_PART: (LETTER|DIGIT|'_');
fragment DIGIT: [0-9];
fragment LETTER options { caseInsensitive=true; } : [A-Z];

//A continuation line is non-blank in column 72, followed by anything until EOL,
//then on next line the characters starting after column position 15
LINE_CONTINUATION
    : 
    {getCharPositionInLine() == 71 }? 
    ~[ ] 
    ~[\r\n]* EOL
    ({getCharPositionInLine() <= 15 }? [ ] )+  
    {setText("");}
; 

Is there any way of overriding the text of a subrule (or fragment) in the same way that the text of a root rule can be overridden?

For example, given a list of identifiers defined as:

AAAAAAAAAAAA,BBBBBBBBBBB,CCCCCCCCCCCCCCCCC,DDDDDDDDDDD,EEEEEEEEEE,FFFF* Some comment
FFFF,GGGGGGGG

I'm trying to tokenise it as:

AAAAAAAAAAAA
BBBBBBBBBBB
CCCCCCCCCCCCCCCCC
DDDDDDDDDDD
EEEEEEEEEE
FFFFFFFF
GGGGGGGG

However, I'm getting:

AAAAAAAAAAAA
BBBBBBBBBBB
CCCCCCCCCCCCCCCCC
DDDDDDDDDDD
EEEEEEEEEE
FFFF* Some comment
FFFF
GGGGGGGG

Answers

That is not possible. Try this instead:

IDENTIFIER
 : {getCharPositionInLine() < 71 }? IDENTIFIER_PART
   ( {getCharPositionInLine() < 71 }? IDENTIFIER_PART  
   | LINE_CONTINUATION 
   )*
   {
     String text = getText();
     setText(text.replaceAll("\\S[^\r\n]*[\r\n]+[ ]{0,15}", ""));
   }
;
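
One caveat with a purely regex-based cleanup: a pattern that starts with \S can begin matching at the first character of the identifier itself, so the part of the identifier before the continuation marker may be dropped too. A position-based alternative is sketched below in Java. It is only a sketch under assumptions: ContinuationStripper and strip are hypothetical names (not part of ANTLR), the marker position 71 and the limit of 16 leading blanks are read off the predicates in the question's grammar, and line breaks are assumed to be \n, \r or \r\n.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of ANTLR): strips continuation text from a token. */
final class ContinuationStripper {
    // Assumption: 0-based position of the continuation marker,
    // mirroring {getCharPositionInLine() == 71}? in the question.
    static final int MARKER_POSITION = 71;
    // Assumption: at most 16 blanks (positions 0..15) lead a continued line,
    // mirroring {getCharPositionInLine() <= 15}? in the question.
    static final int MAX_LEADING_BLANKS = 16;

    private static final Pattern LINE_BREAK = Pattern.compile("\\r\\n|\\r|\\n");

    /**
     * @param tokenText           raw text matched by the lexer rule
     * @param startPositionInLine 0-based position in its line where the token starts
     * @return the token text without markers, trailing comments, line breaks
     *         and the leading blanks of continued lines
     */
    static String strip(String tokenText, int startPositionInLine) {
        String[] lines = LINE_BREAK.split(tokenText, -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int offset; // position in the source line of line.charAt(0)
            if (i == 0) {
                offset = startPositionInLine;
            } else {
                // Skip the blanks that lead a continued line.
                int blanks = 0;
                while (blanks < line.length()
                        && blanks < MAX_LEADING_BLANKS
                        && line.charAt(blanks) == ' ') {
                    blanks++;
                }
                line = line.substring(blanks);
                offset = blanks;
            }
            // Keep only the characters that sit before the marker position.
            int keep = Math.max(0, Math.min(line.length(), MARKER_POSITION - offset));
            out.append(line, 0, keep);
        }
        return out.toString();
    }
}
```

Inside the IDENTIFIER action this could be called as setText(ContinuationStripper.strip(getText(), _tokenStartCharPositionInLine)); where _tokenStartCharPositionInLine is the Lexer field that holds the position in the line at which the current token started.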
