Define comment and quote in flex and bison and lexical errors

I have to create a lexical and syntax analyzer for a c-like language. In this language we define as comment "everything that exists after the symbol % until the end of line". Are the following declarations correct?

Flex
...
[%][^\n]*[\n]  { return T_COMMENT; }
[\n]   { return T_NEWLINE; }

Bison
...
comment:com text newline;
text: |name text|digit text;

...
com: T_COMMENT   { printf("%s",yytext); };
newline: T_NEWLINE  { printf("%s",yytext); };

I also need to define the quote symbol ". Is the following correct (flex)?

"\""   { return T_QUOTE; }

There is no compile error in the flex and bison input files, but when I use a program written in this c-like language as test input, I get a lexical error in line 1, even though there is no lexical error in that line. My program has to start like this: PROGRAM name_of_program, followed by a compulsory newline. I make the following declarations:

Flex

"PROGRAM"  { return T_PROGRAM; }

Bison

%start programma
%token T_PROGRAM
...
programma:PROGRAM name newline function STARTMAIN dec_var command ENDMAIN eof;
...
PROGRAM: T_PROGRAM  { printf("%s",yytext); };
...

(Words in upper case are defined like PROGRAM, as they are part of the language.) Did I write anything wrong? I think the problem is with the newline definition, but I am not sure.

Thank you in advance for any answer. Sorry for the long post.

Answer

Generally, comments are handled by the lexer and not passed to the parser. If your language is truly C-like, then in most cases a newline should be treated like any other whitespace. Comments and quoted strings are the notable exceptions. Quoted strings are usually captured by the lexer using start states and passed to the parser whole.
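To make that division of labour concrete, here is a minimal hand-written C sketch of what such a lexer does. It is not flex output, and the names Token, next_token and the TOK_* constants are made up for illustration. It assumes strings have no escape sequences and do not span lines. Everything from % to the end of the line is dropped, the newline itself is still reported (this language requires one after the program name), and a quoted string comes back as one token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, for illustration only. */
enum { TOK_EOF, TOK_NEWLINE, TOK_STRING, TOK_WORD, TOK_ERROR };

typedef struct {
    int kind;
    char text[64];   /* string contents or word; empty for other kinds */
} Token;

/* Copy len bytes starting at start into t->text, truncating if needed. */
static void set_text(Token *t, const char *start, size_t len)
{
    if (len >= sizeof t->text)
        len = sizeof t->text - 1;
    memcpy(t->text, start, len);
    t->text[len] = '\0';
}

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    const char *p = *src;
    Token t = { TOK_EOF, "" };

    /* Skip blanks and tabs, but not newlines: they are significant here. */
    while (*p == ' ' || *p == '\t')
        p++;

    /* Comment: discard up to, but not including, the newline. */
    if (*p == '%')
        while (*p != '\0' && *p != '\n')
            p++;

    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (*p == '\n') {
        t.kind = TOK_NEWLINE;
        p++;
    } else if (*p == '"') {
        const char *start = ++p;           /* first character after the quote */
        while (*p != '\0' && *p != '\n' && *p != '"')
            p++;
        if (*p == '"') {
            t.kind = TOK_STRING;           /* whole string, quotes removed */
            set_text(&t, start, (size_t)(p - start));
            p++;                           /* consume the closing quote */
        } else {
            t.kind = TOK_ERROR;            /* unterminated string */
        }
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        const char *start = p;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        t.kind = TOK_WORD;
        set_text(&t, start, (size_t)(p - start));
    } else {
        t.kind = TOK_ERROR;                /* character this sketch does not know */
        p++;
    }

    *src = p;
    return t;
}
```

Because the string branch scans to the closing quote on its own, a % inside a quoted string is kept as text rather than starting a comment. In flex, a start state for strings achieves the same effect.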

Your flex code overuses character classes. You don't need a class to match a single character; just write the character, with a backslash escape if needed. Also, . (a dot) matches any character except a newline, so a dot followed by * runs up to, but not including, the end of the line.

Also, you don't have any definition for the name_of_program token. Assuming it is a C-style identifier, you can declare an identifier pattern and token in flex and pass it up to bison.
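One caveat when passing the identifier text up: flex's yytext points into a buffer that the scanner reuses, so a pointer stored directly in yylval can refer to different text by the time the parser looks at it. The usual remedy is to copy the lexeme, for example with strdup(yytext). The sketch below imitates that situation in plain C; scan_buffer, fake_scan and copy_lexeme are hypothetical stand-ins, not part of flex, and copy_lexeme simply does what strdup does.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the scanner's internal buffer, overwritten on every token. */
static char scan_buffer[32];

/* Hypothetical stand-in for one yylex() call: load the next lexeme and
 * return a pointer to it, the way yytext points into flex's buffer. */
const char *fake_scan(const char *lexeme)
{
    strncpy(scan_buffer, lexeme, sizeof scan_buffer - 1);
    scan_buffer[sizeof scan_buffer - 1] = '\0';
    return scan_buffer;
}

/* Make a private copy of the lexeme; the caller frees it.
 * Returns NULL if the allocation fails. */
char *copy_lexeme(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, text, n);
    return copy;
}
```

If the pointer returned by fake_scan("myprog") is kept and fake_scan("STARTMAIN") is called afterwards, the kept pointer now reads STARTMAIN, while a copy made with copy_lexeme in between still reads myprog.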

Finally, you might want to adopt the naming convention of using all caps for tokens passed to bison from flex, and lowercase for the nonterminals defined within bison.

So, from what you've described, I have the following:

example.l:

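%{
#include <string.h>       /* strdup */
#include "example.tab.h"  /* token codes and yylval; assumes bison -d example.y */
%}

%%

\%.*                    /* comment: discard everything up to the end of the line */
\n                      { return T_NEWLINE; }
\"                      { return T_QUOTE; }
PROGRAM                 { return T_PROGRAM; }
[A-Za-z_][A-Za-z0-9_]*  { yylval.id = strdup(yytext); /* copy: yytext is reused */ return T_IDENTIFIER; }

%%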

example.y:

%%

programma: T_PROGRAM T_IDENTIFIER T_NEWLINE function STARTMAIN dec_var command ENDMAIN eof;

text: 
    | name text
    | digit text;

%%

I'm not sure you need the eof token in there.

I hope this helps.




