Is anybody familiar with the RTF document format and with parsing it using any Java libraries? The standard way people have done this is by using the RTFEditorKit in the JDK Swing API:
Swing RTFEditorKit API
However, it isn't that accurate when it comes to parsing RTF documents. In fact, there is a comment in the API:
The RTF support was not written by the Swing team. In the future we hope to improve the support provided.
I don't think I'm going to wait for this to happen :)
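For reference, the RTFEditorKit route I mean looks roughly like the sketch below. The class name `RtfTextExtractor` and the `extractText` helpers are just names I made up for illustration; the only JDK pieces involved are `RTFEditorKit.createDefaultDocument()`, `RTFEditorKit.read(InputStream, Document, int)` and `Document.getText(int, int)`. It only pulls out the plain text, and how faithful that text is depends entirely on what the kit understands.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class: the name and methods are illustrative, not part of any library.
final class RtfTextExtractor {

    // Reads RTF from a stream into a Swing document and returns its plain text.
    static String extractText(InputStream rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        kit.read(rtf, doc, 0);                      // parse the RTF into the document model
        return doc.getText(0, doc.getLength());     // formatting is dropped, text is kept
    }

    // Convenience overload for RTF source held in a string (RTF files are 7-bit text).
    static String extractText(String rtfSource) throws IOException, BadLocationException {
        byte[] bytes = rtfSource.getBytes(StandardCharsets.ISO_8859_1);
        return extractText(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        // A tiny hand-written RTF document used only as a smoke test.
        String sample = "{\\rtf1\\ansi Hello, RTF!\\par}";
        System.out.println(extractText(sample));
    }
}
```

This is fine for simple files like the sample; the problems I'm describing show up with real-world documents.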
The other approach is to define a grammar with JavaCC and generate a parser. This works better, but I can't find a complete grammar. I have tried:
which is OK, and the following (which is the best so far):
Koders RTFParserDelegate and ETranslate Grammar
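There are various implementations of the ETranslate grammar (I know the Nutch API may use it). Does anyone know which grammar is the most accurate, or whether there is a better approach?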
I could start ploughing through the JavaCC docs to understand the .jj files and test them against my RTF files. That is my current approach, but it's taking a while, so any help would be appreciated.