Infinite loop while parsing XHTML using DocumentBuilder "parse"

I have a method that loads an XHTML document from a java.io.InputStream and returns it as an org.w3c.dom.Document.

private Document loadDocFrom(InputStream is) throws SAXException,
        IOException, ParserConfigurationException {
    DocumentBuilderFactory domFactory = DocumentBuilderFactory
            .newInstance();
    domFactory.setNamespaceAware(true); // never forget this
    DocumentBuilder builder = domFactory.newDocumentBuilder();

    Document doc = builder.parse(is);
    is.close();
    return doc;
}

This method works; I have tested it with some XHTML files (for example http://pastebin.com/L2kHwggU) and with XHTML websites.

However, for some documents, such as http://pastebin.com/v675yWSJ, and even for websites like www.w3.org, it goes into an infinite loop at Document doc = builder.parse(is);.

EDIT:

@Michael Kay found the problem, but I am still waiting for his solution.

Another possible solution is to ignore the DTD:

domFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
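
As a minimal sketch, the setting above could be combined with the loading method like this. The feature URI is specific to Xerces-based parsers (the JDK's built-in parser is one of them), so other implementations may reject it. The class name, the sample document and the try-with-resources block are illustrative assumptions, not part of the original code. Note that entities declared only in the external DTD (such as &nbsp;) will not be defined when the DTD is skipped.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; only loadDocFrom mirrors the method from the question.
final class NoExternalDtdDemo {

    // Xerces-specific feature: when false, the external DTD is not fetched.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Small sample document (an assumption) whose DOCTYPE points at the W3C site.
    static final String SAMPLE =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>t</title></head><body><p>hi</p></body></html>";

    static Document loadDocFrom(InputStream is) throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        // Skip the external DTD so parse() does not wait on the network.
        domFactory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        // Close the stream even if parsing throws.
        try (InputStream in = is) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        InputStream is = new ByteArrayInputStream(
                SAMPLE.getBytes(StandardCharsets.UTF_8));
        Document doc = loadDocFrom(is);
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```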

Thanks for your help.

Best answer

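I think your diagnosis of an infinite loop is incorrect; it is simply taking a very long time, which is not the same thing.

The usual cause is that the document references the XHTML DTD on the W3C website, and the parser goes out to the network to fetch it rather than using a local copy. About a year ago, W3C started to "throttle" requests for these common DTDs because they could no longer cope with the traffic.

The usual solution is to use a resolver to redirect the request to a local copy.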
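
One way to do this with the standard JAXP API is an org.xml.sax.EntityResolver set on the DocumentBuilder. The sketch below is only an illustration under stated assumptions: the class name is hypothetical, and the "local copy" is a one-line stand-in DTD held in a string so that the example runs on its own. In practice the resolver would return a downloaded copy of the real XHTML DTD from disk or from the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical demo class, not taken from the original answer.
final class LocalDtdResolverDemo {

    static final String XHTML_STRICT_PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Strict//EN";

    // Stand-in for a local DTD copy (assumption): it declares only &nbsp;.
    static final String LOCAL_DTD = "<!ENTITY nbsp \"&#160;\">";

    // Serve the local copy for the known public ID. Returning null for
    // anything else lets the parser fall back to its default behaviour.
    static final EntityResolver LOCAL_RESOLVER = (publicId, systemId) -> {
        if (!XHTML_STRICT_PUBLIC_ID.equals(publicId)) {
            return null;
        }
        InputSource source = new InputSource(new StringReader(LOCAL_DTD));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    };

    static Document parse(String xhtml) throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        // Redirect DTD requests to the local copy instead of www.w3.org.
        builder.setEntityResolver(LOCAL_RESOLVER);
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        String xhtml =
                "<!DOCTYPE html PUBLIC \"" + XHTML_STRICT_PUBLIC_ID + "\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>t</title></head><body><p>a&nbsp;b</p></body></html>";
        Document doc = parse(xhtml);
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

Unlike switching off DTD loading altogether, this approach keeps entities from the DTD available to the parser, because the DTD is still read, only from a local source.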
