How do I use the XML DOM API to visit every non-text node?
  • Time: 2009-04-06 08:04:19

I am new to the XML DOM. I think I need to use the DOM API to walk through every non-text node and output each node's name.

Say I have this example XML from the W3C:

<bookstore>

<book category="cooking">
 <title lang="en">Everyday Italian</title>
 <author>Giada De Laurentiis</author>
 <year>2005</year>
 <price>30.00</price>
 <page pagenumber="550"/>
</book>

<book category="children">
 <title lang="en">Harry Potter</title>
 <author>J K. Rowling</author>
 <year>2005</year>
 <price>29.99</price>
 <page pagenumber="500"/>
</book>
</bookstore>

I need to find nodes such as <page pagenumber="500"/>, which is a non-text node.

How can I do this? Pseudo-code would be fine too. Thanks.

Can I say

 while (x.nodeValue == NULL) {
   read the next node ?
}

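I think I should make clear that I am not assuming anything about the document. This should work on any XML as long as it contains a non-text node. I think it should visit every node in top-down, left-to-right order. :(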

Best answer

XPATH ="//*[not(text())]"
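This will select every element that has no text-node children.
In the given example, bookstore and book also count as non-text nodes, since they do not have any text of their own, though their children do have text. This holds only if the parser drops the whitespace-only text between tags; if it keeps that whitespace, //*[not(text()[normalize-space()])] ignores it.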
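
For readers without an XPath engine, here is a minimal sketch of the same selection done as a plain DOM walk in document order. It assumes Python's standard xml.dom.minidom, which the thread does not mention; the helper elements_without_text and its ignore_whitespace flag are hypothetical names, and the sample is a shortened copy of the question's XML.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Shortened copy of the sample document from the question.
SAMPLE = """<bookstore>
<book category="cooking">
 <title lang="en">Everyday Italian</title>
 <page pagenumber="550"/>
</book>
<book category="children">
 <title lang="en">Harry Potter</title>
 <page pagenumber="500"/>
</book>
</bookstore>"""


def elements_without_text(node, ignore_whitespace=False):
    """Yield names of elements that have no text-node children, top-down, left to right."""
    for child in node.childNodes:
        if child.nodeType != Node.ELEMENT_NODE:
            continue  # skip text, comments and other non-element nodes
        texts = [c for c in child.childNodes
                 if c.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)]
        if ignore_whitespace:
            # Treat indentation and newlines between tags as "no text".
            texts = [t for t in texts if t.data.strip()]
        if not texts:
            yield child.tagName
        # Recurse so that nested elements are visited in document order.
        yield from elements_without_text(child, ignore_whitespace)


doc = minidom.parseString(SAMPLE)
strict = list(elements_without_text(doc))
loose = list(elements_without_text(doc, ignore_whitespace=True))
print(strict)
print(loose)
```

With whitespace counted as text, only the page elements qualify; with whitespace ignored, bookstore and book qualify as well, which matches the behaviour described in the answer.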

Other answers

Basically your question seems to be: given an XML document, how do I find the nodes that have no text content as child nodes?

A simple XPath expression, such as:

/bookstore/book/*[count(child::text()) = 0]

or

/bookstore/book/*[not(text())]

will do it for you. Applying this XPath expression to the sample document will return a node-set containing both page elements. As you can see, you do not have to know the name of the page element beforehand, or even the names of all possible child elements of the book element.

To explain: you need to query for child elements of the book element that do not have ANY text nodes as children. The * step (short for child::*) selects all child elements of the current node, and the text() node test matches text nodes, so the predicate not(text()) keeps only those elements that have no text-node children.
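
The predicate described above can be sketched without XPath support for not(text()). This example assumes Python's standard xml.etree.ElementTree, which is not part of the original answer; has_text_child is a hypothetical helper name. ElementTree stores text nodes as the .text of an element and the .tail of each of its children, so the helper checks both.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question.
SAMPLE = """<bookstore>
<book category="cooking">
 <title lang="en">Everyday Italian</title>
 <year>2005</year>
 <page pagenumber="550"/>
</book>
<book category="children">
 <title lang="en">Harry Potter</title>
 <year>2005</year>
 <page pagenumber="500"/>
</book>
</bookstore>"""


def has_text_child(elem):
    """Return True if elem has at least one text node as a direct child."""
    # Text before the first child lives in elem.text; text after each
    # child lives in that child's .tail.
    return bool(elem.text) or any(child.tail for child in elem)


root = ET.fromstring(SAMPLE)  # root is the <bookstore> element
# Rough equivalent of /bookstore/book/*[not(text())]
matches = [child
           for book in root.findall("book")
           for child in book
           if not has_text_child(child)]
for m in matches:
    print(m.tag, m.attrib)
```

Only the two page elements are selected; title and year are skipped because each has a text child.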

Edit: Note that if you want to query for non-text nodes in any XML document (as per your latest edit to the question), you should choose the answer provided by nils_gate. My answer was given prior to your edit and illustrates the concept, rather than providing a generic solution.

Do you know which node you need to find? If you know that it is:

  • A page element
  • It has a pagenumber attribute with value 500

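then XPath is the way forward (assuming it's available on your platform; you haven't specified anything beyond "DOM", and most DOM implementations I've seen include XPath).

In that case, you'd use the XPath: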

//page[@pagenumber='500']

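If you can't use XPath, please explain which DOM API you are using, and we can try to come up with the best solution. Basically you will probably end up iterating over every element node, checking whether its name is page, and then checking whether it has an appropriate pagenumber attribute value.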
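
The manual iteration just described might look like the following sketch. It assumes Python's standard xml.dom.minidom purely for illustration, since the question never says which DOM API is in use; the sample is a shortened copy of the question's XML.

```python
from xml.dom import minidom

# Shortened copy of the sample document from the question.
SAMPLE = """<bookstore>
<book category="cooking">
 <title lang="en">Everyday Italian</title>
 <page pagenumber="550"/>
</book>
<book category="children">
 <title lang="en">Harry Potter</title>
 <page pagenumber="500"/>
</book>
</bookstore>"""

doc = minidom.parseString(SAMPLE)
found = []
# getElementsByTagName("*") returns every element in document order.
for elem in doc.getElementsByTagName("*"):
    # First check the element name, then the attribute value.
    if elem.tagName == "page" and elem.getAttribute("pagenumber") == "500":
        found.append(elem)

for elem in found:
    print(elem.tagName, elem.parentNode.getAttribute("category"))
```

getAttribute returns an empty string when the attribute is missing, so elements without a pagenumber attribute simply fail the comparison.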

It looks like you need XPath. The W3Schools site has a good reference (http://www.w3schools.com/xpath/xpath_syntax.asp), but, assuming the page node always appears inside a book node, the XPath /bookstore/book/page will return a node set containing every page node. /bookstore/book/page[@pagenumber='500'] will get every page node whose pagenumber attribute has the value 500.

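The // syntax will find nodes anywhere in the document without worrying about structure. This can be easier but slower, especially on large documents. If you are working with a known document structure, it's best to use an explicit XPath.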
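
As an illustration of the two styles, here is a short sketch using Python's standard xml.etree.ElementTree, which is an assumption and supports only a subset of XPath. Paths there are relative to the element being searched, so /bookstore/book/page becomes book/page when searching from the bookstore root, and //page becomes .//page.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question.
SAMPLE = """<bookstore>
<book category="cooking">
 <title lang="en">Everyday Italian</title>
 <page pagenumber="550"/>
</book>
<book category="children">
 <title lang="en">Harry Potter</title>
 <page pagenumber="500"/>
</book>
</bookstore>"""

root = ET.fromstring(SAMPLE)  # root is the <bookstore> element

# Explicit path: only looks at page elements directly under a book.
all_pages = root.findall("book/page")

# Explicit path with an attribute predicate.
explicit = root.findall("book/page[@pagenumber='500']")

# Descendant search: looks for page elements at any depth.
anywhere = root.findall(".//page[@pagenumber='500']")

print([p.get("pagenumber") for p in all_pages])
print(len(explicit), len(anywhere))
```

On this document both queries find the same element; the difference only matters for cost on large documents or when page can appear at other depths.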




