Serialize XML file on the basis of Character Count during an XSL transformation
  • Date: 2012-05-19 06:49:44
  • Tags: xslt

I have an XML document (A.xml) that is transformed into another XML document (B.xml). B.xml is simply a replica of A.xml with a unique @id added to each element. This part is done.

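Now I would like to implement a mechanism that splits B.xml (while it is still in the temporary tree) into one or more parts on the basis of a maximum character count.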

Source XML document (A.xml):

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <!--
    Rules for splitting:
    1. «head/text()» is common for all splits.
    2. split files can have 600 characters max each.
    3. «title» elements must not be the last element of any result document.
    -->
    <head><!-- 8 characters -->Kinesics</head>
    <section>
        <para><!-- 37 characters -->From Wikipedia, the free encyclopedia</para>
        <para><!-- 204 characters [space normalized]-->Kinesics is the interpretation of body
            language such as facial expressions and gestures — or, more formally, non-verbal
            behavior related to movement, either of any part of the body or the body as a
            whole. </para>
        <section>
            <title><!-- 19 characters -->Birdwhistell's work</title>
            <para><!-- 432 characters [space normalized]-->The term was first used (in 1952) by Ray
                Birdwhistell, an anthropologist who wished to study how people communicate through
                posture, gesture, stance, and movement. Part of Birdwhistell's work involved making
                film of people in social situations and analyzing them to show different levels of
                communication not clearly seen otherwise. The study was joined by several other
                anthropologists, including Margaret Mead and Gregory Bateson.</para>
            <para><!-- 453 characters [space normalized]--> Drawing heavily on descriptive
                linguistics, Birdwhistell argued that all movements of the body have meaning (i.e.
                are not accidental), and that these non-verbal forms of language (or paralanguage)
                have a grammar that can be analyzed in similar terms to spoken language. Thus, a
                "kineme" is "similar to a phoneme because it consists of a group of movements which
                are not identical, but which may be used interchangeably without affecting social
                meaning".</para>
        </section>
        <section>
            <title><!-- 19 characters -->Modern applications</title>
            <para><!-- 390 characters [space normalized]-->Kinesics are an important part of
                non-verbal communication behavior. The movement of the body, or separate parts,
                conveys many specific meanings and the interpretations may be culture bound. As many
                movements are carried out at a subconscious or at least a low-awareness level,
                kinesic movements carry a significant risk of being misinterpreted in an
                intercultural communications situation.</para>
        </section>
    </section>
</root>

XSL document:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:output method="xml" encoding="UTF-8" indent="no"/>

    <!--update 1-->
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <xsl:variable name="root-replica">
            <xsl:call-template name="create-root-replica">
                <xsl:with-param name="context" select="*"/>
            </xsl:call-template>
        </xsl:variable>
        <xsl:copy-of select="$root-replica"/>
        <!--
            <xsl:call-template name="split-n-serialize">
            <xsl:with-param name="context" select="$root-replica"/>
            </xsl:call-template>
        -->
    </xsl:template>

    <xsl:template name="split-n-serialize">
        <xsl:param name="context"/>
        <xsl:for-each select="$context">
            <xsl:result-document encoding="utf-8" href="{concat('split_', position(), '.xml')}" method="xml" indent="no">
                <xsl:sequence select="."/>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

    <xsl:template name="create-root-replica">
        <xsl:param name="context"/>
        <root>
            <head>
                <xsl:value-of select="$context/head"/>
            </head>
            <xsl:apply-templates select="$context/*[not(self::head)]"/>
        </root>
    </xsl:template>

    <xsl:template match="element()">
        <xsl:element name="{local-name()}">
            <xsl:attribute name="id">
                <xsl:value-of select="generate-id()"/>
            </xsl:attribute>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>

    <!--update 2-->
    <xsl:template match="text()">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

</xsl:transform>

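My input XML contains 1562 characters (assuming each run of whitespace, \s+, counts as a single space), and I would like to split A.xml into 4 parts using the rules mentioned in the source XML file.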
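For reference, the counting can be checked with a small stand-alone stylesheet. This is only a sketch, assuming that «head», «title» and «para» are the only text-bearing elements and that the count is the length of their space-normalized text; it is run against A.xml and prints one line per element followed by a total.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/">
        <!-- One line per text-bearing element: its name and its normalized length -->
        <xsl:for-each select="//head | //title | //para">
            <xsl:value-of select="name(), string-length(normalize-space(.))"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
        <!-- Grand total over the same elements -->
        <xsl:value-of select="'total',
            sum(for $e in (//head | //title | //para)
                return string-length(normalize-space($e)))"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```

If the counts annotated in the comments of A.xml are right, the total line should report 1562 (8 + 37 + 204 + 19 + 432 + 453 + 19 + 390).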

Does anyone have an idea how to do this? Any ideas or comments are highly appreciated.

Update 3

Character counts per output file:

1st File
       8
      37
     204  =  249
2nd File
       8
      19
     432  =  459
3rd File
       8
     453  =  461
4th File
       8
      19
     390  =  417

Rules:

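  1. The contents of element «head» should be part of each and every XML file.

  2. A file may be split in the middle of a «section», but not in the middle of a «para».

  3. A «title» element must not be the last element of a part.

  4. The maximum character count in a part file (excluding start and end tags) is 600.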
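The four rules above can be read as a greedy grouping of the «title» and «para» elements in document order. The following XSLT 2.0 stylesheet is a rough sketch of that reading, not a tested solution: the namespace urn:example:split, the functions f:len and f:parts, the parameter max and the mode part are all hypothetical names introduced here. It assumes that «title» and «para» contain only text, and it runs directly on A.xml (it copies any attributes, so it could equally be applied to the replica that carries the @id values).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:split"
    exclude-result-prefixes="xs f">

    <xsl:output method="xml" encoding="UTF-8" indent="no"/>
    <xsl:strip-space elements="*"/>

    <!-- Rule 4: maximum number of characters per part (assumed default) -->
    <xsl:param name="max" as="xs:integer" select="600"/>

    <!-- Space-normalized character count of an element (0 if absent) -->
    <xsl:function name="f:len" as="xs:integer">
        <xsl:param name="e" as="element()?"/>
        <xsl:sequence select="string-length(normalize-space($e))"/>
    </xsl:function>

    <!-- Returns one part number per leaf, assigned greedily in document order -->
    <xsl:function name="f:parts" as="xs:integer*">
        <xsl:param name="leaves" as="element()*"/>
        <xsl:param name="part" as="xs:integer"/>
        <xsl:param name="used" as="xs:integer"/>
        <xsl:param name="base" as="xs:integer"/>
        <xsl:if test="exists($leaves)">
            <xsl:variable name="cur" select="$leaves[1]"/>
            <!-- Rule 3: a title reserves room for the leaf that follows it -->
            <xsl:variable name="need" as="xs:integer"
                select="f:len($cur) + (if ($cur/self::title) then f:len($leaves[2]) else 0)"/>
            <!-- Start a new part only if the current one already holds a leaf -->
            <xsl:variable name="new" as="xs:boolean"
                select="$used gt $base and $used + $need gt $max"/>
            <xsl:variable name="p" select="if ($new) then $part + 1 else $part"/>
            <xsl:sequence select="$p, f:parts(subsequence($leaves, 2), $p,
                (if ($new) then $base else $used) + f:len($cur), $base)"/>
        </xsl:if>
    </xsl:function>

    <xsl:template match="/">
        <xsl:variable name="doc" select="."/>
        <xsl:variable name="head" select="root/head"/>
        <xsl:variable name="leaves" select="root//(title | para)"/>
        <!-- Rule 1: the head text is counted in every part -->
        <xsl:variable name="base" select="f:len($head)"/>
        <xsl:variable name="nums" select="f:parts($leaves, 1, $base, $base)"/>
        <xsl:for-each select="distinct-values($nums)">
            <xsl:variable name="n" select="."/>
            <xsl:result-document href="split_{$n}.xml">
                <xsl:apply-templates select="$doc/root" mode="part">
                    <!-- Keep the head plus the leaves assigned to part $n -->
                    <xsl:with-param name="keep" tunnel="yes"
                        select="$head | $leaves[position() = index-of($nums, $n)]"/>
                </xsl:apply-templates>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

    <!-- Copy an element only if it is, or contains, something to keep -->
    <xsl:template match="*" mode="part">
        <xsl:param name="keep" as="element()*" tunnel="yes"/>
        <xsl:if test="exists(descendant-or-self::* intersect $keep)">
            <xsl:copy>
                <xsl:copy-of select="@*"/>
                <xsl:apply-templates mode="part"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>

    <xsl:template match="text()" mode="part">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

</xsl:stylesheet>
```

Traced by hand against the counts above, the grouping comes out as parts of 249, 459, 461 and 417 characters. Two cases are left open in this sketch: a single «para» longer than the limit still ends up alone in an oversized part, and a «title» directly followed by another «title» is not handled specially.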

Sample output documents (indented for better readability)

1st file

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <head>Kinesics</head>
    <section id="d1e6">
        <para id="d1e7">From Wikipedia, the free encyclopedia</para>
        <para id="d1e10">Kinesics is the interpretation of body language such as facial expressions and gestures — or, more formally, non-verbal behavior related to movement, either of any part of the body or the body as a whole.</para>
    </section>
</root>

2nd file

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <head>Kinesics</head>
    <section id="d1e6">
        <section id="d1e13">
            <title id="d1e14">Birdwhistell's work</title>
            <para id="d1e17">The term was first used (in 1952) by Ray Birdwhistell, an anthropologist who wished to study how people communicate through posture, gesture, stance, and movement. Part of Birdwhistell's work involved making film of people in social situations and analyzing them to show different levels of communication not clearly seen otherwise. The study was joined by several other anthropologists, including Margaret Mead and Gregory Bateson.</para>
        </section>
    </section>
</root>

3rd file

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <head>Kinesics</head>
    <section id="d1e6">
        <section id="d1e13">
            <para id="d1e20">Drawing heavily on descriptive linguistics, Birdwhistell argued that all movements of the body have meaning (i.e. are not accidental), and that these non-verbal forms of language (or paralanguage) have a grammar that can be analyzed in similar terms to spoken language. Thus, a "kineme" is "similar to a phoneme because it consists of a group of movements which are not identical, but which may be used interchangeably without affecting social meaning".</para>
        </section>
    </section>
</root>

4th file

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <head>Kinesics</head>
    <section id="d1e6">
        <section id="d1e23">
            <title id="d1e24">Modern applications</title>
            <para id="d1e27">Kinesics are an important part of non-verbal communication behavior. The movement of the body, or separate parts, conveys many specific meanings and the interpretations may be culture bound. As many movements are carried out at a subconscious or at least a low-awareness level, kinesic movements carry a significant risk of being misinterpreted in an intercultural communications situation.</para>
        </section>
    </section>
</root>