Large XML Files and Pagination, is it possible?

The problem

When opening very large XML files locally, on your machine, it's almost a certainty that the file will take an age to open. Often the computer appears to lock up, because the application opening the file is reported as not responding.

This is an issue if you serve users XML backups of rather complex databases or systems they use: the likelihood of them being able to open large backups, let alone use them, is slim.

Is pagination possible?

I use XSLT to present readable backups to users. In the same way, would it be possible to pull in only one page of data at a time, so that the entire file is not read in one go and the issues above are avoided?

I imagine the answer is simply a no - but I would like to know if anyone else has seen the same issues and resolved them.

Note: This is on a local machine only, it must not require an internet connection. JavaScript can be used if it makes things easier.


Pagination with XSLT is possible, but will probably not lead to the desired results: a conventional XSLT processor has to parse the whole XML document into an in-memory tree before it can transform anything, so the file is still read in full.

What you could do is experiment with streaming transformations: http://stx.sourceforge.net/

Or you could preprocess the large XML file to cut it up into smaller pieces before processing them with XSLT. For this I'd use a command line tool like XMLStarlet.
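As a rough illustration of the preprocessing idea (not of XMLStarlet itself), here is a minimal Python sketch that cuts a backup into smaller files using the standard library's incremental parser. The function name `split_xml`, the `part-0000.xml` naming scheme, and the assumption that the backup is a flat list of record elements directly under the root are all hypothetical; adapt them to the real structure of your file.

```python
import os
import xml.etree.ElementTree as ET


def split_xml(source, record_tag, chunk_size, out_dir):
    """Write every `chunk_size` records to its own file; return the file paths.

    Assumes (hypothetically) that records are direct children of the root.
    """
    os.makedirs(out_dir, exist_ok=True)
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    # Copy tag and attributes now, because root.clear() below resets them.
    root_tag, root_attrib = root.tag, dict(root.attrib)
    paths, records = [], []

    def flush():
        # Wrap the buffered records in a copy of the original root element.
        wrapper = ET.Element(root_tag, root_attrib)
        wrapper.extend(records)
        path = os.path.join(out_dir, f"part-{len(paths):04d}.xml")
        ET.ElementTree(wrapper).write(path, encoding="utf-8", xml_declaration=True)
        paths.append(path)
        records.clear()

    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            records.append(elem)
            root.clear()  # detach parsed records from the root to keep memory flat
            if len(records) == chunk_size:
                flush()
    if records:
        flush()  # write the final, possibly shorter, chunk
    return paths
```

Each output file keeps the original root element, so the same XSLT stylesheet should be able to render any single part as one "page".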


Right on, very good question!

The XSLT implementations I know require a DOM, so they are bound to access the entire document (although that could perhaps be done in a lazy fashion).

Anyway, you should take a look at VTD-XML: http://vtd-xml.sourceforge.net/

The latest SAXON XSLT processor also has rudimentary support for what is called "Streaming XSLT". Read about that here: http://www.saxonica.com/documentation/index/intro.html

That said, database backups are probably not the right use case for XML. If you have to deal with XML database backups, I would try to get away from those as fast as possible. The same goes for logs: a linear process should work by simply appending things. I mean, it would be even better if XML allowed a forest as the top-level structure, but I think that is never going to happen.

XMLMax Virtual xml editor will read, parse and display a 1 GB XML file in a treeview in about 30 seconds on a fast PC. Windows only. It will work with XML of any size or structure.

Hi, I don't know what programming language you are using, but in C#, using XmlReader, I can read the file tag by tag instead of loading the whole file. This way you can read only the first page and then stop reading. Best Regards, Iordan
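The same tag-by-tag approach can be sketched outside C#. The Python example below is only an illustration under stated assumptions: `read_page` is a hypothetical helper, and it assumes the records are direct children of the root and that each record holds simple child elements with text. It uses the standard library's `iterparse`, which plays a role similar to the reader described above.

```python
import io
import xml.etree.ElementTree as ET


def read_page(source, record_tag, page, page_size):
    """Return the records of one page (0-based) as dicts of child tag -> text."""
    start = page * page_size
    stop = start + page_size
    records = []
    seen = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or elem.tag != record_tag:
            continue
        if seen >= start:
            records.append({child.tag: child.text for child in elem})
        seen += 1
        root.clear()  # drop records that were already handled
        if seen >= stop:
            break  # stop here; the rest of the file is never parsed
    return records


# Usage with a small in-memory document standing in for a large backup file.
sample = "<backup>" + "".join(f"<row><id>{i}</id></row>" for i in range(10)) + "</backup>"
print(read_page(io.BytesIO(sample.encode()), "row", page=1, page_size=3))
```

Note that later pages still require scanning past the earlier records, so the first page is cheap while the last page costs about as much as reading the whole file; memory use stays small either way.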

One way to alleviate this problem would be to split the large XML files into a number of smaller XML documents. Depending on the type of data, you may split or partition the file in any number of ways (e.g. by day, transaction, or entity).

This will introduce a number of other challenges, of course. For instance, you will have to come up with a specialized parser if you need to view the data as a whole or across partitions.
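For viewing the data across partitions, one possible shape for such a reader is a generator that walks the partition files in order and yields records as a single stream. This is only a sketch: `iter_records` is a hypothetical name, and it assumes the partitions sit in one directory and that sorting the file names gives the intended order (for example, names based on ISO dates).

```python
import glob
import os
import xml.etree.ElementTree as ET


def iter_records(directory, record_tag, pattern="*.xml"):
    """Yield record elements from every partition file, in sorted file-name order."""
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        for _, elem in ET.iterparse(path, events=("end",)):
            if elem.tag == record_tag:
                yield elem
                # Runs when the caller asks for the next record, so each
                # element must be used before advancing the generator.
                elem.clear()
```

A caller can then filter or count across all partitions with an ordinary loop, for example `sum(1 for _ in iter_records("backups", "row"))`, where the directory and tag names are placeholders.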

how to represent it in dtd?

I have two elements, action and guid. guid is a required field when action is add, but when action is del it will not appear in the file. How do I represent this in a DTD?

.Net application configuration add xml-data

I need to add xml-content to my application configuration file. Is there a way to add it directly to the appSettings section or do I need to implement a configSection? Is it possible to add the xml ...

XStream serializing collections

I have a class structure that I would like to serialize with Xstream. The root class contains a collection of other objects (of varying types). I would like to only serialize part of the objects that ...

MS Word splits words in its XML format

I have a Word 2003 document saved as a XML in WordProcessingML format. It contains few placeholders which will be dynamically replaced by an appropriate content. But, the problem is that Word ...

Merging an XML file with a list of changes

I have two XML files that are generated by another application I have no control over. The first is a settings file, and the second is a list of changes that should be applied to the first. Main ...

How do I check if a node has no siblings?

I have an org.w3c.dom.Node object. I would like to see if it has any other siblings. Here's what I have tried: Node sibling = node.getNextSibling(); if(sibling == null) return true; else ...

Ordering a hash to xml: Rails

I'm building an xml document from a hash. The xml attributes need to be in order. How can this be accomplished? hash.to_xml
