Incremental Parsing Using the Consumer API
November 5, 2004 | Fredrik Lundh
The ElementTree library provides several ways to parse XML documents.
The most common way is to read the document from a file or an input stream, using the parse function:
    from elementtree import ElementTree

    tree = ElementTree.parse("document.xml")
    root = tree.getroot()
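The parse function also accepts an open stream in place of a filename. The following is a minimal sketch of that case. It assumes the standard-library descendant of this package, xml.etree.ElementTree, rather than the standalone elementtree distribution, and it uses an in-memory io.BytesIO object as a stand-in for a real file or socket:

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib version of the library

# An in-memory byte stream stands in for an open file or network stream.
stream = io.BytesIO(b"<document>body</document>")

# parse() reads from any object with a read() method, not only a filename.
tree = ET.parse(stream)
root = tree.getroot()

print(root.tag, root.text)
```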
Alternatively, you can create an empty ElementTree instance, and use the parse method to load a document into it:
    from elementtree import ElementTree

    tree = ElementTree.ElementTree()
    tree.parse("document.xml")
    root = tree.getroot()
The XML helper can be used to create an XML document from a string buffer (or a string literal):
    from elementtree import ElementTree

    root = ElementTree.XML("<document>body</document>")
You can also use the parser and tree builder components directly, to get more control over the document build process. The core XML parser component is called XMLTreeBuilder. This class implements the standard consumer interface, which lets you feed data to the parser, piece by piece:
    from elementtree import ElementTree

    parser = ElementTree.XMLTreeBuilder()
    parser.feed("<document>")
    parser.feed("body")
    parser.feed("</docu")
    parser.feed("ment>")
    root = parser.close()
The pieces can be of any size, and tags and entities can be spread over multiple pieces.
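To illustrate the point about entities, here is a small sketch in which both a tag and an entity reference are cut in the middle. It is written against the standard library's xml.etree.ElementTree, where the XMLParser class offers the same feed and close methods. Substituting XMLParser for XMLTreeBuilder is an assumption made so the example runs on a current Python:

```python
import xml.etree.ElementTree as ET  # assumption: stdlib version of the library

# XMLParser stands in for XMLTreeBuilder here; it has the same
# feed()/close() consumer interface.
parser = ET.XMLParser()

# The start tag and the "&amp;" entity are both split across pieces.
pieces = ["<docu", "ment>fish &a", "mp; chips</document>"]
for piece in pieces:
    parser.feed(piece)

root = parser.close()
print(root.text)
```

The parser buffers incomplete tokens between calls, so the resulting text is the same as if the document had been fed in one piece.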
Note that the close method returns the resulting document root (as an Element instance). If you want an ElementTree, just wrap it as usual:
    from elementtree import ElementTree

    parser = ElementTree.XMLTreeBuilder()
    parser.feed("<document>bo")
    parser.feed("dy</document>")
    tree = ElementTree.ElementTree(parser.close())
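In practice, the pieces usually come from a stream that is read in fixed-size blocks. The sketch below combines the feed loop with the wrapping step. The helper name parse_in_chunks and its chunk_size parameter are hypothetical, chosen for illustration only, and the code again assumes the standard library's xml.etree.ElementTree with XMLParser in place of XMLTreeBuilder:

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib version of the library


def parse_in_chunks(stream, chunk_size=16):
    """Hypothetical helper: feed a file-like object to the parser
    chunk_size bytes at a time and return an ElementTree."""
    parser = ET.XMLParser()  # stands in for XMLTreeBuilder
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        parser.feed(chunk)
    # close() returns the root Element; wrap it to get an ElementTree.
    return ET.ElementTree(parser.close())


# An in-memory stream stands in for a file or socket.
source = io.BytesIO(b"<document><item>one</item><item>two</item></document>")
tree = parse_in_chunks(source, chunk_size=7)
items = [item.text for item in tree.getroot().findall("item")]
print(items)
```

A small chunk size is used here only to force tags to straddle block boundaries; a real reader would typically use a much larger block.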