ElementTree Overview

  

But I have found that sitting under the ElementTree, one can feel the Zen of XML.
— Essien Ita Essien

Update 2007-09-12: ElementTree 1.3 alpha 3 is now available. For more information, see Introducing ElementTree 1.3.

Update 2007-08-27: ElementTree 1.2.7 preview is now available. This is 1.2.6 plus support for IronPython. The serializer is ~20% faster, and now supports newlines in attribute values.

The Element type is a simple but flexible container object, designed to store hierarchical data structures, such as simplified XML infosets, in memory. The element type can be described as a cross between a Python list and a Python dictionary.
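As a rough illustration of the list/dictionary comparison, the sketch below uses the xml.etree.ElementTree module shipped with Python 2.5 and later. The tag and attribute names (item, name, price, id, lang) are made up for the example and do not come from the library.

```python
import xml.etree.ElementTree as ET

# hypothetical element with one attribute and two children
elem = ET.Element("item", {"id": "42"})
ET.SubElement(elem, "name")
ET.SubElement(elem, "price")

# list-like behaviour: children are counted, indexed, and iterated
child_count = len(elem)
first_tag = elem[0].tag
tags = [child.tag for child in elem]

# dictionary-like behaviour: attributes are read and written by key
item_id = elem.get("id")
elem.set("lang", "en")
keys = sorted(elem.keys())
```

Children live in the sequence part of the element and attributes in the mapping part, so the two never get in each other's way.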

The ElementTree wrapper type adds code to load XML files as trees of Element objects, and save them back again.

The Element type is available as a pure-Python implementation for Python 1.5.2 and later. A C implementation is also available, for use with CPython 2.1 and later. The core components of both libraries are also shipped with Python 2.5 and later.

There’s also an independent implementation, lxml.etree, based on the well-known libxml2/libxslt libraries. This adds full support for XSLT, XPath, and more.

For more implementations and add-ons, see the Interesting Stuff section below.


Installation

Binary installers are available for many platforms, including Windows, Mac OS X, and most Linux distributions. Look for packages named “python-elementtree” or similar.

To install from source, simply unpack the distribution archive, change to the distribution directory, and run the setup.py script as follows:

$ python setup.py install

When you’ve done this, you should be able to import the ElementTree module, and other modules from the elementtree package:

$ python
>>> from elementtree import ElementTree

It’s common practice to import ElementTree under an alias, both to minimize typing, and to make it easier to switch between different implementations:

$ python
>>> import elementtree.ElementTree as ET # standalone pure-Python package
>>> import cElementTree as ET # standalone C implementation
>>> import lxml.etree as ET # lxml
>>> import xml.etree.ElementTree as ET # standard library, Python 2.5 and later
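Only one of these imports is needed at a time. One way to take advantage of the shared alias is a fallback chain that tries each implementation in turn. This is a minimal sketch; the order of preference shown here is an assumption, not a recommendation from this page.

```python
# try the implementations in an assumed order of preference;
# whichever import succeeds first is bound to the name ET
try:
    import lxml.etree as ET
except ImportError:
    try:
        import xml.etree.ElementTree as ET  # standard library, Python 2.5 and later
    except ImportError:
        try:
            import cElementTree as ET  # standalone C implementation
        except ImportError:
            import elementtree.ElementTree as ET  # standalone pure-Python package

# the rest of the program only ever refers to ET
probe = ET.Element("probe")
```

Code written against the common subset of the API then runs unchanged whichever module was found.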

Note that if you only need the core functionality, you can include the ElementTree.py file in your own project. To get path support, you also need ElementPath.py. All other modules are optional.

Basic Usage

Each Element instance can have an identifying tag, any number of attributes, any number of child element instances, and an associated object (usually a string). To create elements, you can use the Element or SubElement factories:

import elementtree.ElementTree as ET

# build a tree structure
root = ET.Element("html")

head = ET.SubElement(root, "head")

title = ET.SubElement(head, "title")
title.text = "Page Title"

body = ET.SubElement(root, "body")
body.set("bgcolor", "#ffffff")

body.text = "Hello, World!"

# wrap it in an ElementTree instance, and save as XML
tree = ET.ElementTree(root)
tree.write("page.xhtml")
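To inspect the result without writing a file, the tostring function can serialize an element directly. The sketch below rebuilds the same tree with xml.etree.ElementTree (the copy shipped with Python 2.5 and later); the variable name data is just for illustration.

```python
import xml.etree.ElementTree as ET

# same structure as the example above
root = ET.Element("html")
head = ET.SubElement(root, "head")
title = ET.SubElement(head, "title")
title.text = "Page Title"
body = ET.SubElement(root, "body")
body.set("bgcolor", "#ffffff")
body.text = "Hello, World!"

# serialize the element and its subtree in memory
# (the result is a byte string on Python 3)
data = ET.tostring(root)
```

This is handy for quick checks in an interactive session, since the whole document comes back as a single string.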

The ElementTree wrapper adds code to load XML files as trees of Element objects, and save them back again. You can use the parse function to quickly load an entire XML document into an ElementTree instance:

import elementtree.ElementTree as ET

tree = ET.parse("page.xhtml")

# the tree root is the toplevel html element
print(tree.findtext("head/title"))

# if you need the root element, use getroot
root = tree.getroot()

# ...manipulate tree...

tree.write("out.xml")
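The "manipulate tree" step might look something like the following sketch. To stay self-contained it parses from a string with fromstring instead of reading page.xhtml, and it uses xml.etree.ElementTree; the new colour value and the added p element are made up for the example.

```python
import xml.etree.ElementTree as ET

# stand-in for the contents of page.xhtml
source = (
    '<html><head><title>Page Title</title></head>'
    '<body bgcolor="#ffffff">Hello, World!</body></html>'
)
root = ET.fromstring(source)
tree = ET.ElementTree(root)

# look up text by path, relative to the root element
title = tree.findtext("head/title")

# find an element, change an attribute, and append a child
body = root.find("body")
body.set("bgcolor", "#eeeeee")
paragraph = ET.SubElement(body, "p")
paragraph.text = "Added later"
child_tags = [child.tag for child in body]
```

After changes like these, tree.write saves the modified tree in the same way as in the example above.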

For more details, see Elements and Element Trees.

Documentation

Zone articles:

Elements and Element Trees (brief tutorial)

The elementtree.ElementTree Module (reference page)

Element Tree Infosets

The ElementTree iterparse Function

Incremental Parsing Using the Consumer API

Element Library Functions

ElementTree: Bits and Pieces (useful helpers)

SimpleXMLWriter

The cElementTree Module

XPath Support in ElementTree

XInclude support in ElementTree 1.2

ElementTree Tidy HTML Tree Builder

Using the ElementTree Module to Generate SOAP Messages

Elsewhere:

Andrew Dalke: IterParseFilter: XPath-like filtering of ElementTree’s iterparse event stream

Andrew Dalke: PyProtocols for output generation

Martijn Faassen: lxml and (c)ElementTree

Andrew Kuchling: Processing XML with ElementTree [slides from a talk]

Danny Yoo: ElementTree mini-tutorial [“Let’s work through a small example with it; that may help to clear some confusion.“]

Joseph Reagle: XML ElementTree Data Model

Uche Ogbuji: Simple XML Processing With elementtree [xml.com]

David Mertz: Process XML in Python with ElementTree: How does the API stack up against similar libraries? [ibm developerworks]

Uche Ogbuji: Python Paradigms for XML

Uche Ogbuji: XML Namespaces Support in Python Tools, Part Three [xml.com]

Uche Ogbuji: Practical SAX Notes: ElementTree, Namespaces and Techniques for Large Documents [xml.com]

Interesting stuff built with (or for) ElementTree (selection):

L. C. Rees: webstring (a web templating engine that allows programs to manipulate XML and HTML documents with standard Python sequence and string operators; designed for those whose preferred web template languages are Python and HTML, and XML for people who swing that way)

Chris McDonough: meld3 (an XML templating system for Python 2.3+ which keeps template markup and dynamic rendering logic separate from one another, based on PyMeld)

Peter Hunt: pymeld4 (another ET-based implementation of the PyMeld templating language)

Seo Sanghyeon: pyexpat/ElementTree for IronPython (a pyexpat emulation for IronPython which lets you use the standard ElementTree module on that platform)

Oren Tirosh: ElementBuilder (friendly syntax for constructing ElementTree:s)

Staffan Malmgren: lagen.nu (a nicely formatted, hyperlinked, linkable, and taggable version of the entire body of Swedish law) (more information)

Ralf Schlatterbeck: OOoPy (a tool to inspect, create, and modify OpenOffice.org documents in Python)

Martijn Faassen: lxml (ElementTree-compatible bindings for libxml2 and libxslt)

Martin Pool, et al: Bazaar-NG (version management system)

Seth Vidal, Konstantin Ryabitsev, et al: Yellow dog Updater, Modified (an automatic updater and package installer/remover for rpm systems)

Michael Droettboom: pyScore (a set of Python-based tools for working with symbolic music notation)

Ryan Tomayko: Kid (a template language)

Ken Rimey: PDIS XPath (a more complete XPath implementation)

Roland Leuthe: minixsv (a lightweight XML schema validator written in pure Python)

Bruno da Silva de Oliveira, Joel de Guzman: Pyste (a Python binding generator for C++)

Works in progress:

ElementTree: Working with Qualified Names

Using the ElementTree Module to Generate Google Requests

A Simple Technorati Client

Using Element Trees to Parse WSDL Files

Using Element Trees to Parse XBEL Files

Using ElementTrees to Generate XML-RPC Messages

Generating Tkinter User Interfaces from XML

A Simple XML-Over-HTTP Class

You Can Never Have Too Many Stock Tickers!

Download Source Code

See below for additional instructions.

Download for Windows

If the installer cannot find your Python interpreter, see this page.

Comment:

On some Linux systems, notably Debian-based systems, you’ll need to have the python2.3-dev (or python2.4-dev) package installed in order to compile C extensions.

Posted by Berco (2006-11-17)
