Using the ElementTree Module to Generate SOAP Messages, Part 3: Dealing with Qualified Names
November 22, 2003 | Fredrik Lundh
Note: A distribution kit containing the source code for this article is available from the effbot.org downloads site (look for ElementSOAP 0.2 or later).
XML Namespaces #
XML namespaces are an extension to the core XML specification that lets you associate every element tag and attribute name with a URL. For example, all SOAP 1.1 elements belong to the http://schemas.xmlsoap.org/soap/envelope/ namespace.
To be able to associate a URL with a tag or name, the XML namespace model relies on something called qualified names, or QNames. A qualified name consists of a namespace prefix and a local part, where the prefix maps to a URL via special namespace declarations (xmlns attributes) in surrounding elements. The declarations are part of the document markup; on the application level, only the URL/local part pair really matters. The namespace prefix is just an encoding detail.
For more information on namespaces, see James Clark’s XML Namespaces article.
The ElementTree library uses a generic namespace-aware parser, which automatically maps each qualified name to a universal name in Clark’s “{url}local” notation, and removes prefixes and namespace declarations from the parsed tree. After all, the XML namespaces specification clearly states that it only applies to element tags (types) and attribute names, and that applications should use the namespace URL, not the prefix.
Unfortunately, the SOAP designers didn’t actually read the namespace specification, so they’re using qualified names all over the place. You can see all variants in the following Fault response example:
    <soap:Envelope xmlns:soap='...'>
      <soap:Body>
        <soap:Fault soap:encodingStyle='...'>
          <faultcode>soap:Server</faultcode>
          <faultstring>Argument must be 100 or less.</faultstring>
          <faultactor>/system</faultactor>
          <detail xmlns:xsi='...' xmlns:xsd='...' >
            <argument xsi:type='xsd:integer'>200</argument>
            <version xsi:type='xsd:string'>2.0 beta 1</version>
          </detail>
          ...
        </soap:Fault>
      </soap:Body>
    </soap:Envelope>
Here, soap:Envelope is an Envelope element associated with the namespace given by the xmlns:soap attribute. The ElementTree parser represents this tag as “{http://schemas.xmlsoap.org/soap/envelope/}Envelope”.
The Body and Fault elements are associated with the same namespace, and the Fault element also contains an attribute name from the same namespace: soap:encodingStyle. The parser maps these to “{url}local”-style strings as well.
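To make the mapping concrete, here is a minimal sketch that uses the xml.etree.ElementTree module from the current Python standard library (rather than the elementtree package this article was written for). The envelope URL is the one given above; the encodingStyle value is an assumption, filled in only so that the document parses.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Assumption: the encodingStyle value below is only a plausible placeholder.
document = """\
<soap:Envelope xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>
  <soap:Body>
    <soap:Fault soap:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'/>
  </soap:Body>
</soap:Envelope>
"""

envelope = ET.fromstring(document)
fault = envelope[0][0]  # Envelope -> Body -> Fault

# Tags and attribute names come back as universal names; the "soap"
# prefix and the xmlns:soap declaration are gone from the tree.
print(envelope.tag)
print(list(fault.attrib))
```

Both the element tag and the attribute name are printed in “{url}local” form, with no trace of the prefix that was used in the markup.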
But what about the faultcode element? The element tag doesn’t belong to a namespace, but that “soap:Server” text content sure looks like a qualified name. Let’s check what the SOAP specification has to say about fault codes:
The faultcode MUST be present in a SOAP Fault element and the faultcode value MUST be a qualified name as defined in [XML Namespaces], section 3.
Umm. So it is a qualified name. Who cares that section 1 of the namespaces specification says that XML namespaces only apply to element tags and attribute names; let’s just pick a section we like, and ignore the rest of the specification. Guess someone else has to sort out the mess.
And of course, SOAP uses qualified names not only in text sections, but also in attribute values, as can be seen in the argument element:
    <detail xmlns:xsi='...' xmlns:xsd='...' >
      <argument xsi:type='xsd:integer'>200</argument>
(Here, the xsd:integer attribute value refers to the namespace given by the xmlns:xsd attribute in the parent element.)
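The problem is easy to reproduce. The following sketch, again using the standard xml.etree.ElementTree module, parses a cut-down version of the fault; the xsi and xsd URLs are the standard XML Schema namespaces, filled in here as an assumption since the example above elides them.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Assumption: the xsi/xsd URLs stand in for the elided '...' values.
document = """\
<soap:Fault xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>
  <faultcode>soap:Server</faultcode>
  <detail xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
          xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
    <argument xsi:type='xsd:integer'>200</argument>
  </detail>
</soap:Fault>
"""

fault = ET.fromstring(document)
argument = fault.find("detail/argument")

# The attribute *name* was expanded, but text content and attribute
# *values* are passed through untouched, prefixes and all.
faultcode_text = fault.findtext("faultcode")
type_value = argument.get("{%s}type" % XSI)

print(faultcode_text)  # soap:Server
print(type_value)      # xsd:integer
print(fault.attrib)    # {}
```

Since the namespace declarations have been removed from the tree, nothing in the parsed result tells you what the soap and xsd prefixes stood for.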
Handling Qualified Names in SOAP #
Update 2005-12-06: The iterparse mechanism in recent versions of ElementTree provides a more efficient way to deal with namespaces in non-standard locations.
So, you cannot use a generic namespace-aware parser if you want to properly process SOAP messages. What can you do about this? Here are some possible solutions:
1. Ignore the whole mess. Make the SOAP layer ignore the prefix when it looks for type information and deals with faults, and hope that you’ll never have to deal with a SOAP service where this would be a problem.
2. Create a SOAP-specific parser, and make it map xsi:type and faultcode contents to qualified names during parsing.
3. Create a modified parser that makes namespace tables available to the application level. You can then use these tables to map prefixes to namespace URLs in the SOAP layer.
In practice, solution 1 will work better than you may expect, since most servers are using only standard error codes, and tend to use unique names for custom types. You can use the following function to “clean up” the QName strings before using them:
    def fixqname(qname):
        # drop the namespace prefix, if there is one, and keep the local part
        return qname.split(":", 1)[-1]
On the other hand, implementing solutions 2 and 3 is actually easier than it may sound, thanks to an experimental class in the ElementTree library. This class, FancyTreeBuilder, is similar to the built-in parser but calls hook methods whenever it enters or leaves an element during parsing:
    from elementtree import XMLTreeBuilder

    class MyParser(XMLTreeBuilder.FancyTreeBuilder):

        def start(self, element):
            ... prepare element before adding it to the tree...

        def end(self, element):
            ... process element after adding it to the tree...
These hooks may modify the element (and its subelements, in the end hook) in place, and they also have access to a list of active namespace declarations via the namespaces attribute.
The following parser uses the start hook to attach a copy of the current set of namespace declarations to each element:
    from elementtree import ElementTree, XMLTreeBuilder

    class NamespaceParser(XMLTreeBuilder.FancyTreeBuilder):

        def start(self, element):
            # attach a copy of the declarations that are in scope right now
            element.namespaces = self.namespaces[:]
To use the class when parsing, pass in an instance of the parser class to the parse function:
tree = ElementTree.parse(file, NamespaceParser())
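The elementtree package and its FancyTreeBuilder class predate the ElementTree module that now ships with Python. As a rough sketch of the same hook idea for the standard xml.etree.ElementTree module, the parser target below relies on the start_ns and end_ns callbacks (available in Python 3.8 and later, as far as I know). The class name NamespaceTrackingTarget and its namespaces dictionary are hypothetical; a dictionary is used because the standard Element type may not accept new attributes. The xsi and xsd URLs are assumptions, as the examples above elide them.

```python
import xml.etree.ElementTree as ET


class NamespaceTrackingTarget:
    """Hypothetical parser target that records the namespace
    declarations in scope for every element it builds."""

    def __init__(self):
        self._builder = ET.TreeBuilder()  # does the actual tree building
        self._active = []                 # declarations in scope, innermost first
        self.namespaces = {}              # element -> [(prefix, uri), ...]

    def start_ns(self, prefix, uri):
        # called before start() for the element carrying the declaration
        self._active.insert(0, (prefix, uri))

    def end_ns(self, prefix):
        # called after end() for the element that carried the declaration
        self._active.pop(0)

    def start(self, tag, attrib):
        element = self._builder.start(tag, attrib)
        self.namespaces[element] = list(self._active)
        return element

    def end(self, tag):
        return self._builder.end(tag)

    def data(self, text):
        self._builder.data(text)

    def close(self):
        return self._builder.close()


target = NamespaceTrackingTarget()
parser = ET.XMLParser(target=target)
parser.feed(
    "<detail xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
    " xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
    "<argument xsi:type='xsd:integer'>200</argument>"
    "</detail>"
)
detail = parser.close()
argument = detail[0]

# the declarations made on <detail> are still in scope for <argument>
scope = dict(target.namespaces[argument])
print(scope["xsd"])
```

The target delegates all tree construction to an ordinary TreeBuilder, so the resulting tree is the same one a plain parser would produce; only the bookkeeping is new.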
When you use this parser, each element will have a namespaces attribute, which is a list of (prefix, uri) tuples, one for each namespace that applies to the current element. The following function takes a qualified name and the element the name is used in, and returns an Element-style full name:
    def fixqname(element, qname):
        prefix, local = qname.split(":")
        # look up the prefix among the declarations in scope for this element
        for p, url in element.namespaces:
            if prefix == p:
                return "{%s}%s" % (url, local)
        raise SyntaxError("unknown namespace prefix (%s)" % prefix)
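The update note above mentions iterparse as another way to get at the declarations. A minimal sketch of that route, assuming the standard xml.etree.ElementTree module, might look as follows. The helper names parse_with_namespaces and expand_qname are hypothetical, the scopes are kept in a dictionary rather than on the elements, and the xsi and xsd URLs are again assumptions.

```python
import io
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def parse_with_namespaces(source):
    """Return (root, scopes), where scopes maps each element to the
    (prefix, uri) pairs in scope for it, innermost first."""
    active = []
    scopes = {}
    root = None
    events = ("start-ns", "end-ns", "start")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            active.insert(0, payload)  # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            active.pop(0)              # payload is None
        else:
            if root is None:
                root = payload         # first element seen is the root
            scopes[payload] = list(active)
    return root, scopes


def expand_qname(scopes, element, qname):
    """Map a prefixed name used inside element to {uri}local form."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError("not a prefixed name (%s)" % qname)
    for p, uri in scopes[element]:
        if p == prefix:
            return "{%s}%s" % (uri, local)
    raise ValueError("unknown namespace prefix (%s)" % prefix)


source = io.BytesIO(b"""\
<soap:Fault xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>
  <faultcode>soap:Server</faultcode>
  <detail xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
          xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
    <argument xsi:type='xsd:integer'>200</argument>
  </detail>
</soap:Fault>
""")

root, scopes = parse_with_namespaces(source)
faultcode = root.find("faultcode")
argument = root.find("detail/argument")

code = expand_qname(scopes, faultcode, faultcode.text)
kind = expand_qname(scopes, argument, argument.get("{%s}type" % XSI))
print(code)  # {http://schemas.xmlsoap.org/soap/envelope/}Server
print(kind)  # {http://www.w3.org/2001/XMLSchema}integer
```

Because the scope list is stored innermost first, a prefix that is redeclared on a nested element resolves to the nearest declaration, which is what the namespace rules require.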
Here’s a version of the SoapService class that uses this parser, and uses the fixqname helper to deal with the faultcode element.
    class SoapService:

        def __init__(self, url=None):
            self.__client = HTTPClient(url or self.url)

        def call(self, action, request):

            # build SOAP envelope
            envelope = Element(NS_SOAP_ENV + "Envelope")
            body = SubElement(envelope, NS_SOAP_ENV + "Body")
            body.append(request)

            # call the server
            try:
                parser = NamespaceParser()
                response = self.__client.do_request(
                    tostring(envelope),
                    extra_headers=[("SOAPAction", action)],
                    parser=parser
                    )
            except HTTPError, v:
                if v[0] == 500:
                    # might be a SOAP fault
                    response = ElementTree.parse(v[3], parser)
                else:
                    raise # not a SOAP fault; pass the HTTP error on

            response = response.find(body.tag)[0]

            if response.tag == NS_SOAP_ENV + "Fault":
                faultcode = response.find("faultcode")
                raise SoapFault(
                    fixqname(faultcode, faultcode.text),
                    response.findtext("faultstring"),
                    response.findtext("faultactor"),
                    response.find("detail")
                    )

            return response
With this code in place, the invalid argument example from the earlier article now prints the expanded faultcode:
    >>> g.doGoogleSearch("hello", maxResults=100)
    Traceback (most recent call last):
      File "myprogram.py", line 24, in doGoogleSearch
      File "ElementSOAP.py", line 78, in call
        response.find("detail")
    ElementSOAP.SoapFault: (u'{http://schemas.xmlsoap.org/soap/envelope/}Server',
    'Exception from service object: maxResults must be 10 or less.',
    '/search/beta2', <Element detail at 9dfcb4>)
The new parser also opens the door to more automation when writing method wrappers; more on that in a later article.