A Simple XML-Over-HTTP Class
Updated May 13, 2003 | July 12, 2002 | Fredrik Lundh
This module implements a simple helper class, HTTPClient, which can send an XML document (represented either as an element tree or a string) to a remote server, and parse the result into an element tree.
from httplib import HTTP
from StringIO import StringIO
import urlparse

# elementtree (from effbot.org/downloads)
from elementtree import ElementTree

class HTTPClient:

    user_agent = "HTTPClient (from effbot.org)"

    def __init__(self, uri):

        scheme, host, path, params, query, fragment = urlparse.urlparse(uri)
        if scheme != "http":
            raise ValueError("only supports HTTP requests")

        # put the path back together again
        if not path:
            path = "/"
        if params:
            path = path + ";" + params
        if query:
            path = path + "?" + query

        self.host = host
        self.path = path

    def do_request(self, body,
                   # optional keyword arguments follow
                   path=None, method="POST", content_type="text/xml",
                   extra_headers=(), parser=None):

        if not path:
            path = self.path

        if isinstance(body, ElementTree.ElementTree):
            # serialize element tree
            file = StringIO()
            body.write(file)
            body = file.getvalue()

        # send xml request
        h = HTTP(self.host)
        h.putrequest(method, path)
        h.putheader("User-Agent", self.user_agent)
        h.putheader("Host", self.host)
        if content_type:
            h.putheader("Content-Type", content_type)
        h.putheader("Content-Length", str(len(body)))
        for header, value in extra_headers:
            h.putheader(header, value)
        h.endheaders()

        h.send(body)

        # fetch the reply
        errcode, errmsg, headers = h.getreply()

        if errcode != 200:
            raise Exception(errcode, errmsg)

        return ElementTree.parse(h.getfile(), parser=parser)
The main workhorse is the do_request method, which uses the httplib library module for all protocol-related stuff. The HTTP class represents a connection to an HTTP server. The putrequest and putheader methods are used to generate the header part of an HTTP message, and send is used for the body. Finally, the getreply method is used to parse the response header, and getfile returns a file handle that can be passed right into the element tree parser.
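The same call sequence can be sketched with Python 3's http.client, where HTTPConnection takes the place of the old HTTP class and getresponse stands in for the getreply/getfile pair. This is an illustration under assumptions, not the module above: it uses the Python 3 standard library instead of httplib and the elementtree package, and the EchoHandler test server and the post_xml helper are hypothetical names that exist only so the example can run without an external network.

```python
import threading
import xml.etree.ElementTree as ET
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: reports the root tag of the posted document."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        root = ET.fromstring(self.rfile.read(length))
        reply = ET.tostring(ET.Element("reply", {"got": root.tag}))
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the example quiet


def post_xml(host, port, path, body):
    """Send body (bytes) with POST and parse the reply into an element tree."""
    conn = HTTPConnection(host, port)
    try:
        # header part of the message; putrequest adds the Host header itself
        conn.putrequest("POST", path)
        conn.putheader("Content-Type", "text/xml")
        conn.putheader("Content-Length", str(len(body)))
        conn.endheaders()
        # body part of the message
        conn.send(body)
        # parse the response header, then hand the file-like response
        # object straight to the element tree parser
        response = conn.getresponse()
        if response.status != 200:
            raise Exception(response.status, response.reason)
        return ET.parse(response)
    finally:
        conn.close()


# port 0 asks the operating system for any free local port
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    tree = post_xml("127.0.0.1", server.server_port, "/", b"<ping/>")
    result = tree.getroot().attrib
finally:
    server.shutdown()
    server.server_close()

print(result)
```

Unlike the class above, this sketch does not set User-Agent or accept extra headers; it only mirrors the putrequest, putheader, endheaders, send, and parse steps.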
You can use the path, method, content_type and extra_headers options to get better control over the request header:
- path: Overrides the path. If omitted, the path extracted from the URI given to the constructor is used.
- method: What HTTP method to use. The default is “POST”, but you can also use e.g. “PUT”, “GET”, and “HEAD”. Note that some methods don’t take a body; in that case, use an empty string for the body.
- content_type: What type to use for the body. The default is “text/xml”.
- extra_headers: A list of (header, value) pairs for extra headers needed by the server. For example, you can add SOAP’s SOAPAction headers to the mix, by passing in [("SOAPAction", action)].
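To see how these options end up in the request header, here is a small sketch that collects the headers in the order do_request emits them. The build_headers function is a hypothetical helper written for this illustration (the class itself calls putheader directly), and the host name and SOAPAction value are made-up examples.

```python
def build_headers(host, body, content_type="text/xml", extra_headers=(),
                  user_agent="HTTPClient (from effbot.org)"):
    """Return (header, value) pairs in the order do_request sends them."""
    headers = [("User-Agent", user_agent), ("Host", host)]
    if content_type:
        # Content-Type is skipped when content_type is empty or None
        headers.append(("Content-Type", content_type))
    headers.append(("Content-Length", str(len(body))))
    # extra headers always come last
    headers.extend(extra_headers)
    return headers


# a SOAP-style request: default method and content type, one extra header
body = "<Envelope/>"
headers = build_headers("example.com", body,
                        extra_headers=[("SOAPAction", '"urn:example#ping"')])
for name, value in headers:
    print("%s: %s" % (name, value))

# a request without a body: empty string, no content type
bodiless = build_headers("example.com", "", content_type=None)
```

Note that Content-Length is sent even for an empty body, since do_request always emits it.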
Sending XML-RPC requests
Let’s put this class to use. The following example sends a pre-defined XML-RPC request to the effbot.org echo service, and prints the result.
request = """\
<?xml version="1.0"?>
<methodCall>
  <methodName>echo</methodName>
  <params>
    <param><value>hello, world</value></param>
  </params>
</methodCall>
"""

from HTTPClient import HTTPClient

client = HTTPClient("http://effbot.org/rpc/echo.cgi")

response = client.do_request(request)

import sys
response.write(sys.stdout)
Here’s the expected output:
<?xml version='1.0'?>
<methodResponse>
  <params>
    <param>
      <value><string>hello, world</string></value>
    </param>
  </params>
</methodResponse>
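Since do_request also accepts element trees, the request could be built as a tree instead of a string, and the reply can be picked apart with ordinary element tree calls. The following is a minimal sketch under assumptions: it uses Python 3's xml.etree.ElementTree rather than the elementtree package, build_call and first_string are hypothetical helpers, and the response is a copy of the expected output rather than a live reply. The helper wraps each argument in an explicit string element; XML-RPC treats an untyped value as a string, so this is equivalent to the hand-written request.

```python
import xml.etree.ElementTree as ET


def build_call(method_name, *args):
    """Build an XML-RPC methodCall tree with string-only parameters."""
    call = ET.Element("methodCall")
    ET.SubElement(call, "methodName").text = method_name
    params = ET.SubElement(call, "params")
    for arg in args:
        value = ET.SubElement(ET.SubElement(params, "param"), "value")
        ET.SubElement(value, "string").text = arg
    return ET.ElementTree(call)


def first_string(response_root):
    """Return the text of the first string parameter in a methodResponse."""
    return response_root.findtext("params/param/value/string")


request_tree = build_call("echo", "hello, world")
request_xml = ET.tostring(request_tree.getroot(), encoding="unicode")
print(request_xml)

# canned copy of the expected reply, used in place of a network round trip
response_root = ET.fromstring("""\
<methodResponse>
  <params>
    <param>
      <value><string>hello, world</string></value>
    </param>
  </params>
</methodResponse>
""")
print(first_string(response_root))
```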
For more examples, see “Using Element Trees For XML-RPC” and “You Can Never Have Too Many Stock Tickers!”.
Notes:
The implementation currently ignores the charset parameter in the Content-Type header of the response. The HTTP protocol allows HTTP transports to convert documents between different encodings on the way (“transcoding”), usually based on the client’s Accept-Charset header. If you read data from such a source, the XML parser cannot figure out the encoding by looking at the document; it must use the charset specified by the server.
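One way to honor the server's charset is to read it from the Content-Type value and pass it to the parser as an encoding override. This is only a sketch of the idea, not something the class does: it assumes Python 3's standard library (email.message and xml.etree.ElementTree), and charset_from_content_type and parse_with_charset are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from email.message import Message


def charset_from_content_type(content_type):
    """Return the lowercased charset parameter, or None if there is none."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def parse_with_charset(data, content_type):
    """Parse XML bytes, letting the transport charset override the document."""
    charset = charset_from_content_type(content_type)
    # an explicit encoding overrides whatever the document itself declares
    parser = ET.XMLParser(encoding=charset) if charset else ET.XMLParser()
    parser.feed(data)
    return parser.close()


# a Latin-1 body with no encoding declaration, as a transcoding proxy
# might deliver it; without the override the parser would assume UTF-8
data = "<a>caf\u00e9</a>".encode("iso-8859-1")
root = parse_with_charset(data, "text/xml; charset=ISO-8859-1")
print(ascii(root.text))
```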
Also, a strict reading of the HTTP and XML Media Types specifications says that if you set the content-type to text/xml, without any charset parameter, the XML body cannot use 8-bit characters; the body is assumed to contain US-ASCII only. This is no problem if you pass in element trees; the default encoding uses character references for all non-ASCII characters anyway. But you probably should keep this in mind if you’re generating the body outside the do_request method.
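The ASCII-only behavior of the default serialization is easy to check. The sketch below assumes Python 3's xml.etree.ElementTree, whose write method also defaults to US-ASCII output; the element name and text are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

elem = ET.Element("greeting")
elem.text = "caf\u00e9"  # contains one non-ASCII character

# serialize with the default encoding, as do_request does for trees
buf = io.BytesIO()
ET.ElementTree(elem).write(buf)
body = buf.getvalue()

print(body)
```

The non-ASCII character comes out as a numeric character reference, so every byte of the body stays below 128. A body generated some other way can be checked with body.decode("ascii"), which raises UnicodeDecodeError if it contains 8-bit characters.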