The Consumer Interface
January 27, 2003 | Fredrik Lundh
The consumer interface is a simple “data sink” interface, used by standard Python modules such as xmllib and sgmllib.
Other examples include the GZIP consumer and PIL’s ImageParser class.
The consumer will typically convert incoming raw data in some way, and pass it on to another layer. For example, XML parsers implementing this protocol usually parse the data stream into a stream of XML tokens (that is, start tags, character data, end tags, etc).
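The layering described above can be sketched with two small classes. This is only an illustration, not the GZIP consumer mentioned earlier: the names DeflateConsumer and CollectingConsumer are made up for this example, and the only library calls used are zlib.compress and zlib.decompressobj from the standard library. Both classes use the feed and close methods described in the Interface section.

```python
import zlib


class CollectingConsumer:
    """Hypothetical final layer: stores whatever it is fed."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    def feed(self, data):
        self.chunks.append(data)

    def close(self):
        self.closed = True


class DeflateConsumer:
    """Hypothetical middle layer: decompresses zlib data and passes it on."""

    def __init__(self, target):
        self.target = target
        self.decoder = zlib.decompressobj()

    def feed(self, data):
        # Decompress what we can and hand any output to the next layer.
        output = self.decoder.decompress(data)
        if output:
            self.target.feed(output)

    def close(self):
        # Flush buffered output, then propagate the end of the stream.
        output = self.decoder.flush()
        if output:
            self.target.feed(output)
        self.target.close()


# Feed compressed data in two arbitrary pieces.
sink = CollectingConsumer()
c = DeflateConsumer(sink)
packed = zlib.compress(b"hello, consumer")
c.feed(packed[:5])
c.feed(packed[5:])
c.close()
```

The application only talks to the outermost consumer; each layer decides when it has enough data to pass something on.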
Interface
- feed(data)
  Process incoming data. The data argument should be a byte string. The application can call this method as many times as it wants (or not at all, if the source is empty). The data buffer may contain zero or more bytes of data.
- close()
  No more data available. The application should call this method when it has reached the end of the source stream.
- reset() (optional)
  Reset the consumer. Note that this method isn’t part of the core consumer protocol, and applications should be prepared to deal with consumers that don’t provide this method.
Examples:
try:
    reset = consumer.reset
except AttributeError:
    pass
else:
    reset()
or:
if hasattr(consumer, "reset"):
    consumer.reset()
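As a minimal sketch of the interface as a whole, here is a consumer that only counts the bytes it receives. The class name ByteCounter and its count and done attributes are invented for this example; the protocol itself only requires feed and close, with reset as an optional extra.

```python
class ByteCounter:
    """Hypothetical consumer that counts the bytes it is fed."""

    def __init__(self):
        self.reset()

    def feed(self, data):
        # May be called any number of times, with zero or more bytes.
        self.count += len(data)

    def close(self):
        # End of the source stream.
        self.done = True

    def reset(self):
        # Optional method: return to the initial state.
        self.count = 0
        self.done = False


c = ByteCounter()
c.feed(b"abc")
c.feed(b"")      # an empty buffer is allowed
c.feed(b"de")
c.close()
total = c.count
```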
Patterns
Read a file piece by piece:
c = consumer(...)
f = open(filename, "rb")
while 1:
    s = f.read(8192)
    if not s:
        break
    c.feed(s)
c.close()
f.close()
Read and parse a file in a single operation:
c = consumer(...)
f = open(filename, "rb")
c.feed(f.read())
f.close()
c.close()
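The two file patterns differ only in how much is read at a time, so they can be folded into one helper. The sketch below is one possible way to do that; feed_from and Collector are hypothetical names, and the helper assumes the stream was opened in binary mode so that read returns byte strings.

```python
import io


def feed_from(consumer, stream, blocksize=8192):
    """Hypothetical helper: feed a binary stream to a consumer, then close it."""
    while True:
        block = stream.read(blocksize)
        if not block:
            break
        consumer.feed(block)
    consumer.close()


class Collector:
    """Hypothetical consumer that records the size of each block."""

    def __init__(self):
        self.sizes = []
        self.closed = False

    def feed(self, data):
        self.sizes.append(len(data))

    def close(self):
        self.closed = True


# An in-memory stream stands in for a file opened with open(filename, "rb").
collector = Collector()
feed_from(collector, io.BytesIO(b"x" * 20000))
```

With a real file, opening it in a with statement around the call makes sure the file is closed even if the consumer raises an exception.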
Read and parse a file as it arrives over a network (this example uses the asyncore library):
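import asyncore

class protocol_client(asyncore.dispatcher):
    ...
    def handle_connect(self):
        self.consumer = consumer(...)
        ...
    def handle_read(self):
        # asyncore calls handle_read() without arguments;
        # fetch the available data from the socket with recv()
        self.consumer.feed(self.recv(8192))
    def handle_close(self):
        self.consumer.close()
        self.close()
    ...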
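The asyncore module was removed from the standard library in Python 3.12. On newer versions, the same pattern can be sketched with asyncio.Protocol, whose data_received and connection_lost callbacks map directly onto feed and close. The class names ConsumerProtocol and Recorder are made up for this example, and passing the consumer to the constructor is a design choice of this sketch rather than part of the consumer protocol.

```python
import asyncio


class ConsumerProtocol(asyncio.Protocol):
    """Hypothetical asyncio counterpart of protocol_client."""

    def __init__(self, consumer):
        self.consumer = consumer

    def data_received(self, data):
        # Called by the event loop with each block of bytes that arrives.
        self.consumer.feed(data)

    def connection_lost(self, exc):
        # Called once when the connection is closed or lost.
        self.consumer.close()


class Recorder:
    """Hypothetical consumer used to show the call sequence."""

    def __init__(self):
        self.data = b""
        self.closed = False

    def feed(self, data):
        self.data += data

    def close(self):
        self.closed = True
```

Such a protocol would typically be attached to a connection with loop.create_connection, passing a factory such as lambda: ConsumerProtocol(c) together with the host and port.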