Caching
August 2006 | Fredrik Lundh
Note: This article is a stub. A more detailed analysis of different simple cache implementations will be published later. See the notes below for a rough summary.
The following is a simple cache implementation, which is suitable for relatively small caches (up to a few hundred items), and where it’s relatively costly to create or reload objects after a cache miss (e.g. a few milliseconds or more per object).
class Cache: def __init__(self, maxsize=100): self.cache = {} self.order = [] # least recently used first self.maxsize = maxsize def get(self, key): item = self.cache[key] # KeyError if not present self.order.remove(key) self.order.append(key) return item def set(self, key, value): if self.cache.has_key(key): self.order.remove(key) elif len(self.cache) >= self.maxsize: # discard least recently used item del self.cache[self.order.pop(0)] self.cache[key] = value self.order.append(key) def size(self): return len(self.cache)
Usage:
cache = Cache(100) try: item = cache.get(key) except KeyError: item = item_factory() cache.set(key, item)
Note that this implementation uses an ordinary list to keep track of the access order, in order to be able to discard the least recently used items when the cache fills up. The list type isn’t really optimized for this purpose, but this approach is surprisingly efficient for smaller caches, especially when the cache accesses follow an 80/20 pattern.
Notes:
(100k accesses, 100 slots, 500 items, 10 ms object creation, 1k per object, integer keys) baseline: speed=1000s (0+1000) memory=0k memoization: speed=5.22s (0.22+5.00) memory=500k keep last: speed=989.86s (0.72+989.14) memory=1k clear all: speed=532.51s (0.55+531.96) memory=100k clear pop: speed=706.51s (0.76+705.75) memory=100k clear random: speed=400.61s (0.80+399.81) memory=100k clear lru: speed=317.05s (0.78+316.27) memory=100k alternative implementations: carlson: speed=320.22s (1.35+318.87) memory=100k http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252524 prodromou: speed=343.61s (17.88+325.73) memory=100k http://freshmeat.net/projects/lrucache/ the carlson implementation uses a manually maintained double- linked list, which is O(1), but is slower than a list for small caches. the prodromou implementation uses a sorted heap of cache items, which turns out to be a lot slower.