Caching and levers
Some of the people who question the way I design systems notice that I'm a big fan of caching. While caching is a widespread and heavily used performance optimization technique, I tend to use it—and rely on it—more frequently than others. But why is that?
What is "caching"?
It seems that people actually mean different things when they say "caching". In my understanding, caching means trading memory for computation: storing the exact results of a computation in memory and reusing them instead of performing the actual calculation again.
This resembles a concept from physics. When you study how levers work, you learn that to lift a heavier object you apply a smaller effort over a greater distance.
In the same way, you can trade memory for speed in computing, if you use the appropriate tools.
There are different classes of levers, all based on the same principle but working differently. Likewise, there are different types of caching.
Memoization
Memoization is a caching technique that saves the results of computations and returns them directly if the same computation is requested again.
Memoization can be used, for example, in the recursive calculation of Fibonacci numbers:
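Here's a minimal sketch of the idea in Ruby (an illustration, not necessarily the exact snippet from the original post):

```ruby
# Naive recursion recomputes the same fib(k) values exponentially many times;
# a hash used as a memo table turns it into a linear number of calls.
def fib(n, memo = {})
  return n if n < 2
  # Look the value up in the cache first; compute and store it only on a miss.
  memo[n] ||= fib(n - 1, memo) + fib(n - 2, memo)
end

puts fib(40)  # answers instantly instead of descending the whole call tree
```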
Prefetching
Prefetching is a caching technique which computes more than is requested and provides that extra data if it's requested in the future. That's how the cache in your CPU works: when you load a memory cell into a register, it is first loaded into a fast "cache" memory, along with a certain number of the nearest cells. You may boost performance if you write your programs to exploit locality—i.e. avoid requesting data from here and there, and keep your memory accesses concentrated instead.
***
Essentially, you can do only memoization, only prefetching, or a combination of both. But there doesn't seem to be agreement on what caching is. For example, the Wikipedia article about caching defines caching as prefetching. It refers to the architecture of processors (at least, to how it's implemented in x86), but the widespread understanding of caching goes beyond prefetching.
In this article I'll treat both prefetching and memoization as variations of caching.
Why I adore caching
I'm a big fan of caching because it supports one of the wisest ways of doing programming: the lazy approach.
Programmers work very hard to be lazy. Laziness is one of the basic virtues of a programmer. To be lazy partly means to design reusable systems that can be easily understood and modified. But how many times have you designed a complex system, had a hard time subdividing it into components, only to realize that the design performs awfully?
To make the system more performant, you would have to mingle several clearly defined layers, tangle otherwise isolated components, and turn your system's architecture into a mess.
And here's where caching comes to help you. Instead of redesigning the structure of the system, you can just cache the data that your isolated components would otherwise re-calculate due to their independence. You'll see examples of this later.
Don Knuth once produced a famous quote: "...say, about 97% of the time: premature optimization is the root of all evil". Ironically, the quote itself is a root of a lot of evil. But it actually supports the programming virtue I mentioned above. And we can be sure that caching can never be called a "premature optimization".
Where I used caching
Without concrete examples, of course, bare words don't sound convincing enough. So here's where I encountered nice opportunities to use caching.
The first example: I use it in my blog to cache the HTML version of pages. The posts are originally written in a special markup language, which is parsed by a recursive descent parser auto-generated from a grammar and implemented in Ruby (the tool is called Treetop). It takes about 0.2 seconds to render a single blog post, so without caching you could hardly read my blog.
Ironically, the parser also uses caching internally to memoize the nonterminal symbols it successfully parsed during failed attempts to parse other nonterminals.
Prefetching is especially useful when working with stores that process batch requests faster than individual ones. These include databases and file systems. In one of my first projects, I was given a task to store the abstract syntax trees of C++ header files in an SQL database (you can see it here). As ASTs are loosely connected, I couldn't reliably determine which nodes would be requested at the next step of a depth- or breadth-first traversal of the graph. But what I did know is that they were stored quite tightly. So I made the program request 1000 rows from the database instead of one. That saved a lot of calls to the database.
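The structure of that trick looks roughly like this sketch, where `db`, its `execute` method and the table layout are assumptions for illustration rather than the real project's code: the fetch-one-node function quietly pulls in a whole batch and serves subsequent requests from a local cache.

```ruby
# A sketch of row prefetching: fetching one node actually loads a batch
# of nearby rows into a local cache, since neighbouring nodes are likely
# to be requested soon.
class NodeStore
  BATCH_SIZE = 1000

  def initialize(db)
    @db = db        # assumed: a database handle responding to `execute`
    @cache = {}
  end

  def fetch(id)
    unless @cache.key?(id)
      rows = @db.execute("SELECT id, data FROM nodes WHERE id >= ? LIMIT ?",
                         id, BATCH_SIZE)
      rows.each { |row_id, data| @cache[row_id] = data }
    end
    @cache[id]
  end
end
```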
Ironically, the pages you would see if you click the link to the sample headers above are also cached. :-)
Of course, I wasn't the first to cache the results of compilation. There is a whole project, ccache, devoted specifically to this: it maintains a compilation cache and uses it when the same file is about to be compiled again.
While within a single project a cache could easily be replaced by a Makefile, it's useful when you're rebuilding several projects independently. For example, in Gentoo Linux different versions of a system application are built separately, by design of its build system. These builds don't know anything about each other, and may be distant in time. However, if you first compile version 1.5.14 of a library, and then upgrade it to 1.5.15, a considerable fraction of the very same object files will be compiled again. And again, caching allows you to improve performance while keeping a clean architecture.
During one of the latest Google Summer of Code projects, a colleague of mine implemented a system which closed Linux kernel modules under callgraph dependencies. For each of 4000+ Linux drivers, it produced a dozen files the driver depended on. Then another component had to compile each of these files, as it was a completely independent module with stable, well-developed behavior. It was obvious that instead of compiling 50,000-100,000 files, it should cache the results of their compilation.
This way we traded 10 GB of memory for several days of running time. Worth it, isn't it?
What about convenience?
Caching can actually be extremely convenient to use and, especially, to add to an already-developed application. Here's what the caching code of my blog looks like:
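The snippet itself is not preserved in this copy, so here's a sketch of the fetch-or-compute idiom it described; `CACHE` and `render_markup` are assumed names, not necessarily the ones the blog actually uses.

```ruby
# Fetch the rendered HTML from memcached by key; on a miss, run the block,
# store its result under that key, and return it. (A sketch of the idiom,
# with assumed names: CACHE is a memcached client, render_markup the parser.)
html = CACHE.fetch("post/#{post.id}/#{post.updated_at.to_i}") do
  render_markup(post.source)   # only runs when the page isn't cached yet
end
```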
It actually means "fetch from the cache by the key provided, or, if no data is stored for that key, run the block provided to recalculate the data, save it to the cache, and return what was just calculated instead". It's so much syntactic sugar that it may even cause diabetes... but it's the most convenient caching interface I've seen. Essentially, every time you perform memoization by hand, you have to re-implement the same steps (see the Fibonacci example above), and a good caching interface, like the Ruby binding for memcached featured here, encapsulates them for you. You may watch the Railscast about using memcached for caching this way.
Most likely, you won't need to rewrite any code of the components you want to introduce caching to. In the database prefetching example above, the code was inserted into the fetch-one-row function, which secretly prefetched and queried the local cache on each invocation.
Caching also gives you certain tuning capabilities. You can limit the cache size and discard the least recently used items when you're about to exceed it. This allows you to tune how much memory you'll trade for performance, and it works in quite a straightforward, measurable way. Just like levers.
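For instance, here's a minimal sketch of a size-limited LRU cache in Ruby (an illustration of the idea, not code from any of the projects above):

```ruby
# A size-limited cache that evicts the least recently used entries.
# Ruby's Hash preserves insertion order, which makes eviction simple.
class LruCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  # Return the cached value for `key`, computing it with the block on a miss.
  def fetch(key)
    if @store.key?(key)
      value = @store.delete(key)    # re-insert below to mark it "recent"
    else
      value = yield
      @store.shift while @store.size >= @max_size   # evict the oldest entries
    end
    @store[key] = value
  end
end

cache = LruCache.new(1000)
html = cache.fetch("post/42") { "<p>expensive rendering</p>" }
```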
Conclusion
Of course, caching can't help in every situation where you have performance problems. If you're unsure whether you'll be able to employ caching, do the modeling, as wise men suggest.
However, caching has helped me in many situations so far, and I seem to say "don't worry, we'll cache this" too often to avoid being called a caching fan.