Everyone loves benchmarks, and I’ve been investigating whether some in-progress work on view generation has made a difference (it hasn’t). In the course of those investigations I ran ab a few times, and some of you might think the results are fun to read.
CouchDB views are stored on disk using the same BTree mechanism that it uses to store documents, so in theory, view lookups (by key) should be just as fast as document lookups. @janl and @mattetti and I were curious about the reality here. What we discovered is that CouchDB’s view key lookups are almost, but not quite, as fast as direct document lookups.
When you stop to think about it, this makes sense: view requests have the added overhead of ensuring that the view is up to date before returning results. Even when the view needs no updating, there is still a cost associated with this check. There’s a chance that we can lower this cost with the current work to allow cached view reads (e.g. stale data, for those times when low latency is more important than current data).
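To make the two kinds of request concrete, here is a minimal sketch of the URLs involved. It assumes the `/db/_design/<ddoc>/_view/<view>` layout and the `stale=ok` query parameter that later CouchDB releases use; the trunk I benchmarked may lay these out differently. The host and port, and the database, design document, and view names in the usage note, are placeholders rather than the ones from my test.

```python
import json
from urllib.parse import quote, urlencode

# Assumption: CouchDB listening on its default local address.
BASE = "http://127.0.0.1:5984"


def doc_url(db, doc_id):
    """URL for a direct document lookup by ID."""
    return f"{BASE}/{quote(db, safe='')}/{quote(doc_id, safe='')}"


def view_key_url(db, ddoc, view, key, stale_ok=False):
    """URL for a view lookup by key.

    View keys are JSON values, so the key is JSON-encoded before it is
    URL-encoded (a string key ends up wrapped in double quotes).
    """
    params = {"key": json.dumps(key)}
    if stale_ok:
        # Ask for whatever index is already on disk instead of
        # waiting for the up-to-date check (assumed parameter name).
        params["stale"] = "ok"
    path = (
        f"{BASE}/{quote(db, safe='')}"
        f"/_design/{quote(ddoc, safe='')}"
        f"/_view/{quote(view, safe='')}"
    )
    return f"{path}?{urlencode(params)}"
```

For example, `doc_url("bench", "doc-1")` and `view_key_url("bench", "test", "by_key", "foo")` give the two URLs you would hand to ab to compare the lookups.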
Here’s what I found looking at a database of about 50 entries (not a big dataset at all, but the time differences show up even at this small size). If I weren’t about to hop on the plane back to Portland, I’d try this on a multi-gig database, but I don’t have one handy, and downloading through airport wifi…
These benchmarks were run on my MacBook, with Safari running a bunch of Flash ads in the background, and other non-ideal circumstances. I also reran them until I got a result set with a low standard deviation. There’s something going on in CouchDB where an occasional request will take a few hundred times longer than all the others. Maybe this is Erlang garbage collection, maybe it’s my MacBook indexing Spotlight metadata…
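To see why one of those slow requests is enough to spoil a run, here is a small sketch with made-up timings (these are not numbers from my ab output): 999 requests at 1 ms each plus a single request that takes 300 times as long.

```python
import statistics

# Hypothetical timings in milliseconds, for illustration only:
# 999 ordinary requests and one outlier that is 300x slower.
timings = [1.0] * 999 + [300.0]

mean = statistics.mean(timings)      # pulled up by the single outlier
median = statistics.median(timings)  # unaffected by the single outlier
stdev = statistics.pstdev(timings)   # population standard deviation

print(f"mean={mean:.3f} ms  median={median:.3f} ms  stdev={stdev:.2f} ms")
```

The median stays at 1 ms while the mean climbs to about 1.3 ms and the standard deviation ends up several times larger than a typical request, which is why a run containing one of these spikes looks so much noisier than it really is.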
The upshot:
View lookups: 1030.60 req/sec (with a concurrency of 10)
Doc lookups: 1192.73 req/sec (also -c 10)
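Throughput numbers are easier to compare as latencies. The sketch below only does arithmetic on the two figures above, using the relation ab itself uses for its mean "Time per request" line (concurrency divided by requests per second); the variable and function names are mine.

```python
# Figures from the two ab runs above.
view_rps = 1030.60
doc_rps = 1192.73
concurrency = 10


def mean_latency_ms(rps, concurrency):
    """Mean time one request spends in flight, in milliseconds."""
    return concurrency / rps * 1000.0


view_ms = mean_latency_ms(view_rps, concurrency)
doc_ms = mean_latency_ms(doc_rps, concurrency)

# Fraction of doc-lookup throughput lost when going through a view.
slowdown = 1 - view_rps / doc_rps

print(f"view: {view_ms:.2f} ms  doc: {doc_ms:.2f} ms  "
      f"extra: {view_ms - doc_ms:.2f} ms  slowdown: {slowdown:.1%}")
```

That works out to roughly 9.7 ms per view lookup against 8.4 ms per document lookup, so the view path costs a bit over a millisecond per request, or about 14% of throughput, on this setup.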
If you’ve got a fresh install of CouchDB’s trunk on relatively decent hardware, and you’re not doing as well as my old MacBook, there are some things you should try before you start shouting “OMG CouchDB is teh lamez.” Chief among them is making sure you’ve got the latest, greatest Erlang version.
Here is the embedded gist of the Apache Bench output for view key lookups.
And here is the full ab output for doc lookups.
In other news, I had a great time hanging out with Jan Lehnardt and other folks at QCon. San Francisco is a crazy town, and I always enjoy my visits. See ya next time, Bay Area!
1 comment on Some CouchDB Benchmarks
I have had problems with ab on Mac OS X (10.5 anyway) hanging on some random requests after you’ve done several thousand. This happens with both Erlang and Python servers, perhaps other stacks too. I think the kernel just fails at doing TCP sometimes.