Repost: BigMemory: Heap Envy

Terracotta has announced the availability of BigMemory, which provides a large offheap cache through their Ehcache project. It is designed to avoid the GC impact caused by massive heaps in Java, at a license cost of $500 per GB per year, if I have my figures right.

The Reason We're Here

First, let’s understand the reason BigMemory exists at all: the nature of the heap in the standard (“Sun”) JVM memory model.

In very basic summary, there are two generations of objects in the Sun JVM: a young generation and an old generation. Simply put, garbage collection occurs when a generation fills up.

A young generation’s garbage collection is normally very quick; the young generation space isn’t usually very large. However, the old generation’s garbage collection phase can be very long, depending on a lot of factors – the simplest of which is the size of the old generation. (The larger the old generation is, the longer garbage collection will take… more or less.)

The problem addressed by BigMemory is the existence of a large old generation (let’s say, larger than ten gigabytes). When you have a lot of read-only data (as in a side cache like Ehcache), and an old-generation GC occurs, the garbage collector potentially has to walk through a lot of data that isn’t eligible for collection; that takes time, and slows the application down. Some phases of that collection stop the application entirely until they finish.
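If you want to see how much of this your own application is paying, the JVM will tell you. What follows is a minimal sketch using the standard java.lang.management API; the class name GcReport is mine, and the times it reports are accumulated collection time per collector, which is only an approximation of pause time (a concurrent collector does part of its work without stopping the application).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports what each collector in this JVM has done so far.
class GcReport {
    /** Sums the accumulated collection time (ms) across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Collector names vary by JVM and by GC settings, so print whatever is there.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total: " + totalGcMillis() + " ms");
    }
}
```

Run it inside the application (or poll the same beans over JMX) before and after a load test; the old-generation collector's line is the one to watch.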

Let’s all agree: blocking the application is usually unpleasant.

Is BigMemory a Good Thing?

From the standpoint of garbage collection being bad, when used with large heaps: BigMemory looks like a huge win! No more GC associated with giant cache regions!

That’s not all there is to the situation. Let’s think about this.

You’re talking about optimization of a cache, a cache built around key/value pairs – a map with benefits, more or less. (Namely, expiration of data, and a few other things that we’ll discuss soon.)

Cache isn’t usually the main problem.

Cache is usually not the primary problem for an application; cache is a way of hiding how expensive data access has become, in most cases. It’s a symptom of your database being too slow, for whatever reason, and by whatever definition. (I’m ignoring memoization and computational results, which usually don’t end up being gigabytes’ worth of data anyway.)

Cache moves the problem around: in BigMemory’s case, the problem has gone from “slow data access” to “lots of data causes our app to slow down unacceptably.” There are still costs: serialization still takes time, key management is a problem, and merely accessing the data can be slow.

Key Management

Key management is a factor because you still have to know how to access your data. If you know the object’s primary key, of course, that’s an easy reference to use, but what if you don’t know the object’s primary key? (The object’s lost, that’s what.) A side cache is ideal for working with data that your application is well aware of, but not for data that your application has to derive.

Consider memcached; there are good books on memcached that actually suggest using a SQL query for a cache key! The key, in some cases, is larger than the data item generated by the query. Ehcache isn’t going to have a different solution.
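To make the point concrete, here is a small sketch of a query-keyed cache. It uses a plain ConcurrentHashMap rather than memcached or Ehcache, and the class name, table, and columns are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a cache keyed by the text of the query that produced the value.
class QueryKeyedCache {
    private final Map<String, Long> cache = new ConcurrentHashMap<>();

    /** Stores a derived result under the exact query text. */
    void put(String sql, long result) {
        cache.put(sql, result);
    }

    /** Returns the cached result, or null if this exact text was never cached. */
    Long get(String sql) {
        return cache.get(sql);
    }

    public static void main(String[] args) {
        QueryKeyedCache cache = new QueryKeyedCache();
        // Made-up query; the key is far longer than the 8-byte value it guards.
        String sql = "SELECT COUNT(*) FROM orders WHERE customer_id = 42 AND status = 'SHIPPED'";
        cache.put(sql, 17L);

        System.out.println(cache.get(sql));               // hit: same text
        System.out.println(cache.get(sql.toLowerCase())); // miss: same query, different text
        System.out.println(sql.length() + " characters of key for one long");
    }
}
```

The second lookup is the real problem: the cache only knows the key as a string, so anything that changes the text of the query, even trivially, loses the entry.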

Serialization and Offheap Access

Serialization factors in because of the way BigMemory works.

BigMemory allocates memory directly in the OS, through an NIO ByteBuffer, and manages references itself. That part is good, but there are two issues here: serialization and access time.

Serialization factors in whenever the OS’s memory is used: since the memory region is a set of bytes, Java has to serialize the cached objects into that memory region on demand. Serialization is not the fastest mechanism Java has – looking at the timings of remote calls in Java, serialization is more expensive than the network call itself. Your offheap writes and reads are going to be slow, period, just through serialization.
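Here is roughly what that round trip looks like. This is a sketch of the general technique (Java serialization into a direct NIO ByteBuffer), not BigMemory's actual implementation, which I haven't seen; OffHeapSlot and its methods are names I made up, and a real store would manage many entries rather than one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical single-entry offheap store, for illustration only.
class OffHeapSlot {
    private final ByteBuffer buffer; // direct buffer: allocated outside the Java heap
    private int length;              // number of valid bytes currently stored

    OffHeapSlot(int capacityBytes) {
        buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Serializes the value on the heap, then copies the bytes offheap. */
    void write(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] data = bytes.toByteArray();
        if (data.length > buffer.capacity()) {
            throw new IllegalArgumentException("value does not fit in slot");
        }
        buffer.clear(); // position = 0, limit = capacity
        buffer.put(data);
        length = data.length;
    }

    /** Copies the bytes back onto the heap and deserializes a fresh object. */
    Object read() {
        byte[] data = new byte[length];
        buffer.clear();
        buffer.get(data);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note what every read costs: a byte copy from the buffer, a full deserialization, and a brand-new object on the heap that the collector will eventually have to deal with.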

Plus, accessing the byte buffer itself is slow (because accessing offheap memory is slow). This is important, but less than it could be, because BigMemory identifies a “hot set” of data – they say 10% of the stored data – and keeps it “on-heap” for rapid access (which presumably avoids serialization, too).

This is a good thing, sort of – but it also has an implication for your heap size. I don’t know offhand how hard of a limit that hot set percentage is, but if it’s something that’s not adjustable, your heap size will always have to be able to handle the hot set’s size – let’s say 10% of the size of the offheap storage as a rough estimate.

This establishes a limit on the offheap size, too, because a JVM heap that’s too large (to satisfy the needs of the BigMemory offheap cache) would suffer the very same problems BigMemory is trying to help you avoid.
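The arithmetic is simple enough to write down. This sketch assumes the hot set really is a fixed 10% of the offheap size, and it borrows the ten gigabyte figure from earlier as the point where heaps get uncomfortable; both numbers are rough estimates, and the class and method names are mine.

```java
// Hypothetical sizing helper; whole gigabytes and whole percentages keep the math exact.
class HotSetSizing {
    /** Heap needed just to hold the on-heap hot set. */
    static long hotSetGb(long offHeapGb, int hotPercent) {
        return offHeapGb * hotPercent / 100;
    }

    /** Largest offheap store whose hot set still fits inside the given heap budget. */
    static long maxOffHeapGb(long heapBudgetGb, int hotPercent) {
        return heapBudgetGb * 100 / hotPercent;
    }

    public static void main(String[] args) {
        System.out.println(hotSetGb(100, 10));    // 10 GB of heap for a 100 GB store
        System.out.println(hotSetGb(350, 10));    // 35 GB of heap for a 350 GB store
        System.out.println(maxOffHeapGb(10, 10)); // 100 GB store for a 10 GB heap budget
    }
}
```

Under those assumptions, anything much past 100 GB offheap drags the heap back into the territory BigMemory was supposed to get you out of.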

What about the GC time itself?

BigMemory doesn’t actually get rid of GC pauses. It only removes the need to garbage-collect the offheap data (again, roughly 90% of the data lives offheap, as a “cool set” compared to the 10% “hot set.”) Even Terracotta’s documentation shows GC pauses, although they haven’t demonstrated the tuning associated with the pauses.

Actual Impact of BigMemory

If your application is sensitive to any pauses at all, BigMemory’s going to… help a little, but not that much, because pauses will still exist. They’ll have less impact, and your SLA factors in very heavily here, but they’ll still be there, caused by key management at the very least.

The only way to fix those pauses is to fix the application, really. JVM tuning can help, but realistically, an application that needs a giant cache like that has been built the wrong way; you’re far, far, FAR better off localizing smaller slices of your data into separate JVMs, which can communicate via IPC, than you are by pretending a giant heap will make your troubles go away.
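The routing side of that can be very small. Below is a minimal sketch of picking a slice by key hash; the class and method names are hypothetical, and it deliberately ignores the hard parts (the IPC itself, and what happens when the number of slices changes).

```java
// Hypothetical router: decides which JVM (slice) owns a given key.
class SlicePicker {
    /**
     * Maps a key to a slice index in [0, sliceCount).
     * Math.floorMod keeps the result non-negative even for negative hash codes.
     */
    static int sliceFor(Object key, int sliceCount) {
        return Math.floorMod(key.hashCode(), sliceCount);
    }

    public static void main(String[] args) {
        int slices = 8; // e.g. eight JVMs, each holding about an eighth of the data
        for (String key : new String[] {"customer:42", "customer:43", "order:9001"}) {
            System.out.println(key + " -> slice " + sliceFor(key, slices));
        }
    }
}
```

Every client computes the same slice for the same key, so each JVM only ever sees its own share of the data, and its heap (and its GC) stays correspondingly small.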

So should you buy BigMemory? – A comparison with other mechanisms.

Well, I’d say no, with all due respect to our friends at Terracotta. I actually took their tests and ran comparisons against the JVM’s poor ConcurrentHashMap, and found some really interesting numbers:

ConcurrentHashMap won, by a lot, even in a test calculated to abuse the cache, and we’re still factoring in garbage collection.

The problem statement looked something like this: can we run a large, simple key/value store, consisting of a large amount of data, and avoid garbage collection pauses of over a second? (Remember, this is the statement they’re using to justify the development of BigMemory in the first place.)

The answer is: yes, although the solution takes different shapes depending on what the sensitive aspects of the application are.

For example, on our test system, with a 90 GB heap, fifty threads, and a 50/50 read/write mix, we had an average latency of 184 usec, with some outliers. The outliers cause our hypothesis to fail, however. (This addresses the actual performance of the cache, though: 50 threads accessing a single map, which … isn’t very kind.)

Further, the most important factor in accessing our map isn’t the size of the map or the GC involved, but the number of threads in the JVM. If we use the same 90 GB heap and giant ConcurrentHashMap with twenty-five threads, our normal latency drops to around 80 usec, again with outliers in the Sun JVM.
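For the curious, the shape of that kind of test is easy to reproduce. This is not the harness we ran (that was Terracotta's test with BigMemory factored out); it is a scaled-down sketch with made-up names and parameters, and it reports only the mean, which hides exactly the outliers that matter for the hypothesis.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical micro-test: N threads hammering one shared map with a 50/50 mix.
class MapLatencySketch {
    /** Returns the mean time per operation, in nanoseconds. */
    static double meanLatencyNanos(int threads, int opsPerThread, int keySpace) {
        ConcurrentHashMap<Integer, byte[]> map = new ConcurrentHashMap<>();
        AtomicLong totalNanos = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom random = ThreadLocalRandom.current();
                long local = 0;
                for (int i = 0; i < opsPerThread; i++) {
                    int key = random.nextInt(keySpace);
                    long start = System.nanoTime();
                    if (random.nextBoolean()) {
                        map.put(key, new byte[64]); // write half the time
                    } else {
                        map.get(key);               // read the other half
                    }
                    local += System.nanoTime() - start;
                }
                totalNanos.addAndGet(local);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (double) totalNanos.get() / ((long) threads * opsPerThread);
    }

    public static void main(String[] args) {
        // Compare thread counts on the same machine; absolute numbers will vary widely.
        System.out.println("25 threads: " + meanLatencyNanos(25, 200_000, 1_000_000) + " ns/op");
        System.out.println("50 threads: " + meanLatencyNanos(50, 200_000, 1_000_000) + " ns/op");
    }
}
```

To say anything about pauses you would need to record each operation's latency (or at least the maximum), and run long enough on a large enough heap for the old generation to actually collect.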

And total throughput? I’m sorry, but BigMemory fails here, too. A BigMemory proponent mentions having 200,000 transactions per second on a 100 GB heap. We were looking at the millions with ConcurrentHashMap – even with the latency impact of thread synchronization. With a more normal cache usage scenario – closer to a 90% read/10% write situation – the numbers for the stock JVM collections climb even more.

Back to latency: if we do something drastic – like, oh, use a higher-performance JVM like JRockit, even without deterministic GC – then we get similar latency numbers, except the GC pauses shrink to something like the 400 msec range. Plus, our hypothesis passes muster, literally – giant heap, no GC pauses of over a second, no investment in anything (besides JRockit itself, of course, which Oracle suggests will be merged into OpenJDK).

Now, here’s the summary: We took Terracotta’s test, factored out BigMemory, and got the same results or better, without rearchitecting anything at all.

If you rearchitect to distribute the “application” properly, you could get even better access times, and even larger data sets, with less discernible GC impact.

That’s power – and, sadly, it doesn’t make BigMemory look like a big splash at all.

Author’s Note: Repost.