≡ Menu

Repost: Rocket Java: Timing Method Calls

My data stores benchmark gives out timings in terms of nanoseconds, by making a list of response times and then applying statistical methods to get data out of the list. However, one part of that comes up every now and then: how do you measure the time it takes to issue a call?

After all, this is essential for profiling and benchmarking, says the common wisdom.

Let’s get something out of the way, first: if you’re really trying to profile your code, you don’t want to use a rough timer. Counting milliseconds is not a good way to profile; use a profiler. Good ones that come to mind are jmp, JProbe, JFluid (part of NetBeans), DynaTrace, even VisualVM.

(If I had to choose one and only one from that list, I’d choose DynaTrace. This is not really an endorsement. It’s just the one I’ve found most useful for me.)

So: we’re working on the premise that we’re interested in very simple things: invocation times, let’s say, which is probably 90% of the intent when the subject comes up.

First, let’s create a test case. We’re not looking for elegance or validity – we’re just creating code that will run something, anything. First, I’ll create an interface, with two methods:

public interface Thingy {
    List<Integer> getResults();
    void build(int size);
}

Now I’ll implement it with a concrete class:

public class ThingyImpl implements Thingy {
    List<Integer> list = null;

    public List<Integer> getResults() {
        return list;
    }

    public void build(int size) {
        if (list == null) {
            list = new ArrayList<Integer>(size);
        } else {
            list.clear();
        }
        Random r = new Random();
        for (int i = 0; i < size; i++) {
            list.add(r.nextInt());
        }
        Collections.sort(list);
    }
}

This code isn’t, well, rocket surgery, obviously. It’s no great shakes whatsoever. It just builds a list and sorts it, and provides access to the sorted list. It’s a Widget.

So we could find the time it takes to run, say, 1000 of these like this:

Thingy thingy=new ThingyImpl();
long start=System.currentTimeMillis();
for(int i=0;i<1000;i++) {
    thingy.build(1000);
}
long end=System.currentTimeMillis();
long elapsedTime=end-start;

On my system, that takes 318 milliseconds, for an invocation time of 0.318 milliseconds per call. That’s not bad, all things considered. However, the measurement is very coarse. If we tracked the actual elapsed time for each build() call, instead of measuring start and finish times for the entire block, we’d get… a long stream of zeroes, followed by a 7 or an 8 (depending on your granularity, which varies from system to system and OS to OS), followed by a long stream of zeroes again…

So we can’t apply statistics to it, because it’s not accurate in any way. It’s too coarse. Our median value is going to come out a zero. That’s clearly not right; 1000 * 0 is 0, not 318. That’s too great a deviation to be useful.

So we can measure nanoseconds, now, as well, with System.nanoTime(). That’s more useful, but it’s more work.

I used Java’s dynamic proxies to actually capture the timing data in a list. The proxies are pretty simple, but they look worse. Here’s the proxy I used for testing the Thingy:

public class SpeedProxy implements InvocationHandler {
    Object object;
    ExecutorService service = new ScheduledThreadPoolExecutor(20);

    public SpeedProxy(Object object) {
        this.object = object;
    }

    public Object invoke(Object proxy, final Method method, Object[] args) throws Throwable {
        final long startTime = System.nanoTime();
        Object returnValue = method.invoke(object, args);
        final long endTime = System.nanoTime();

        // do SOMETHING, anything, with the time here
        service.submit(new Runnable() {
            public void run() {
                System.out.println("runtime for " +
                        method.getName() + ": " +
                        (endTime - startTime) + "ns "+
                        +(endTime-startTime)/1000000.0+"ms"
                );
            }
        });
        return returnValue;
    }
}

There are a lot of things to look at here. But first! Don’t use this. The way it logs information (through the executor) is insufficient. You will slaughter your performance with this. The actual one I used for the test had an object that captured data with O(1) access time (it used a preallocated array) so it didn’t schedule anything at all. The scheduled executor here is convenient, but will kill your performance.

Don’t say I didn’t warn you.

Anyway, so this is really simple: the invoke() method takes an object (the object the proxy operates on), the method reference itself, and then any arguments. Then it calls the Method reference with the object reference (the instance on which the method is operating) with the arguments.

It surrounds this with longs that represent starting nanotimes and ending nanotimes. Values that come out of this will be in the tens of thousands – but considering an eyeblink takes tens of millions of these, it’s still pretty bloody fast.

And it’s reasonably accurate. (If you need more, use an oscilloscope. Way beyond scope of this post.)

The last thing this does is, of course, to schedule data capture. I did it this way because logging synchronously like this is horribly expensive; I just pushed off the delay and let the method return.

So how do we use this? Well, with more ugly code, of course. Here’s how I got a Thingy reference with the proxy installed:

Thingy thingy2 = (Thingy) Proxy.newProxyInstance(thingy.getClass().getClassLoader(),
                new Class<?>[]{Thingy.class},
                new SpeedProxy(new ThingyImpl()));

All together now: “Ewwwwww.”

However… this means that any call through the Thingy interface gets intercepted and logged (or tracked, or whatever you want to do.)

That calls into question, though: how much time does the proxy itself introduce to the method call? That’s easy: do two tests, one without the proxy (and manually tracking the elapsed nanoseconds) and the other with the proxy in place. The difference will be the time taken by the proxy call.

On my system, it turns out that the direct build() invocation took 302637 nanoseconds (0.302 milliseconds). The proxied version took ever-so-slightly longer, at 324401 nanoseconds – the proxy introduced a 0.02 millisecond difference.

Some notes: yes, I ran a warmup before testing this, so the JIT had had a chance to fix up the runtimes. I also used the results of the build(), so that the JIT couldn’t just discard the result altogether. (Remember, we’re interested in the side-effect, but we don’t want the primary effect to go away.)

So. We’re remembering to use a profiler if we’re really trying to profile and not just track, but what else is there?

Lots of people are now rabidly shaking their heads, saying “Why didn’t he mention…” Well, this section is for you. Sort of. I’m not going to go into a lot of detail; there are too many options.

The Proxy mechanism here is very, very useful. However, it’s also very verbose; that Proxy.newInstance() call is a horror. There should be a better way.

And there is: it’s aspects.

Aspects are, well, hmm. It’s not fair to say that aspects are limited to making this kind of proxy code easier to work with, because aspect-oriented programming can actually do a lot more. However, one use for aspects is to put wrappers before, around, and after method calls, much as we did here (although all we did was “around.”)

It’s worth considering, especially since aspects are how Spring does a lot of its transactional magic. See “Aspect Oriented Programming with Spring” from the Spring 3 docs for more information on this.

Of course, you don’t have to use Spring to use aspects; check out the AspectJ project, cglib, javassist, asm, and… well, these are the libraries that came to mind with no thought. Google might have many others.

Author’s Note: Reposted, but not necessarily updated.

{ 0 comments… add one }

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.