Repost: Rocket Java: Timing Method Calls

My data stores benchmark gives out timings in terms of nanoseconds, by making a list of response times and then applying statistical methods to get data out of the list. However, one part of that comes up every now and then: how do you measure the time it takes to issue a call?

After all, this is essential for profiling and benchmarking, says the common wisdom.

Let’s get something out of the way, first: if you’re really trying to profile your code, you don’t want to use a rough timer. Counting milliseconds is not a good way to profile; use a profiler. Good ones that come to mind are jmp, JProbe, JFluid (part of NetBeans), DynaTrace, even VisualVM.

(If I had to choose one and only one from that list, I’d choose DynaTrace. This is not really an endorsement. It’s just the one I’ve found most useful for me.)

So: we’re working on the premise that we’re interested in very simple things: invocation times, let’s say, which is probably 90% of the intent when the subject comes up.

First, let’s create a test case. We’re not looking for elegance or validity – we’re just creating code that will run something, anything. First, I’ll create an interface, with two methods:

public interface Thingy {
    List<Integer> getResults();
    void build(int size);
}

Now I’ll implement it with a concrete class:

public class ThingyImpl implements Thingy {
    List<Integer> list = null;

    public List<Integer> getResults() {
        return list;
    }

    public void build(int size) {
        if (list == null) {
            list = new ArrayList<Integer>(size);
        } else {
            list.clear();
        }
        Random r = new Random();
        for (int i = 0; i < size; i++) {
            list.add(r.nextInt());
        }
        Collections.sort(list);
    }
}

This code isn’t, well, rocket surgery, obviously. It’s no great shakes whatsoever. It just builds a list and sorts it, and provides access to the sorted list. It’s a Widget.

So we could find the time it takes to run, say, 1000 of these like this:

Thingy thingy = new ThingyImpl();
long start = System.currentTimeMillis();
for (int i = 0; i < 1000; i++) {
    thingy.build(1000);
}
long end = System.currentTimeMillis();
long elapsedTime = end - start;

On my system, that takes 318 milliseconds, for an invocation time of 0.318 milliseconds per call. That’s not bad, all things considered. However, the measurement is very coarse. If we tracked the actual elapsed time for each build() call, instead of measuring start and finish times for the entire block, we’d get… a long stream of zeroes, followed by a 7 or an 8 (depending on your timer’s granularity, which varies from system to system and OS to OS), followed by a long stream of zeroes again…

So we can’t apply statistics to it; the data is simply too coarse to be accurate. Our median value is going to come out as zero, and that’s clearly not right: 1000 * 0 is 0, not 318. That’s too great a deviation to be useful.

So: we can measure nanoseconds instead, with System.nanoTime(). That’s more useful, but it’s more work.

I used Java’s dynamic proxies to capture the timing data in a list. The proxies are pretty simple, but they look worse than they are. Here’s the proxy I used for testing the Thingy:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class SpeedProxy implements InvocationHandler {
    Object object;
    ExecutorService service = new ScheduledThreadPoolExecutor(20);

    public SpeedProxy(Object object) {
        this.object = object;
    }

    public Object invoke(Object proxy, final Method method, Object[] args) throws Throwable {
        final long startTime = System.nanoTime();
        Object returnValue = method.invoke(object, args);
        final long endTime = System.nanoTime();

        // do SOMETHING, anything, with the time here
        service.submit(new Runnable() {
            public void run() {
                System.out.println("runtime for " +
                        method.getName() + ": " +
                        (endTime - startTime) + "ns " +
                        (endTime - startTime) / 1000000.0 + "ms"
                );
            }
        });
        return returnValue;
    }
}

There are a lot of things to look at here. But first! Don’t use this as-is. The way it logs information (through the scheduled executor) is convenient, but it will slaughter your performance. The version I actually used for the test captured data with O(1) access time (it wrote into a preallocated array), so it didn’t schedule anything at all.

Don’t say I didn’t warn you.
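For reference, here’s a minimal sketch of that kind of non-allocating recorder. The class and its names are mine, reconstructed from the description above, not the original code; note that as written it isn’t thread-safe.

public class TimingRecorder {
    // preallocated storage: recording a sample is a single array store,
    // with no allocation and no scheduling on the timed path
    private final long[] samples;
    private int index = 0;

    public TimingRecorder(int capacity) {
        samples = new long[capacity];
    }

    public void record(long elapsedNanos) {
        if (index < samples.length) {
            samples[index++] = elapsedNanos;
        }
    }

    public long[] samples() {
        return java.util.Arrays.copyOf(samples, index);
    }
}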

Anyway, the mechanism is really simple: invoke() receives the proxy instance, the Method being called, and any arguments. It then invokes that Method on the wrapped object (the instance the handler was constructed with), passing the arguments along.

It brackets that call with System.nanoTime() readings. Values that come out of this will be in the tens of thousands of nanoseconds – but considering an eyeblink takes hundreds of millions of them, that’s still pretty bloody fast.

And it’s reasonably accurate. (If you need more, use an oscilloscope. That’s way beyond the scope of this post.)

The last thing this does is, of course, to schedule data capture. I did it this way because logging synchronously like this is horribly expensive; I just pushed off the delay and let the method return.

So how do we use this? Well, with more ugly code, of course. Here’s how I got a Thingy reference with the proxy installed:

Thingy thingy2 = (Thingy) Proxy.newProxyInstance(
        thingy.getClass().getClassLoader(),
        new Class<?>[]{Thingy.class},
        new SpeedProxy(new ThingyImpl()));

All together now: “Ewwwwww.”

However… this means that any call through the Thingy interface gets intercepted and logged (or tracked, or whatever you want to do.)

That raises a question, though: how much time does the proxy itself add to the method call? That’s easy to check: run two tests, one without the proxy (manually tracking the elapsed nanoseconds) and the other with the proxy in place. The difference is the time taken by the proxy call.
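Concretely, the comparison can be as small as this sketch, reusing the thingy and thingy2 references from above (single calls shown for brevity; in practice you’d warm up and average over many iterations):

long start = System.nanoTime();
thingy.build(1000);                 // direct call
long directTime = System.nanoTime() - start;

start = System.nanoTime();
thingy2.build(1000);                // proxied call
long proxiedTime = System.nanoTime() - start;

System.out.println("direct: " + directTime + "ns, proxied: " + proxiedTime
        + "ns, overhead: " + (proxiedTime - directTime) + "ns");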

On my system, it turns out that the direct build() invocation took 302637 nanoseconds (0.302 milliseconds). The proxied version took ever-so-slightly longer, at 324401 nanoseconds – the proxy introduced a 0.02 millisecond difference.

Some notes: yes, I ran a warmup before testing this, so the JIT had had a chance to fix up the runtimes. I also used the results of the build(), so that the JIT couldn’t just discard the result altogether. (Remember, we’re interested in the side-effect, but we don’t want the primary effect to go away.)

So. We’re remembering to use a profiler if we’re really trying to profile and not just track, but what else is there?

Lots of people are now rabidly shaking their heads, saying “Why didn’t he mention…” Well, this section is for you. Sort of. I’m not going to go into a lot of detail; there are too many options.

The Proxy mechanism here is very, very useful. However, it’s also very verbose; that Proxy.newProxyInstance() call is a horror. There should be a better way.

And there is: it’s aspects.

Aspects are, well, hmm. It’s not fair to say that aspects are limited to making this kind of proxy code easier to work with, because aspect-oriented programming can actually do a lot more. However, one use for aspects is to put wrappers before, around, and after method calls, much as we did here (although all we did was “around.”)
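For a taste, here’s roughly what our timing wrapper could look like as an annotation-style AspectJ aspect. This is a sketch, assuming the AspectJ runtime and weaving (or Spring AOP with @AspectJ support) are already set up:

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class TimingAspect {
    // times every method on Thingy and its implementations
    @Around("execution(* Thingy+.*(..))")
    public Object time(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.nanoTime();
        try {
            return joinPoint.proceed();
        } finally {
            System.out.println("runtime for " + joinPoint.getSignature().getName()
                    + ": " + (System.nanoTime() - start) + "ns");
        }
    }
}

No Proxy.newProxyInstance() anywhere in sight.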

It’s worth considering, especially since aspects are how Spring does a lot of its transactional magic. See “Aspect Oriented Programming with Spring” from the Spring 3 docs for more information on this.

Of course, you don’t have to use Spring to use aspects; check out the AspectJ project, cglib, javassist, asm, and… well, these are the libraries that came to mind with no thought. Google might have many others.

Author’s Note: Reposted, but not necessarily updated.

Repost: Rocket Java: Spring Injection of Other Beans’ Properties

Woot, IRC FTW! Someone today asked a question about Spring that was actually relevant for once.

The problem was that he wanted to inject not a bean, but a bean’s referenced property into another bean.

Put more succinctly: he had a bean A, with a property B, and didn’t want to inject A into bean C, but wanted to inject the result of A.getB().

This is actually pretty easily done… as long as you’re on the Spring 3.0 revision. The key is the spring-expression module; it lets us, well, use expressions in our Spring configuration.

So let’s build a sample and show this puppy in action. First, we’ll create our two classes, as referenced above, using Lombok to create our mutators and accessors:

import lombok.Getter;
import lombok.Setter;

public class A {
    @Getter @Setter String b;
}

public class C {
    @Getter @Setter String d;
    public void run() { System.out.printf("%s%n", getD()); }
}

So our goal is to have run() output the value of A.b, even though C has no reference to A.

With spring-expression, the configuration file is really pretty simple. It looks like this:

<beans>
    <bean id="a" class="sandbox.A">
        <property name="b" value="hello, world"/>
    </bean>
    <bean id="c" class="sandbox.C">
        <property name="d" value="#{a.b}"/>
    </bean>
</beans>

The expression is the key. “#{a.b}” means to follow the object hierarchy – including bean references, which is where the “a” comes in – and use property b from reference a.
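To see it run, a minimal bootstrap might look like this (assuming the configuration above is saved as beans.xml on the classpath, with the usual namespace declarations on the beans element):

import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Main {
    public static void main(String[] args) {
        ClassPathXmlApplicationContext context =
                new ClassPathXmlApplicationContext("beans.xml");
        C c = (C) context.getBean("c");
        c.run(); // prints "hello, world"
    }
}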

It’s as simple as pi. (Because, after all, pi is just 3.14159… wait. pi is an irrational number. If only I could make pie!)

Incidentally, as a bit of an aside: I used Lombok up there to add mutators and accessors via annotation, which is really convenient; however, note that IDEA’s editor isn’t able to infer the generated methods, so it shows me lots of red. I really dislike warnings (and hints of warnings) in my code, and errors… horrors! (I’m error-phobic.)

But it runs, folks. I promise.

Author’s Note: Yet another repost.

Repost: Rocket Java: Initializing arrays of arrays

From IRC, from whence many “interesting” questions come:

“If I have a 4 dimensional integer array in Java, is there a way to initialise every element in it to a particular value?”

Well. First off, you don’t have a four-dimensional array. You have a one-dimensional array of one dimensional arrays of one dimensional arrays of one dimensional arrays.

Each of these arrays has its own length; they’re not uniform. Each of them has its own reference in the heap; they’re not contiguous.

Therefore, we’ve eliminated C-like memset() calls, in one fell swoop. Swoop, us! Swoop! (Swooping is fun.)

The key here is to think about how arrays like that are even initialized: chances are, with lots of loops.

Let’s say you want a 4x4x4x4 array of integers. Here’s code to initialize one of the endpoints to the default values for an array of ints, zero:

int[][][][] myArray = new int[4][][][];
myArray[0] = new int[4][][];
myArray[0][0] = new int[4][];
myArray[0][0][0] = new int[4]; // we will be playing with this line some.

Now, you can always use a different value if you drop the explicit dimension on the last line and supply an array initializer instead. Setting all four values to one would look like this:

myArray[0][0][0] = new int[] {1, 1, 1, 1};

That’s not very flexible, though.

The Java API has Arrays.fill(), which works for every primitive type (and Object), but the first argument isn’t an array of arrays – it’s a single-dimensional array. There are forms of the method that take ranges, which is nice, but the basic version simply copies one value to each element of the array. (By the way, Arrays.fill(Object[], Object) seems faintly useless to me – if only that were a closure!)

Therefore, we could replace that last line, again:

myArray[0][0][0] = new int[4];
java.util.Arrays.fill(myArray[0][0][0], 14);

That’s a little better, but not much.

The bottom line here: Java doesn’t have true multidimensional arrays, so there’s no optimization available that leverages contiguous memory. You can always slice a single array yourself (by creating an int[256], say, and computing the indexes manually – which would give you a uniform, contiguous layout, something Java’s arrays of arrays don’t offer), but otherwise… no. No multidimensional arrays.

Since there are no multidimensional arrays (and no contiguous arrays), to fill an array of arrays with a single value, you get to loop. Somewhere. Sorry, folks.
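For the record, here’s what that loop might look like for our 4x4x4x4 example, using the shorthand allocation form (new int[4][4][4][4] does create every sub-array up front):

int[][][][] myArray = new int[4][4][4][4]; // allocates all the sub-arrays
for (int[][][] cube : myArray) {
    for (int[][] plane : cube) {
        for (int[] row : plane) {
            // the innermost arrays are one-dimensional, so fill() applies
            java.util.Arrays.fill(row, 14);
        }
    }
}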

Author’s Note: Another repost.

Repost: Java’s more relevant than you think.

It’s really funny, honestly, but kids, Java isn’t cool. It hasn’t been cool for a long time.

Slashdot even posted “The Struggle to Keep Java Relevant,” which … wow. Not only is Java not cool, but it’s not even relevant any more. Now, that article is … odd, because “relevant” apparently means “used by people with piercings,” which is an odd metric, but still!

Here’s the thing, though: Java may not be cool, but the cool kids haven’t managed to replace it in anything other than their own hearts and minds yet.

There have been worthy attempts.

Some examples: Ruby, Python. Haskell. Erlang. Scala. Groovy. Clojure.

These languages are “dynamic” or “functional.” (Except Erlang, which is concisely and excellently described: “Erlang is a programming language designed at the Ericsson Computer Science Laboratory.” Well, then.)

Another interesting thing about them is that of all seven, five of them are able to target the JVM, and some of them the CLR as well.

It’s almost like they… well… honestly… they’re keeping the parts of Java that run well and mixing it with impenetrable syntax.

That’s a… loaded statement if I’ve ever seen (or written) one, but let’s think about it.

Look, I know the JVM can be improved (invokedynamic, for example, which is so necessary I can’t believe it’s not there yet), but by and large, it’s fantastic. It’s good enough that, compiling at need, it’s able to keep up with languages like C++ and C – languages that sort of set the bar for performance. It enables us idiot programmers to write simple, almost brainless syntax, and get it right almost every time. Neither it nor we are perfect, of course, but still… it works out most of the time to the point where it’s hard to tell.

That’s not cool. There’s not a whole lot of room for an ivory tower attitude when everyone’s on the first floor… but Java can do an incredible array of things well.

So what we have, now, is a host of dynamic or functional languages, most of which … target the JVM. The largest changes are in language concepts or syntax, both of which are relevant, but less so than you’d think.

A dynamic language is one which – if you’ll pardon the really poor definition – is able to change at runtime. That’s a development thing, of course, but one of the things the dynamic language people like is the lack of a formal compilation step. Write and run, baby! Find a problem? Fix it in place. Instant deployment.

The agile folks luuuuurve them some dynamic languages, because it enables an incredibly short development cycle.

Functional programming languages treat everything as a result. They tend to focus on immutability when they can. They’re what I used to think C was. Functional languages are really good at being known quantities; I want a functional language handling my airplane flight controls. (This is only partly written in jest.)

Academia luuuuurves it some functional languages, and with a few exceptions, academia is where functional languages live and die. (Some exceptions: Haskell, Erlang, Mathematica.)

I find it odd that functional languages don’t live on the JVM all that often, although many JVM-targeted languages include functional language hallmarks. (Scala is usually considered a sort of functional language, too.)

Anyway… while I certainly have no intent of running down these languages (especially considering my secret love I do the harboring for in my heart in my chest for Scala) I can’t help but think they all fail the elevator pitch test.

See, let’s look at Perl for a second. I can explain Perl to my eight-year-old. “Do this, unless that.” It works. There are a lot of bits in there that confuse everyone not named “Larry” or “Randal” but by and large, Larry Wall designed the language to work more or less like we think, and handle expressions that way.

Perl tends to be write-only largely because we think in impenetrable fashions, and the necessities for making a computing language out of how we think introduces some funky stuff.

I can likewise explain C in a minute or two. (Maybe not to my eight-year-old, although my oldest son started learning C at nine.) It’s a long elevator ride, but still. An elevator ride’s time.

Java’s a little harder, but the syntax is so close to C’s – well, leaving out generics – that Java’s language is fairly easy to pitch about in an elevator.

Now it gets harder. Ruby’s not hard, nor Python, nor Scala, nor… you get the picture, but if I had to explain them in an elevator, well, hmm. The idioms just get weird for people who aren’t into the idioms already, or who aren’t interested in climbing up stairs in that there ivory tower.

Again, it’s not impossible. The thing about these ivory towers is that you can climb them. It’s not a Goldberg contraption; it’s also not a simple ramp.

Java, on the other hand, is a ramp. Simple, predictable syntax. Again, I’m ignoring generics, la la la la – generics are easy, I won’t use Java without them any more, nor have I been willing to use Java without them for a while now, but explaining them to a newbie can be daunting, partly because they’re not real.

You can write some funky write-only keyboard vomit in Java, too, but it’s harder. The language is just too simple.

So what’s the point here?

Is it that the newfangled languages are awful, that people should use COBOL^WJava? Nope.

Is it that Java’s still the cool kid on the block? Not really – although I think it’s cooler and more relevant than it gets credit for.

Is it that people should stop trying to compare languages? Strike three! (You’re not out; this ain’t baseball.) Language comparisons are important, because they give us new ways to look at solving problems, and how would we know for what language X is better than language Y in short order unless it’s explained in a comparison?

I guess it’s just frustrating to have a language – any language, whether it’s COBOL, FORTRAN, Java, C – run down in favor of the latest entry, when the market has spoken in favor of the old guard.

I’m not all about the job market – like I mentioned already, Scala is my friend, I will hug it and kiss it and call it George – but I do have a problem when a language’s adherents accuse those who don’t use their language of choice as being uncool.

It makes programming political.

I really don’t like that.

With syntax being relevant for a mass-market language, and a runtime that’s tuned well enough that a lot of these languages use it as an operating environment – Java’s not only relevant, it’s important.

My point, then, comes down to a curmudgeonly … annoyance, because like it or not, Java is relevant, very much so, and no amount of Slashdotting or handwringing about how cool DHH is is going to change that. Deal with it.

(ETA: Look, folks, I know Ruby, Python, Perl, Scala fairly well, IMHO – just like I know Java “fairly well” – and I’m not suggesting one avoid any specific language. Plus, if you’re going to get upset about it, “impenetrable syntax” wasn’t meant to be taken literally; not only do I know many of the languages I mentioned well, but clearly I and others understand the syntax. Grains of salt, folks. Grains of salt.)

Author’s Note: A repost, predating Java 8 by a long time.

Repost: Rocket Java: What is Inversion of Control?

From IRC: “What’s Inversion of Control?”

First off: Oh, my.

Here’s a quick summary: traditionally, Java objects acquired whatever resources they needed, when they needed them. So if your DAO needed a JDBC connection, it would create one, which meant the more DAOs you created, for whatever reason, the more JDBC connections you used. It also meant that the DAO was responsible for its own lifecycle; mess it up, and you had problems.

This isn’t really a bad thing, honestly; lifecycle isn’t impossible to figure out. However, it means that the mechanism by which you acquired resources became really important – and if the mechanism wasn’t readily available, things got a lot more difficult to test.

Imagine a J2EE DAO, for example, where it used JNDI to get the JDBC DataSource. All good, until it’s time to do unit testing, and you don’t want to create a JNDI container just for testing – that’s slow and involves a lot of work that isn’t actually crucial to testing your DAO.

It’d be simpler for the DAO to not get its own resources, but accept the resources it needs. That means it no longer cares about JNDI (or how to get a JDBC connection) but it only says “I need to have a JDBC DataSource in order to work. If you want me to work properly, give me a DataSource.”

That’s inversion of control: instead of the control being in the DAO, the control is in what uses the DAO.
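In code, the difference is small but telling. Here’s a sketch with hypothetical names – no particular framework’s API, just plain Java:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

interface PersonDao {
    String findName(int id);
}

class JdbcPersonDao implements PersonDao {
    // handed in from outside; the DAO never looks anything up
    private final DataSource dataSource;

    JdbcPersonDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findName(int id) {
        try {
            Connection c = dataSource.getConnection();
            try {
                PreparedStatement ps =
                        c.prepareStatement("select name from person where id = ?");
                ps.setInt(1, id);
                ResultSet rs = ps.executeQuery();
                return rs.next() ? rs.getString(1) : null;
            } finally {
                c.close();
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}

Whoever constructs the JdbcPersonDao – a container, a factory, a test – decides where the DataSource comes from.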

The implications are in many areas.

In production code, it means you want to have an easily repeated mechanism to create resources and provide them. A DAO is a resource; a DataSource is a resource; you want something to build both of them and manage giving the DAO the DataSource without you having to be involved much.

In testing, it means you have fine-grained control over what happens. You usually want to limit the scope of testing, honestly; testing a service that uses a DAO that uses a DataSource (that uses a database, of course) means: starting the database, establishing the connection to the database (the DataSource) and then creating the DAO and providing the Service with that DAO.

That’s lots of work. Too much work, really, and it means a lot of moving parts you don’t want.

With inversion of control, you create a DAO that has very limited functionality, just enough to fulfill the Service’s test. It might always return the same object, for example, no matter what data is requested. That means you’re not testing the DAO any more, nor are you establishing a database connection. This makes the test much lighter, and gives you a lot more control over what happens.

Need to test exception handling? Provide the service with a DAO that always throws an exception at a given point.

Need to test an exception that occurs later in the process? Provide a DAO that throws an exception at that later point (perhaps the fourth time you request data – whatever fulfills the need).
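Such a stub is almost embarrassingly small. Continuing the hypothetical PersonDao from above:

// no JNDI, no database, no connection anywhere in sight:
// canned data, with a simulated failure on the fourth request
class FourthCallFailsDao implements PersonDao {
    private int calls = 0;

    public String findName(int id) {
        if (++calls == 4) {
            throw new IllegalStateException("simulated failure on call " + calls);
        }
        return "canned name";
    }
}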

Without Inversion of Control, this is much harder.

Implementations of IoC are pretty well known: Spring, Guice, even Java EE, now. Use them. You’ll be happier – and the Springbots won’t look at you as if you had a third eye any more, either.

Author’s Note: Reposted. This post predated CDI by quite a bit, unfortunately, so it’s badly dated; preserved for posterity.

Repost: Self-censorship.

As an artist – an artiste, thank you – I get all worried about creating stuff that people might find offensive. When I was nine or ten, for example, I wanted to write a poem that used the word “damn.” So I asked my mom – and she said I shouldn’t, that I should use “hot dog!” or something.

If I recall correctly, “hot dog” wouldn’t have worked – “hot dog it to heck!” just doesn’t sound right.

I also don’t think I ever actually finished that poem/thing/whatever it was.

Anyway, that still affects my art today. I’m pretty careful about what I write and publish (and create, for that matter.)

Basically, my rule is: I publish absolutely nothing about which I would be ashamed to tell my grandmother. That use of “damn” up there – oh phooey, now I’ve used it twice – is the first example of my choosing to actually publish something that uses, um, “harsh language.”

(Thanks, Mom.)

I don’t mind that form of self-censorship; it saves me from a lot of embarrassing circumstances. I don’t accidentally publish stuff that I’m not proud to tell my wife and children about because of it, you know? (At least, not that I know of… and I try to adhere to this, really.)

There’s another form of self-censorship, though, that bothers me.

I try to deal with emotions and thoughts and stuff – and sometimes those can be misinterpreted. As a songwriter, I tend to write about things that bother me, so a lot of my vocal music is angry – screaming at religious dolts, or growling at politicians (the “supplicants of power”), or occasionally pointing out self-destructive behavior (“strange rituals” or “unless you try to set me free…”).

I’m conflicted about this. On one hand, I still worry that someone might interpret some of my more vague writings as being aimed at them, but I can deal with that. A lot of my songs are aimed at people, they’re things I wanted to say and had to say.

But some of them are… a little too pointed. I have a song, for example, that addresses a really painful situation, and it’s a direct response to something… and while the lyrics are a little stilted (no singer, I) the song is good. I mean, really good. Stirringly so. (I captured the emotion, I thought, and how it applies to me.)

But I can’t make it public. The situation is something I still want to rescue some day, and it’s delicate enough that if my thoughts were exposed like this, it… wouldn’t be delicate any more. It’d be destroyed. It’s tenuous enough as it is, and I don’t want to make it worse.

This kind of self-editing bothers me a lot. I don’t like to hide; I don’t mind elephants in the room, but there’s no way I’m not pointing them out. It’s not fair to me to have it any other way, and it removes my ability to deal honestly with people.

Politesse is my enemy, too.

But this is a problem I don’t know how to solve; I could say “onward!” and put out a song that would prevent any hope for healing, and see myself as having that extra bit of artistic integrity, or I could hold back and keep hope alive.

Got fuel to burn, got roads to drive…

Author’s note: repost.

Repost: It’s all about the boundaries, baby

The key to distributed – and enterprise – computing is boundary management. Even worse, it’s not conceptual boundary management – it’s real boundary management, which is a lot harder to figure out.

A boundary is a barrier between things or places, right? Well, consider a web browser and a web server; the TCP/IP layer is a boundary.

The application has a boundary between it and its datastore.

You can play games with definitions here all day, and you’d be right: an object/relational mapper (an ORM) is a way to get around the boundary between your object-oriented language and your tuple-oriented database, for example.

The goal of a boundary, here, is to be as invisible as it can be. ORMs fail because they don’t work invisibly; they affect your data model fairly severely, they don’t leverage the strength of the underlying data store very well (although I suppose they’ll stay out of your way as you do an end-run around them); they affect your code pretty severely as well.

JMS fails, sort of, because it provides an excellent boundary between the originator of a request and the consumer of that request – but the implementations include that pesky network all the time. You can eliminate the network, but then you’re not really distributing.

SOAP… well, it’s just a six pails of fail. REST is better because it’s so much lighter, but it never hides the network layer from anything.

A distributed application should be able to cross network boundaries without making that boundary management stick out.

You should be able to submit a task or data item or event to a system and have that system cross boundaries if necessary, or localize processing if it can. It shouldn’t consume network bandwidth just because you said “this can run somewhere else.” It should use the network only if the local node is inappropriate for handling a given request.

Data should work this way, too, and here’s where I think most distributed computing platforms fail hard; they provide fantastic capabilities to share data (Terracotta DSO, Coherence, and Hadoop come to mind) or the ability to distribute processing (JMS and GridGain come to mind) but these have limits.

When you can share data but not processing, you have localization and sharding concerns (some of which are supposed to be manageable by the platform. DSO is supposed to be able to do this, for example, although I couldn’t find any references quickly that discussed how.)

When you can share out a process but not the data, you end up creating a choke point through accessing your data, or you have to carry all your data along with you. (Hello, JMS and friends!)

The key here is to keep your eye on the boundaries between your architectural components – and if those boundaries slow you down, well, that’s where your potential for optimization lies.

It’s impossible to tell you generically whether the optimization is sharding a database for local access to critical data, or task partitioning such that map/reduce is optimized for the mapping phase, or any of a number of other potential optimizations. The distribution platform you’ve chosen will massively impact what your options are here.

Don’t forget that key, though: if you manage the boundaries between components well, your application is going to win.

ObEmployerReference: I work for GigaSpaces, and IMO GigaSpaces XAP manages the boundaries impressively well, automating most of it out of your hands. If you’d like to see how that happens, feel free to point out a problem space and maybe we can show you how XAP can help with it.

Author’s Note: Repost, clearly having been written while I worked for GigaSpaces. Some concepts are out of date, because some of the ideas caught on; this is preserved for posterity.

Repost: Rocket Java: if(s==null) {s=t} vs. s=(s==null?t:s);

From online again, names concealed to protect the guilty:

I’m trying to figure out the shortest way to do this:

if (anotherString != null) s = anotherString;

Now, if you’ll pardon the stream of consciousness, through discussion it turns out he’s trying to work out the shortest form:

s = ((anotherString != null) ? anotherString : s);

Or, paraphrasing in perl:

a = b if b

Interesting question, sort of. The “shortest form” is up to the coder, of course; why not count keystrokes if that’s what you actually care about? That said, it got me wondering: which is most efficient?

Well, my first thought is, as usual: I don’t know. I’d have to test. So that’s what I did: write two methods, one which does an if() and assigns if the expression evaluates to true, and the other which uses the ternary syntax.

I ran the expressions in a loop, 100,000 times, and ran the loops twice each (just in case warmup affected one or the other). Before we look at the times, though, here’re the expressions tested:

// here we generate the things to test.
// randomValue() returns null roughly half the time.
String s = randomValue();
String t = randomValue();

// evaluation, then conditional assignment...
if (s == null) {
   s = t;
}

// ternary assignment
s = (s == null ? t : s);
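For reference, a harness along those lines might look like this. It’s a reconstruction, not the original code; randomValue() is my stand-in for the helper described above:

import java.util.Random;

public class ConditionalAssignmentTest {
    static final Random RANDOM = new Random();

    // returns null roughly half the time
    static String randomValue() {
        return RANDOM.nextBoolean() ? null : "value";
    }

    public static void main(String[] args) {
        for (int pass = 0; pass < 2; pass++) { // two passes, in case warmup matters
            long ifNanos = 0, ternaryNanos = 0;
            for (int i = 0; i < 100000; i++) {
                String s = randomValue();
                String t = randomValue();

                long start = System.nanoTime();
                if (s == null) {
                    s = t;
                }
                ifNanos += System.nanoTime() - start;

                s = randomValue();
                t = randomValue();
                start = System.nanoTime();
                s = (s == null ? t : s);
                ternaryNanos += System.nanoTime() - start;
            }
            System.out.println("pass " + pass + ": if() " + ifNanos
                    + "ns, ternary " + ternaryNanos + "ns");
        }
    }
}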

My first thought was that the former would be faster, because it does the assignment only if s is null. That’s the only substantive difference, really; s and t both call randomValue() before the evaluation/assignment.

The bytecode sort of looks the same way. Here’s the if() form, in bytecode:

   36:  aload   5
   38:  ifnonnull       45
   41:  aload   6
   43:  astore  5
   45:  aload_3

And here’s the ternary expression:

   36:  aload   5
   38:  ifnonnull       46
   41:  aload   6
   43:  goto    48
   46:  aload   5

Note the second “aload 5” there!

Here’s the thing: this bytecode is just what’s generated from javac. It’s not what runs; HotSpot converts all this, and … HotSpot optimizes. The bytecode doing more loads doesn’t mean that the executed code runs more loads, nor does it mean that a load and a store are equivalent in runtime. So looking at the bytecode is interesting, but not really very relevant – not to me, at least.

(Josh Bloch would be able to tell you a lot more about what goes on with a given bytecode block than I could. But he’d also probably not bother responding to such a simple query in the first place. Heck, even I am responding because I’m curious, and I’m willing to expose my faulty thought processes for everyone to see.)

So running the code yielded some interesting results, sort of. After a warmup, the if() form consistently ran roughly 70ms longer, over the 100,000 iterations, than the ternary expression.

I’m not surprised; my initial thought was that the opposite would happen, but hey, this is why I thought I’d test before making a declaration in the first place.

Testing always trumps assumption.

Comments?

Author’s Note: Repost.

Repost: In search of the perfect expression

One of my … flaws, I guess, is that I consider myself an artist. I typically prefer the aural media to visual, but I’m not afraid to try visual media.

I guess a few definitions are in order.

Aural media is something you hear, obviously, but there are some forms that can be considered aural even though they are visually perceived. Words, for example, are aural, even when you read them, at least for me; when I read something, I hear it in my head, typically in my own voice (or, occasionally, Grover’s. But I don’t talk about that.)

Visual media, on the other hand, is anything that is enjoyed by the eye. A painting is visual (duh). A photograph, visual. A video, visual. (Go figure.)

There is also tactile media, stuff you feel – and I suppose there’s olfactory as well, although I have no idea how you would exploit those media artistically.

One of the hallmarks of Jewish culture is that it’s aurally-oriented, even though many individual Jews are not.  Hebrew thought is based around concrete concepts, which doesn’t always translate well to English, and as a result tends to be… picturesque, let’s say.

Western thought – Hellenistic thought – was based around theory, mind, abstract concepts – thus Aristotle thought women had fewer teeth than men by inference rather than actually, like, forming a hypothesis and testing it.

I find I tend to try to bridge the two modes. I’m an abstract thinker who translates everything into concrete expression, if you will; I prefer the abstract internally (thus, my songs tend to influence expressions, rather than express things themselves) but translate into concrete concepts to actually express them (and thus, the abstract concepts in my songs end up being translated into things like a concept of wind, or water, or growing things, because I’m a bit of a flake.)

I actually have a really hard time writing, even though I write as a facet of my everyday employment. (I can hear you now: “if it’s so hard, you dweeb, stop and save us the pain of reading it.” I am ignoring you.) The biggest challenge for me is maintaining a single, straightforward progression of thoughts.

I think in a sort of lattice-work; I can’t say “the sky is blue” without that thought being accompanied by “why are clouds white? How does a bird fly? How does a plane fly? I wonder if I could devise a better shape for submarines. How does Google Go compare to P2? Why have I never learned any Algol?”

If that’s a confusing chain of thought, I’m not surprised. Actually, writing it down makes the progression rather obvious; it demonstrates a marvelous lack of ability to stay on topic but hey! Imagine that happening all at the same time, and maybe you’ll see how my internal dialogue sounds.

Therefore, even art is hard for me, because I can’t ever decide who I am in terms of how I express myself. My written poetry tends to follow a sort of spondee rhythm (much like Allen Ginsberg’s “Howl,” which repeats “who…” to begin many lines.) The structure I follow helps me stay somewhat on topic; I use sort of a Hebrew expressive form of “A contrasted with B,” using a common prefix.

And that’s just the written word. Visual media, where I struggle the most, are much harder for me, since I not only have no familiar structure to work from, I don’t even think in imagery in the first place. I have little concrete art.

I have a drawing of my hand, which I’ll scan someday, and there’s the meatball thing I put online a bit ago, but … you’re really looking at the limits of my concrete, visual art that I find acceptable. The meatball – “Comet,” I called it – was actually more of an experiment in … well… smudging pastels.

It’s also an exercise in futility for me to actually focus on making a point, which is part of what makes songs easier for me than essays or paintings; in a song, the listener participates enormously, and I can build my own impressions without worrying about whether the listener “gets it” or not.

All I have to worry about in music is whether it sounds good; the rest is gravy.

With written word and visual media, though, it has to make sense. The reader/observer has to come away thinking “Oh, I see,” somehow – at least, that’s the mindset I have that upsets me so when using these mediums.

I’ve written an awful lot in an attempt to help bridge that divide, to get my own writing to have focus, and to help others find that same focus.

I doubt that’ll change.

What’s really funny: did you know that this blog entry started out as an exploration of the pull quote, to see if it could be used to make my written thought more linear by breaking out parentheticals into their own stream of text? And this is what you get out of me trying to find a way to do hundreds of footnotes in hypertext. 🙂

Author’s Note: Another repost.

Repost: Rocket Java: What type should I prefer, int or byte?

From online:

Is there any benefit of using byte instead of int? I have a case where the range of possible values is between 0..100, so an int would be way too much for this, and a byte is more than sufficient.

But then I think about the 32 bit architecture of today’s computers, and that 32 bit is the smallest addressable unit, so does it make any difference if you use byte or int?

Well, it does make a difference, but what those differences are depends on how you’re using this data item.

A byte in Java is defined in the Java Language Specification as having a range between -128 and 127, inclusive. An int is defined as having a range between -2147483648 and 2147483647, again inclusive.

Therefore, the byte type is defined as being eight significant bits, and an int is defined as having 32 significant bits.

That does not mean that a byte takes one byte of heap, and an int takes four. The JVM is fully allowed to use four bytes (or eight) for both of them, but it will treat them as their base type when they’re used (allowing promotion where required by the specification.)

Here’s the quote from the JLS on integral type promotion rules:

If an integer operator other than a shift operator has at least one operand of type long, then the operation is carried out using 64-bit precision, and the result of the numerical operator is of type long. If the other operand is not long, it is first widened (§5.1.5) to type long by numeric promotion (§5.6). Otherwise, the operation is carried out using 32-bit precision, and the result of the numerical operator is of type int. If either operand is not an int, it is first widened to type int by numeric promotion.

So if you use a byte in your code, it gets treated like an int, even though its range is constrained and assignment back to a byte requires a cast.
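You can see the promotion rule in action with a snippet like this:

byte a = 10;
byte b = 20;
int sum = a + b;          // fine: both byte operands are promoted to int
// byte d = a + b;        // won't compile: possible lossy conversion from int to byte
byte c = (byte) (a + b);  // the explicit cast makes it legal again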

From a code perspective, yeesh, avoid bytes.

However, if you’re doing input/output, especially over the network, or if you’re communicating with an external program that needs bytes, well… a byte is obviously smaller than an int (or a long) so if you’re counting TCP/IP packets, and a byte is sufficient, it’s nice to have around. (If you’re communicating with an external package that needs a byte, well, obviously, you’ll need to use a byte.)

To actually work on the machine’s level with individual bytes, you’re better off thinking like Python does and using JNA or JNI, in my experience. It’s not that Java can’t do it, it’s that it’s such a pain in the rear that it’s rarely actually worth it to use in anger.

Author’s Note: Reposted as “rocket java” instead of the prior “java surgery,” because I always found the name unwieldy.