Rocket Java: A project to test Java 8’s More Strict Verifier

Java recently added a more strict verifier to the class loading mechanism. This isn’t really a bad thing, necessarily, because it conforms to what Java was always supposed to do – except a lot of projects now rely on the verifier doing what it’s always done.

For all intents and purposes, the Java 8u11 (and 7u65) release broke things, especially things that did classloader magic in specific ways.

Apparently the verifier’s been changed since 8u11 to allow the (now-prevented) behavior again – but the fix hasn’t made it into the public builds of Java 8, as of 8u20 and 7u67.

With some help from ZeroTurnaround, I put together a project on GitHub, called brokenverifier, that exercises the class verifier such that it fails with behavior that older versions of Java (meaning prior releases) allowed.

Check it out; it should be easy enough to clone and run regularly, until Java’s verifier is changed again.

Rocket Java: Using different test resources in a single build

What happens when you have tests that need different resources with similar names?

Confusion, that’s what. Let’s map out a project so we can see what happens:

  • Parent Project
    • Test Resources 1
    • Test Resources 2
    • Library Code (depends on both test resources projects)

Now, in our problem code, the resources for the test data use the same names, so the library code won’t be able to deterministically work out which set of data is being used.

How does one work around this?

The first, and probably best, solution is to use a different build tool, one that gives more finely-grained control over the build lifecycle: Gradle comes to mind.

The other solution involves a little more work, and probably would be classified as a Maven hack.

What you would do is fairly simple, but verbose; you’d add two more projects, as integrations for the library and the dependencies, like so:

  • Parent project
    • Test Resources 1
    • Test Resources 2
    • Library Code
    • Integration Tests 1 (using Library Code and Test Resources 1)
    • Integration Tests 2 (using Library Code and Test Resources 2)
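The parent POM for that second structure might declare its modules like this (a sketch only – the module names are my own invention, not from any real project):

```xml
<!-- hypothetical parent pom.xml fragment: the artifactIds are illustrative -->
<modules>
    <module>test-resources-1</module>
    <module>test-resources-2</module>
    <module>library-code</module>
    <module>integration-tests-1</module>
    <module>integration-tests-2</module>
</modules>
```

Each integration-test module then declares the library plus exactly one of the test-resources artifacts as dependencies, which is what keeps the same-named resources from colliding on a single test classpath.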

The problem with this kind of structure is that the Library Code project no longer has its own complete test structure; it can pass all of its own internal tests, without actually passing the integration tests, because those are in separate modules. Therefore, one might be tempted to write off failures in the integration test modules, and publish a flawed artifact to a central repository.

Again, the best approach is probably Gradle.

Rocket Java: Shadowing private variables in subclasses

This morning, a user on ##java asked if private variables could be shadowed in subclasses.

Upon being asked to try it and see, the user said they didn’t have the time – ironic, given how little time the test takes – so, what the heck, I wrote a test case to demonstrate that no, shadowing private variables (and retaining their reference) is not workable: the superclass will use its own reference, not a subclass’.

So let’s see some code:

public class T {
    private boolean b=true;

    public void doSomething() {
        System.out.println(b);
    }

    public static void main(String[] args) {
        T t=new U();
        t.doSomething();
    }
}

class U extends T {
    private boolean b=false;
}

Upon execution, this outputs true.

Now, would this change if we used an accessor instead? Let’s add one to T:

public boolean getB() { return this.b;}

… and nope, nothing changes. The main point here is actually that shadowing fields like this (shadowing the actual field, not methods) is a Bad Idea, and doesn’t work, private or not. (If you change the visibility away from private, the result still doesn’t change.)

What’s more, you generally don’t want to do it, and wouldn’t want to even if it were possible, because … what would be your point? You’d be effectively trying to change the semantic of the field, if you needed to do this – and this alters the superclass/subclass relationship in fundamentally unhealthy ways.

If you need a different value in the field you want to shadow, don’t shadow – just offer a protected mutator, and have the subclass set the value.
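Here’s a sketch of that alternative (the class names are mine, not from the original post): the subclass sets the inherited field through a protected mutator instead of declaring a shadow of its own.

```java
class Parent {
    private boolean b = true;

    protected void setB(boolean b) { this.b = b; } // the subclass's hook
    public boolean getB() { return b; }
}

class Child extends Parent {
    Child() {
        setB(false); // the single field in Parent now holds the subclass's value
    }
}

public class MutatorDemo {
    public static void main(String[] args) {
        Parent p = new Child();
        System.out.println(p.getB()); // prints false
    }
}
```

There is only one field now, so there is no ambiguity about which reference the superclass uses.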

Repost: Rocket Java: What is Map/Reduce?

Map/Reduce is a strategy by which one uses a divide-and-conquer approach to handling data. The division is normally provided along natural lines, and it can provide some really impressive performance gains.

The way it does this is by limiting the selection of data for which processing applies. Here’s a good way to think of it, with an anecdote originally provided by Owen Taylor:

Imagine if you had two types of popcorn at a party; one type is savory, and the other is sweet. You could have both types in a single bowl, and partygoers who wanted sweet popcorn could dig through the bowl looking for what they wanted; those who preferred savory popcorn could do the same.

Apart from sanitary concerns, this is slow.

What you could alternatively do, however, is provide two bowls: one with savory popcorn, the other with sweet popcorn. Then people would just line up for the bowl that had the popcorn they liked. Speedy, and potentially far more sanitary. Just line up in front of the neckbeard with the cold who keeps sneezing into his hand.

To explain it in computing terms, let’s determine a requirement with some artificial constraints.

Our project is to count marbles; we have four colors (red, green, blue, white) and our data storage doesn’t provide easy collation… or we’ve terabytes’ worth of marbles, which might be a more logical actual requirement.

The lack of collation is important, for our artificial requirements; ordinarily one would create an index in a database for each marble, based on color, and issue a simple “select color, count(*) from marbles group by color” and be done. With a giant dataset the select could be very expensive; it might walk the entire dataset based on color.

Let’s be real: the best approach for this actual requirement (assuming a relational database) is to use a database trigger to update an actual count of the marbles as the marbles are added or removed.

So what’s another approach? Well, what you could do is provide a sharding datastore.

Imagine a datastore in which a write directed data to one of many nodes, based on the data itself. If we had four nodes, we’d shard based on the marble color; node one would get all red marbles, node two would get all green marbles, node three would get all blue marbles, and node four would get all white marbles.

If we have direct access to our data – embedded in the sharding mechanism, perhaps – we can then count the marbles very, very, very quickly. We don’t even have to consider what color a given marble is, only what the node is. Our structure would look like this:

  • For each node, count the marbles the node contains, and return that count, along with the color of the first marble encountered.
  • Collect each count and build a map based on the marble color and the count.
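Those two steps can be sketched in plain Java, with each “node” modeled as an in-memory list (the names and structure here are mine, not from any particular grid product):

```java
import java.util.*;
import java.util.stream.*;

public class MarbleCount {
    // Map phase: each node reports (color of first marble, size of node) --
    // no per-marble color checks, because a node holds only one color.
    // Reduce phase: collect the per-node pairs into a single map.
    static Map<String, Long> mapReduce(List<List<String>> nodes) {
        return nodes.parallelStream() // one "task" per node, like one CPU per shard
                .collect(Collectors.toMap(
                        node -> node.get(0),        // color of the first marble
                        node -> (long) node.size(), // count without inspecting colors
                        Long::sum));
    }

    public static void main(String[] args) {
        List<List<String>> nodes = List.of(
                Collections.nCopies(3, "red"),
                Collections.nCopies(5, "green"),
                Collections.nCopies(2, "blue"));
        System.out.println(mapReduce(nodes));
    }
}
```

The parallel stream stands in for the shards’ own CPUs; in a real deployment the map step runs on the node that owns the data, and only the small per-node results cross the network.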

If the shards have their own CPUs, then you can see how the runtime would end up taking only as long as it took to iterate over the largest dataset plus a touch of network communication (which, normally, won’t factor in by comparison to the first operation.)

The first step – where you send a request to each of the shards to collect data – is called the Map phase.

The second step – where you collate the data returned by the Map – is called the Reduction phase.

Thus: “Map/Reduce.” You map a request to many nodes, then reduce the results into a cohesive returned value.

Of course, I’ve simplified my requirements such that I only iterate over the marbles, gathering a simple count. The real world is rarely so convenient; you’d be more likely to have marbles of multiple colors in each node.

In this case, you’re still iterating over each marble, doing a collation – definitely slower than a simple count, but your runtime is still going to only be as long as it takes to iterate over the largest node’s data.

This is really important, by the way. Your data distribution is critical. You’re not going to run in one fourth the time if you have four nodes; you’re going to run as long as it takes to process the node that takes the longest. If your nodes take 200ms, 220ms, 225ms, and then 1372ms… then your runtime is going to be 1372ms. Compare that to the 2017ms that it would take if you only had one node. If each one takes the same amount of time – let’s say 400ms – then your total runtime will be (roughly) 400ms.

The reduction would be a little more complicated if you’re looking at more than one color per shard, but not by much; you’d have a map of color and counts returned from the reduction already, so you’d simply be combining the results rather than building a map of the results in the first place.

Map/Reduce is a handy way to leverage multiple CPUs to gather data quickly – but it means you have to have your data sharded in the first place, in such a way that you can leverage the sharding. Hadoop and Gluster can do it, providing a filesystem as shards; in-memory products like Infinispan (produced by Red Hat, by whom I am employed), Gigaspaces, Coherence, Terracotta DSO, and even lowly GridGain can manage Map/Reduce, and the in-memory nature of the shards yields some truly impressive speed gains.

Map/Reduce is proven and useful, and one of the biggest logical drivers for “the cloud.”

Repost: Rocket Java: Use Maven.

Trying to build something destined for a JVM? Use Maven.

We know it sucks. We know you hate it. We know you’d prefer Gradle, or buildr, or Ant, or even make.

Tough. Just because Maven sucks is no reason to not use it.

It’s easy to say “but I know Ant,” or “Gradle has a nice declarative syntax,” etc. We know all these things.

The only time you should use a build tool other than Maven is when you’re being forced to (i.e., “your office is telling you to use Ant,” which happens distressingly often) or you’re playing around with a toy project that nobody else will see (which allows you to use Gradle, et al, on a lot of projects, realistically!), or you’re working with a community where everyone willingly and enthusiastically agrees to use some other tool.

For any project with real visibility to other people – like, a library that someone other than you will use – Maven’s your build tool, for better or for worse, even if you’re able to make a build that yields something that looks like it was built from Maven (and yes, Reinier, I’m looking at you.)


Well, let’s think about it.

Ant is good, okay? It’s an excellent general-purpose scripting environment (if you ignore the XML aspect), and has worked admirably for a long time; it is well-known. However, it leaves everything up to you; every project is, essentially, a custom build.

The build might support versioned artifacts. The build might be able to support dynamic dependencies. The build might run tests as a standard build process. The build might support targeted profiles. The build might be able to publish artifacts publicly. The build might be able to generate documentation.

That’s an awful lot of “mights.” Ant can do all of these things, no doubt (been there, done that, have the t-shirt) but the fact that every build is custom should scream “this is a problem.”

It’s easy to standardize Ant builds, too, but that’s an artificial imposition; Maven’s best feature is that it more or less nudges you in the right direction by default, requiring you to do extra stuff to not have decent project build phases, executed in the right order, with the right capabilities.

Since people are essentially lazy, they naturally end up doing what causes them the least work; therefore, Maven’s more-or-less-sensible defaults win.

That means you win.

Other build tools allow the Maven defaults and capabilities, without the Maven XML, of course. This is a strength, but the fact is that support for the other build tools is spotty; Maven is common enough that it’s everywhere, and you’re not going to have to wonder if Maven builds are supported by your toolset.

Your tools do support Maven. Period. (If you’re using Eclipse, that tool set support may be spotty, but what do you care? You’re using Eclipse. The rabbit’s already dead. And yes, I appreciate very subtle irony. Sue me.)

Why give in?

You don’t have to; you have your own projects and your own rules.

It could be argued that accepting Maven is “giving in” and not insisting on better tools. If that’s how you feel, that’s great; you don’t need my approval, certainly. But on the other hand, my goals in working with a project are usually “produce artifacts people can use,” not “explore neat new tools;” because I’m focused on yielding artifacts and not process, the tools I choose are focused on the same thing.

I’d strongly suggest you do the same.

Author’s Note: Another repost.

Repost: Rocket Java: That stupid classpath thing you should understand.

By far, the most common questions in Freenode ##java center around the concept of classpath. It’s funny, too, because the questions are often asked by people who – upon questioning – insist that they understand classpath, they really do… and then, upon having it explained to them in excruciating detail, manage to solve their problem.

By using the classpath properly.

To me, this sounds like the classpath documentation hosted by The Company Formerly Known As Sun goes unread – and when it is read, it isn’t understood.

Time to fix this.

You may think you know all this.

The classpath is a list of resources in which other resources are located. It’s a set of starting points.

It’s a list of resources. Because it’s a list of resources, the list has an OS-specific aspect to it.

In UNIX, that means the list uses “:” as a separator; in Windows, it uses “;” as a separator. I’m going to use UNIX semantics here, because I prefer UNIX to Windows, despite using Windows. If you see “foo:bar,” then, and you’re on Windows, you’ll need to use “foo;bar” instead.

The elements in the classpath can be one of two types: archives or directories.

An archive for Java is normally a .jar file, although it can be a .zip file as well (a .jar file uses the .zip format, so this makes sense.)

A directory is a filesystem directory, and can contain other directories or files, of course. (This isn’t rocket surgery.)

Resources in Java have paths; for classes, the path is composed of the package name plus the class name (and the .class extension.) Therefore, a class that looks like this:


package foo.bar;

public class Baz {
  public static void main(String[] args) {
    System.out.println("hello, world");
  }
}

… will have a resource path of “/foo/bar/Baz.class”.
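That package-to-path mapping can be reproduced at runtime from any Class object, which is handy when you need to double-check what path the classloader will actually be asked for (a small sketch, using java.lang.String as the example class):

```java
public class ResourcePathDemo {
    public static void main(String[] args) {
        // package name + class name + ".class", with dots turned into slashes
        String path = String.class.getName().replace('.', '/') + ".class";
        System.out.println(path); // prints java/lang/String.class
    }
}
```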

So let’s start applying this knowledge.

How the classpath works.

When you (or Java, or whatever) ask for a resource – let’s say a class, for example – it will look at the list of resources in the classpath, according to the classloader in use. (Pretend I didn’t include that last phrase about classloaders; it’s really important, but until you understand the classpath, classloaders are mad wizardry that you don’t need to acknowledge. Look at them after you understand the classpath.)

So let’s say we have two elements in the classpath: “foo.jar:bar”. This means that “foo.jar” in the current working directory is in the classpath, as is the “bar” directory in the current working directory.

Further, let’s say we are requesting the resource “,” our class from earlier in this post. We might be requesting it with the following command line:

java -cp foo.jar:bar

Java will then look inside foo.jar, looking for a file entry of “/foo/bar/Baz.class” – remember, this is our full resource path, generated by the compiler from that source file.

It uses foo.jar first because it’s first in the classpath.

If foo.jar contains “/foo/bar/Baz.class”, Java uses it and stops looking further in the classpath.

If foo.jar does not contain “/foo/bar/Baz.class”, then Java will look in the “bar” directory for “./foo/bar/Baz.class” — i.e., if the current working directory is “/home/username/app”, it will look in “/home/username/app/bar/” for “foo/bar/Baz.class” — with a full path being looked for of “/home/username/app/bar/foo/bar/Baz.class”.

If that file exists, Java will use it, and search no more.

If it doesn’t exist, it keeps searching until the entire classpath has been scanned; if the resource doesn’t exist, you’ll get a NoClassDefFoundError from java (or a null input stream, depending on what you’re doing.)
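You can watch the search succeed and fail without any setup at all: getResource() walks the classpath exactly as described, and returns null when no entry contains the resource (a sketch; the missing-resource path is obviously invented):

```java
public class LookupDemo {
    public static void main(String[] args) {
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        // A class that is always findable yields a URL...
        System.out.println(cl.getResource("java/lang/String.class") != null);
        // ...and a resource no classpath entry contains yields null.
        System.out.println(cl.getResource("no/such/Resource.class")); // prints null
    }
}
```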

If this is unclear to you, please let me know – I’ll try to figure out how to make it more clear. (Part of the problem is that people who understand the classpath don’t understand what those who don’t understand the classpath are misunderstanding.)

Author’s Note: Another repost.

Repost: Rocket Java: how to swap(int, int) and have it work… sort of

So it’s a known fact that Java passes references by value. That means that if I pass an Object of some kind into a method, I can change the values in the object, but not the object reference itself.

Likewise, if I pass an int into a method, I can change the value of that int within the method block, but that doesn’t affect the scope of the int outside that block.

This means that doing stuff like:

void swap(int a, int b) {
    // relax, folks, this is simpler to understand and closer to my main point
    int c=a; a=b; b=c;
}

… won’t work.

After you call that method, with something like this:

int a=1, b=2;
swap(a, b);
// a==2 now, and b==1, right? Errr... no.

… a will remain 1, b will remain 2. Inside the method, the values will change (because the references on the stack will have changed) but outside, nothing.
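Here’s the whole failure as one runnable class, to make the point concrete (the class name is mine):

```java
public class SwapDemo {
    static void swap(int a, int b) {
        // only swap's own copies on the stack change here
        int c = a; a = b; b = c;
    }

    public static void main(String[] args) {
        int a = 1, b = 2;
        swap(a, b);
        System.out.println(a + " " + b); // prints 1 2 -- unchanged
    }
}
```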

However… Reinier Zwitserloot (of Lombok and, which I’m going to start using because I’m broke and I want a Mac, darn it!) pointed something out today about which I honestly had no idea.

So, being a curious sort who likes to play with things, I started batting about … the AtomicInteger.

With this, you can pass around a reference that contains a reference to an immutable object. That sounds dull and everything, but it’s actually a formal Java-approved way to do something most of us have done when we needed stuff like this: it’s a container object, and we can use that container object to change out references cleanly.

It’s actually rather neat, even if it looks uglier in code. Here’s a swap(), that operates on AtomicIntegers instead of Integers, and works:

void swap(AtomicInteger i3, AtomicInteger i4) {
    Integer v = i3.get();
    i3.set(i4.get());
    i4.set(v);
}

There’s still some lameness here, though: the use of AtomicInteger as a concrete type. In and of itself, it’s fine, but… it means if we have other types, we have to overload that method name.

All together now: “Eww. There has to be a better way.” And there is: generics. We can apply them to the swap() method, and use AtomicReference as the parameter type, like this:

<T> void doSwap(AtomicReference<T> i3, AtomicReference<T> i4) {
    T middle=i3.get();
    i3.set(i4.get());
    i4.set(middle);
}

Calling it looks like this:

AtomicReference<Integer> i3 = new AtomicReference<Integer>(1);
AtomicReference<Integer> i4 = new AtomicReference<Integer>(2);
System.out.printf("i3=%d, i4=%d%n", i3.get(), i4.get());
doSwap(i3, i4);
System.out.printf("i3=%d, i4=%d%n", i3.get(), i4.get());

It’s a little verbose, but it works. Java programmers are used to working around it somewhat (hey, it’s been fifteen years of pass by value!) but it’s still cool to see this put in the language API – especially since they’re threadsafe.

However… note that these methods are not threadsafe! In order to be threadsafe, they need to use a latch, to make sure the references don’t change between assignments. That’s another entry altogether, though, and is actually a thorny subject.

Refer to Java Concurrency in Practice for more detail, or watch this space; I’ll eventually get to it.

Author’s Note: A rather brave repost, and the first of this series I’d considered not reposting.

Repost: Rocket Java: Timing Method Calls

My data stores benchmark gives out timings in terms of nanoseconds, by making a list of response times and then applying statistical methods to get data out of the list. However, one part of that comes up every now and then: how do you measure the time it takes to issue a call?

After all, this is essential for profiling and benchmarking, says the common wisdom.

Let’s get something out of the way, first: if you’re really trying to profile your code, you don’t want to use a rough timer. Counting milliseconds is not a good way to profile; use a profiler. Good ones that come to mind are jmp, JProbe, JFluid (part of NetBeans), DynaTrace, even VisualVM.

(If I had to choose one and only one from that list, I’d choose DynaTrace. This is not really an endorsement. It’s just the one I’ve found most useful for me.)

So: we’re working on the premise that we’re interested in very simple things: invocation times, let’s say, which is probably 90% of the intent when the subject comes up.

First, let’s create a test case. We’re not looking for elegance or validity – we’re just creating code that will run something, anything. First, I’ll create an interface, with two methods:

public interface Thingy {
    List<Integer> getResults();
    void build(int size);
}

Now I’ll implement it with a concrete class:

public class ThingyImpl implements Thingy {
    List<Integer> list = null;

    public List<Integer> getResults() {
        return list;
    }

    public void build(int size) {
        if (list == null) {
            list = new ArrayList<Integer>(size);
        } else {
            list.clear();
        }
        Random r = new Random();
        for (int i = 0; i < size; i++) {
            list.add(r.nextInt());
        }
        Collections.sort(list);
    }
}

This code isn’t, well, rocket surgery, obviously. It’s no great shakes whatsoever. It just builds a list and sorts it, and provides access to the sorted list. It’s a Widget.

So we could find the time it takes to run, say, 1000 of these like this:

Thingy thingy=new ThingyImpl();
long start=System.currentTimeMillis();
for(int i=0;i<1000;i++) {;
}
long end=System.currentTimeMillis();
long elapsedTime=end-start;

On my system, that takes 318 milliseconds, for an invocation time of 0.318 milliseconds per call. That’s not bad, all things considered. However, the measurement is very coarse. If we tracked the actual elapsed time for each build() call, instead of measuring start and finish times for the entire block, we’d get… a long stream of zeroes, followed by a 7 or an 8 (depending on your granularity, which varies from system to system and OS to OS), followed by a long stream of zeroes again…

So we can’t apply statistics to it, because it’s not accurate in any way. It’s too coarse. Our median value is going to come out a zero. That’s clearly not right; 1000 * 0 is 0, not 318. That’s too great a deviation to be useful.

So we can measure nanoseconds, now, as well, with System.nanoTime(). That’s more useful, but it’s more work.
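Here’s one shape per-call capture can take with System.nanoTime() – a sketch, with the work method invented for illustration. Note the preallocated array, so recording a sample costs almost nothing:

```java
public class NanoTiming {
    static long doWork() { // stand-in for the method being measured
        long sum = 0;
        for (int i = 0; i < 10_000; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        int runs = 1000;
        long[] timings = new long[runs]; // preallocated: O(1) to record each sample
        long blackhole = 0;              // use the results so the JIT can't drop them
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            blackhole += doWork();
            timings[i] = System.nanoTime() - start;
        }
        java.util.Arrays.sort(timings);
        System.out.println("median: " + timings[runs / 2] + "ns (" + blackhole + ")");
    }
}
```

With a full array of samples you can pull out a median, percentiles, and deviation – the statistics that a single coarse elapsed-time measurement can’t give you.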

I used Java’s dynamic proxies to actually capture the timing data in a list. The proxies are pretty simple, but they look worse. Here’s the proxy I used for testing the Thingy:

public class SpeedProxy implements InvocationHandler {
    Object object;
    ExecutorService service = new ScheduledThreadPoolExecutor(20);

    public SpeedProxy(Object object) {
        this.object = object;
    }

    public Object invoke(Object proxy, final Method method, Object[] args) throws Throwable {
        final long startTime = System.nanoTime();
        Object returnValue = method.invoke(object, args);
        final long endTime = System.nanoTime();

        // do SOMETHING, anything, with the time here
        service.submit(new Runnable() {
            public void run() {
                System.out.println("runtime for " +
                        method.getName() + ": " +
                        (endTime - startTime) + "ns");
            }
        });
        return returnValue;
    }
}

There are a lot of things to look at here. But first! Don’t use this. The way it logs information (through the executor) is insufficient. You will slaughter your performance with this. The actual one I used for the test had an object that captured data with O(1) access time (it used a preallocated array) so it didn’t schedule anything at all. The scheduled executor here is convenient, but will kill your performance.

Don’t say I didn’t warn you.

Anyway, so this is really simple: the invoke() method takes an object (the object the proxy operates on), the method reference itself, and then any arguments. Then it calls the Method reference with the object reference (the instance on which the method is operating) with the arguments.

It surrounds this with longs that represent starting nanotimes and ending nanotimes. Values that come out of this will be in the tens of thousands – but considering an eyeblink takes tens of millions of these, it’s still pretty bloody fast.

And it’s reasonably accurate. (If you need more, use an oscilloscope. Way beyond scope of this post.)

The last thing this does is, of course, to schedule data capture. I did it this way because logging synchronously like this is horribly expensive; I just pushed off the delay and let the method return.

So how do we use this? Well, with more ugly code, of course. Here’s how I got a Thingy reference with the proxy installed:

Thingy thingy2 = (Thingy) Proxy.newProxyInstance(thingy.getClass().getClassLoader(),
                new Class<?>[]{Thingy.class},
                new SpeedProxy(new ThingyImpl()));

All together now: “Ewwwwww.”

However… this means that any call through the Thingy interface gets intercepted and logged (or tracked, or whatever you want to do.)

That raises a question, though: how much time does the proxy itself add to the method call? That’s easy to measure: do two tests, one without the proxy (manually tracking the elapsed nanoseconds) and the other with the proxy in place. The difference is the time taken by the proxy call.

On my system, it turns out that the direct build() invocation took 302637 nanoseconds (0.302 milliseconds). The proxied version took ever-so-slightly longer, at 324401 nanoseconds – the proxy introduced a 0.02 millisecond difference.

Some notes: yes, I ran a warmup before testing this, so the JIT had had a chance to fix up the runtimes. I also used the results of the build(), so that the JIT couldn’t just discard the result altogether. (Remember, we’re interested in the side-effect, but we don’t want the primary effect to go away.)

So. We’re remembering to use a profiler if we’re really trying to profile and not just track, but what else is there?

Lots of people are now rabidly shaking their heads, saying “Why didn’t he mention…” Well, this section is for you. Sort of. I’m not going to go into a lot of detail; there are too many options.

The Proxy mechanism here is very, very useful. However, it’s also very verbose; that Proxy.newProxyInstance() call is a horror. There should be a better way.

And there is: it’s aspects.

Aspects are, well, hmm. It’s not fair to say that aspects are limited to making this kind of proxy code easier to work with, because aspect-oriented programming can actually do a lot more. However, one use for aspects is to put wrappers before, around, and after method calls, much as we did here (although all we did was “around.”)

It’s worth considering, especially since aspects are how Spring does a lot of its transactional magic. See “Aspect Oriented Programming with Spring” from the Spring 3 docs for more information on this.

Of course, you don’t have to use Spring to use aspects; check out the AspectJ project, cglib, javassist, asm, and… well, these are the libraries that came to mind with no thought. Google might have many others.

Author’s Note: Reposted, but not necessarily updated.

Repost: Rocket Java: Spring Injection of Other Beans’ Properties

Woot, IRC FTW! Someone today asked a question about Spring that was actually relevant for once.

The problem was that he wanted to inject not a bean, but a bean’s referenced property into another bean.

Put more succinctly: he had a bean A, with a property B, and didn’t want to inject A into bean C, but wanted to inject the result of A.getB().

This is actually pretty easily done… as long as you’re on the Spring 3.0 revision. The key is the spring-expression module; it lets us, well, use expressions in our Spring configuration.

So let’s build a sample and show this puppy in action. First, we’ll create our two classes, as referenced above, using Lombok to create our mutators and accessors:

public class A {
   @Getter @Setter String b;
}

public class C {
   @Getter @Setter String d;
   public void run() { System.out.printf("%s%n", getD()); }
}

So our goal is to have run() output the value of A.b, even though C has no reference to A.

With spring-expression, the configuration file is really pretty simple. It looks like this:

    <bean id="a" class="sandbox.A">
        <property name="b" value="hello, world"/>
    </bean>
    <bean id="c" class="sandbox.C">
        <property name="d" value="#{a.b}"/>
    </bean>

The expression is the key. “#{a.b}” means to follow the object hierarchy – including bean references, which is where the “a” comes in – and use property b from reference a.

It’s as simple as pi. (Because, after all, pi is just 3.14159… wait. pi is an irrational number. If only I could make pie!)

Incidentally, as a bit of an aside: I used Lombok up there to add mutators and accessors via annotation, which is really convenient; however, if you use it in IDEA, note that the editor in IDEA isn’t able to infer the new methods, so it shows me lots of red. I really dislike warnings (and hints of warnings) in my code, and errors… horrors! (I’m error-phobic.)

But it runs, folks. I promise.

Author’s Note: Yet another repost.

Repost: Rocket Java: Initializing arrays of arrays

From IRC, from whence many “interesting” questions come:

“If I have a 4 dimensional integer array in Java, is there a way to initialise every element in it to a particular value?”

Well. First off, you don’t have a four-dimensional array. You have a one-dimensional array of one dimensional arrays of one dimensional arrays of one dimensional arrays.

Every one of these arrays has its own length; they’re not uniform. Every one of these arrays has its own reference in the heap; they’re not contiguous.

Therefore, we’ve eliminated C-like memset() calls, in one fell swoop. Swoop, us! Swoop! (Swooping is fun.)

The key here is to think about how arrays like that are even initialized: chances are, with lots of loops.

Let’s say you want a 4x4x4x4 array of integers. Here’s code to initialize one of the endpoints to the default values for an array of ints, zero:

int[][][][] myArray=new int[4][][][];
myArray[0]=new int[4][][];
myArray[0][0]=new int[4][];
myArray[0][0][0]=new int[4]; // we will be playing with this line some.

Now, you can always use different values if you drop the dimension on the last line and use an array initializer instead. Setting all four values to one would look like this:

myArray[0][0][0]=new int[] {1,1,1,1};

That’s not very flexible, though.

The Java API has Arrays.fill(), which works for multiple primitive types (and Object), but the first argument isn’t an array of arrays – it’s a single-dimensional array. There are forms of the method that take ranges, which is nice, but the version linked to there will copy a value to each element in the array. (By the way, Arrays.fill(Object[], Object) seems faintly useless to me – if only that were a closure!)

Therefore, we could replace that last line, again:

myArray[0][0][0]=new int[4];
java.util.Arrays.fill(myArray[0][0][0], 14);

That’s a little better, but not much.

The bottom line here: Java doesn’t have multidimensional arrays, so there’s no “optimization” that leverages contiguous memory available. You can always slice a flat array yourself (by creating an int[256], for example, and computing the indexes manually, which would give you the regular, contiguous layout that Java does not offer), but otherwise… no. No multidimensional arrays.

Since there are no multidimensional arrays (and no contiguous arrays), to fill an array of arrays with a single value, you get to loop. Somewhere. Sorry, folks.
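For completeness, that “loop somewhere” might look like this: nested for-each loops down to the innermost rows, with Arrays.fill handling the last dimension (class and method names are mine):

```java
import java.util.Arrays;

public class FillDemo {
    static int[][][][] filled(int value) {
        // new int[4][4][4][4] allocates all 85 arrays (1 + 4 + 16 + 64) up front
        int[][][][] a = new int[4][4][4][4];
        for (int[][][] cube : a)
            for (int[][] plane : cube)
                for (int[] row : plane)
                    Arrays.fill(row, value); // fill one innermost row at a time
        return a;
    }

    public static void main(String[] args) {
        System.out.println(filled(14)[3][2][1][0]); // prints 14
    }
}
```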

Author’s Note: Another repost.