Repost: Fun with XStream

XStream unmarshalling is great fun when you’re not working with a fixed schema.

I’ve been working on a quick start document for GigaSpaces’ data grid edition lately, and I’m doing it with the code in the form of tests. This makes writing it really easy (run the tests, make sure it works, if it fails, wash, rinse, repeat), but that’s not what this post is about.

For each test, I clear out the data grid, and then populate it; I then act on the grid in various ways to show operations.

One of the operations I’m testing is a query-by-example facility: you create an example object, populate a few fields (leaving the rest null, which is the wildcard by default), then ask the data grid to hand back all matching objects.
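To make the null-as-wildcard idea concrete, here is a minimal sketch of the matching rule in plain Java. It is not the GigaSpaces API; ExampleContact, QueryByExampleSketch, matches, and readAll are names invented for illustration, and a real data grid does this matching for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a contact; not a GigaSpaces class.
class ExampleContact {
    final String firstName;
    final String lastName;

    ExampleContact(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class QueryByExampleSketch {
    // A null field in the template matches anything; a non-null field must be equal.
    private static boolean fieldMatches(String wanted, String actual) {
        return wanted == null || wanted.equals(actual);
    }

    static boolean matches(ExampleContact template, ExampleContact candidate) {
        return fieldMatches(template.firstName, candidate.firstName)
                && fieldMatches(template.lastName, candidate.lastName);
    }

    // Returns every stored contact that matches the template.
    static List<ExampleContact> readAll(List<ExampleContact> stored,
                                        ExampleContact template) {
        List<ExampleContact> result = new ArrayList<>();
        for (ExampleContact candidate : stored) {
            if (matches(template, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<ExampleContact> stored = Arrays.asList(
                new ExampleContact("Ada", "Lovelace"),
                new ExampleContact("Ada", "Byron"),
                new ExampleContact("Alan", "Turing"));
        // Only firstName is set, so lastName acts as a wildcard.
        List<ExampleContact> found = readAll(stored, new ExampleContact("Ada", null));
        System.out.println(found.size() + " contacts named Ada");
    }
}
```

A template with every field left null matches everything, which is why the example object only needs the fields you actually care about.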

However, this means the example object and the stored objects have to share an object hierarchy. If you have two unrelated branches of objects, an example from one branch won’t match objects stored from the other.

So here’s what I have:

A POJO with a few fields (id, first name, last name, a list of addresses).
A document (a type of Map) with accessors and mutators that modify and query the map, so setFirstName(String name) just performs this.setProperty("firstName", name). Exciting, exciting, I know - but the documents are schemaless, so theoretically (and, well, in real life) I can dynamically add properties to the type at will.
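For readers who haven’t seen a map-backed type before, here is roughly what the document flavor looks like. This is a sketch under assumptions: MapBackedContact is a made-up name, and it wraps a plain HashMap rather than the data grid’s real document type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real DocumentContact builds on the data grid's document type.
class MapBackedContact {
    // Every property lives in this map; there is no firstName field anywhere.
    private final Map<String, Object> properties = new HashMap<>();

    public void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    public Object getProperty(String name) {
        return properties.get(name);
    }

    // Typed accessors are thin wrappers over the map.
    public void setFirstName(String name) {
        setProperty("firstName", name);
    }

    public String getFirstName() {
        return (String) getProperty("firstName");
    }

    public static void main(String[] args) {
        MapBackedContact c = new MapBackedContact();
        c.setFirstName("Ada");
        // Schemaless: a property nobody declared up front.
        c.setProperty("nickname", "Countess");
        System.out.println(c.getFirstName() + " / " + c.getProperty("nickname"));
    }
}
```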

Nothing too spectacular here - except these are different object trees (DocumentContact has no relation to NativeContact, although they both implement Contact). When I was loading the data, I loaded the items as NativeContacts, because it was simple.

So what I needed to do was clear: load the contacts as DocumentContacts instead of NativeContacts.

At first, I thought, “Oh, that’s simple: instead of aliasing the ‘contact’ node to NativeContact, alias it to DocumentContact.”

This failed in neat and dramatic ways.

The reason: XStream is really quite good at converting streams of text to objects and back, and it does a very complete conversion… by using direct field access. So it doesn’t use accessors like getFirstName() or setFirstName() - it goes straight to the firstName field itself. (This is how it can modify read-only properties, for example.)

But with the DocumentContact… there are no fields. It’s a map. So XStream barfs, because it can’t find fields that match the nodes in the input.
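The difference is easy to reproduce with plain reflection. The sketch below is not XStream’s internal code; it only illustrates the general mechanism of field access, and FieldBackedContact, PropertyBagContact, and trySetField are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Has a real firstName field but no setter: read-only through its public API.
class FieldBackedContact {
    private String firstName;

    public String getFirstName() {
        return firstName;
    }
}

// Map-backed: there is no firstName field to find.
class PropertyBagContact {
    private final Map<String, Object> properties = new HashMap<>();
}

class FieldAccessSketch {
    // Sets a declared field by name, bypassing accessors.
    // Returns false when the class declares no such field.
    static boolean trySetField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, value);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Could not write " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        FieldBackedContact pojo = new FieldBackedContact();
        System.out.println(trySetField(pojo, "firstName", "Ada"));
        System.out.println(trySetField(new PropertyBagContact(), "firstName", "Ada"));
    }
}
```

The first call succeeds even though there is no setter; the second has nothing to write to, which is the map-backed situation in a nutshell.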

The solution, as the XStream gurus are now screaming at me, is to use a Converter. A converter allows me to override XStream’s default resolution mechanisms with my own, so I can use method resolution if I want to. So here’s what I ended up with in my converter:

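    // Needs java.lang.reflect.Method, java.util.List, and XStream's
    // ConversionException, HierarchicalStreamReader, and UnmarshallingContext.
    @Override
    public Object unmarshal(HierarchicalStreamReader reader,
                            UnmarshallingContext unmarshallingContext) {
        try {
            DocumentContact c = new DocumentContact();

            // Check for children before moving down, so an empty node is not an error.
            while (reader.hasMoreChildren()) {
                reader.moveDown();
                String nodeName = reader.getNodeName();
                if (nodeName.equals("addresses")) {
                    // Let XStream unmarshal the nested list with its own converters.
                    @SuppressWarnings("unchecked")
                    List<Address> addresses =
                            (List<Address>) unmarshallingContext
                               .convertAnother(c, List.class);
                    c.setAddresses(addresses);
                } else {
                    // firstName -> setFirstName; every other node is a String property.
                    String methodName = "set" +
                            Character.toUpperCase(nodeName.charAt(0)) +
                            nodeName.substring(1);
                    Method m = c.getClass().getMethod(methodName, String.class);
                    m.invoke(c, reader.getValue());
                }
                reader.moveUp();
            }
            return c;
        } catch (Exception e) {
            // Keep the cause; a bare Error hides what actually went wrong.
            throw new ConversionException(e);
        }
    }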

Now, before you decide to rail at me, let’s look at the flaws in this code, because there are many.

Error checking! The actual code I use does things a little better, shall we say. The error checking here is simplified or eliminated altogether because it's ginormous.
Performance. No caching, no memoization. In real production code, I'd be saving off all those calculated fields (methodName, m), and might even memoize a storage mechanism (strategy pattern for different field types, for example, since here we have an embedded collection). However: test code. Simple objects with either String attributes or a List of addresses. Considering the scope of the problem, a direct (and slow) solution is satisfactory.
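For the performance point, this is roughly what the caching could look like: resolve each setter once per node name and reuse the Method afterwards. It is a sketch, not the code I actually run; SetterCache and CachedContact are hypothetical names, and it only handles String setters, like the converter above.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SetterCache {
    private final Class<?> type;
    // Node name -> resolved setter, filled in lazily.
    private final Map<String, Method> setters = new ConcurrentHashMap<>();

    SetterCache(Class<?> type) {
        this.type = type;
    }

    // firstName -> setFirstName
    static String setterName(String nodeName) {
        return "set" + Character.toUpperCase(nodeName.charAt(0)) + nodeName.substring(1);
    }

    // Looks the setter up reflectively the first time only.
    Method setterFor(String nodeName) {
        return setters.computeIfAbsent(nodeName, name -> {
            try {
                return type.getMethod(setterName(name), String.class);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException("No String setter for node: " + name, e);
            }
        });
    }

    void set(Object target, String nodeName, String value) {
        try {
            setterFor(nodeName).invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set " + nodeName, e);
        }
    }
}

// Hypothetical target type for the demo.
class CachedContact {
    private String firstName;

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getFirstName() {
        return firstName;
    }

    public static void main(String[] args) {
        SetterCache cache = new SetterCache(CachedContact.class);
        CachedContact c = new CachedContact();
        cache.set(c, "firstName", "Ada");
        System.out.println(c.getFirstName());
    }
}
```

In a converter, the cache would live in a field so it survives across unmarshal calls; whether that is worth it for test code is another question.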

But this code can now repopulate my schemaless contacts from a document that was created from contacts with a schema.

It’s not earth-shattering, but I figured someone else might have similar issues with XStream in a similar schemaless environment (consider JCR or, of course, our Data Grid) and this might be helpful to show a solution.

Author’s note: repost.