Repost: Adding high availability in Terracotta DSO

Terracotta DSO is a package for distributing references in a heap across virtual machines. (Thus: Java. I thought it included C#, but Geert Bevin reminded me that I’m an idiot.) That means that if you have a Map, for example, you can set it to be shared, and your application can share it with other VMs without having to be coded for that purpose.

The only real requirement on the part of your code is that it be written correctly. As in, really correctly. (Want help with this? Check out Using Terracotta for Configuration Management, an article I helped write for TheServerSide.com.)

Luckily, DSO helps you do that by pointing out failures when you screw up.
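For instance, here's roughly what client code looks like once a field has been declared as a shared root. This is a minimal sketch – the class, field, and configuration fragment are mine, invented for illustration – but it shows the basic contract: you declare the root in tc-config.xml, and every mutation of the shared object happens under synchronization so DSO can promote the monitor to a cluster-wide lock.

import java.util.HashMap;
import java.util.Map;

public class SharedCounter {
    // Declared as a DSO root in tc-config.xml, something like:
    //   <roots>
    //     <root><field-name>SharedCounter.counts</field-name></root>
    //   </roots>
    // Once declared, every client VM sees the same map.
    private static final Map<String, Integer> counts =
        new HashMap<String, Integer>();

    public static void increment(String key) {
        // Mutations must be synchronized; with a matching lock section
        // in the config, DSO turns this monitor into a cluster-wide lock.
        synchronized (counts) {
            Integer current = counts.get(key);
            counts.put(key, current == null ? 1 : current + 1);
        }
    }
}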

Anyway, topologically, DSO works through a hub-and-spoke architecture: your VMs are clients, tied to a central server. The hub, then, might look like a single point of failure; if your hub goes down, your clients stop working.

That would be bad. Luckily, DSO has availability options to change the nature of what the “hub” looks like.

The DSO client uses a configuration file, named "tc-config.xml" by default, with a reference to a server instance that looks something like this:

<servers>
  <server host="localhost">
    <data>%(user.home)/terracotta/server-data</data>
    <logs>%(user.home)/terracotta/server-logs</logs>
  </server>
</servers>

Note the server hostname: it’s important, surprisingly enough. Adding high availability to DSO is only a matter of changing this block in your configuration file.

What we’re configuring here is active/passive failover: there’s a primary (active) server instance and a passive server instance. The two instances sync up, so if the primary goes down, the passive already has all of its data. Clients switch from one server to the other as active status changes, so from the client’s perspective, if the primary server dies, nothing happens.

When the former primary instance returns, well, it’s not the primary any more; it becomes the passive server instance. From the client perspective, all of this is under the covers.

This is Very Good.

So: here’s a configuration I used, tested with one DSO server running on Vista at IP 192.168.1.106 and the other running under a Windows XP VMware image at IP 192.168.1.132:

<servers>
  <server host="192.168.1.106" name="m1">
    <data>%(user.home)/terracotta1/server-data</data>
    <logs>%(user.home)/terracotta1/server-logs</logs>
    <dso>
      <persistence>
        <mode>permanent-store</mode>
      </persistence>
    </dso>
  </server>
  <server host="192.168.1.132" name="m2">
    <data>%(user.home)/terracotta2/server-data</data>
    <logs>%(user.home)/terracotta2/server-logs</logs>
    <dso>
      <persistence>
        <mode>permanent-store</mode>
      </persistence>
    </dso>
  </server>
</servers>

When you start the server, you use a command-line argument, like this, with the configuration file in $TERRACOTTA/bin:

./start-tc-server.sh -n m1

This tells the server instance to use the server block named "m1" – yes, IP addresses instead of host names; I didn’t have host names set up and I wasn’t sure how NAT would affect what I was doing – and the two instances will work out between them which is active and which is passive.
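On the second machine, you do the same thing with the other name:

./start-tc-server.sh -n m2

From there the two instances sort out active and passive between themselves; there’s no "primary" flag to pass.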

The end result is that I was able to run my shared map program while changing the servers around – starting and stopping them in-process, for example – without the client program knowing or caring. (Note that the client program needs the same basic configuration data; it can get this by having its own copy of the configuration file, which is what I did, or it can load it from the servers, too, or any number of variants. It just needs the relevant data. Happy now, Geert?)
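As an aside on that last point: if I recall the client tooling correctly, fetching the configuration from a running server instead of keeping a local copy is just a matter of pointing the tc.config system property at a server’s host and DSO port when launching through the DSO launcher – something like this, where the main class is a placeholder and 9510 is the default DSO port:

./dso-java.sh -Dtc.config=192.168.1.106:9510 my.app.Main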

A side benefit of this configuration is that the DSO cache is persistent – you don’t need high availability to get persistence, but you do need persistence to get high availability.

It’s good stuff.

BTW – a total aside, but the hub and spoke architecture of DSO is not the only network topology available. Systems like GigaSpaces use a peer architecture and a different lookup mechanism.

GigaSpaces clients run in one of two modes – I’ll use my own names for them: pure clients and participants. A pure client is one like the DSO client: something that connects to an external resource.

A participant client actually runs a GigaSpace itself and is part of the cloud: the resources the participant contributes become part of the entire GigaSpace, and it can hold data, pick up tasks, participate in service-level enforcement, and so on.

Participant clients are awesome. They’re also harder to document, so expect me to talk more about them sometime soon.

The way servers are located by either kind of client in a GigaSpace differs from DSO, too. DSO has a list of servers; a GigaSpace can use a server name, but it can also discover servers itself through Jini lookup. This means servers can be added to the cloud transparently, with no client or server configuration changes at all.
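To make that concrete, here’s a sketch of a lookup using the OpenSpaces API – from memory, with the space name invented – showing the Jini-style URL; an embedded URL like "/./mySpace" would instead start the space inside the client VM, which is what makes a participant a participant:

import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.UrlSpaceConfigurer;

public class SpaceLookup {
    public static void main(String[] args) {
        // Jini multicast lookup: find any space named "mySpace" on the
        // network. Note that no server host appears in the client at all.
        UrlSpaceConfigurer configurer =
            new UrlSpaceConfigurer("jini://*/*/mySpace");
        GigaSpace space =
            new GigaSpaceConfigurer(configurer.space()).gigaSpace();

        System.out.println("Connected to: " + space.getSpace().getName());
    }
}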

Which one is better? Well… that’s up to you. Personally, I find the GigaSpaces approach to be more powerful and more useful. That said, it’s more invasive and alters your architecture. The alteration is for the better, in my experience, but changing architecture is a scary proposition; DSO allows you to distribute your data without invasive coding at all, and that can be horribly useful.

Author’s Note: This is one of the old blog posts I decided to rescue from the migration to my new site. I don’t remember when it was originally posted. Anyway, it had some interesting things I thought were worth saving.
