
My data structures have changed - how can I migrate my data model?

Sometimes people design data structures - persisted entities - only to find that (gasp!) those structures are more fluid than they expected. When all of your data is transient, meaning that it can be destroyed safely, that's no big deal: you just drop the tables in question (or the whole database) and start over. (Sound familiar, Rails fans?) But sometimes the data has to survive the change - so what do you do?

Sadly, you generally get to write a migration tool, or use one like openDBcopy. Rails (and its descendants, like Grails) has a specific database migration step, but it won't preserve your data for you if the change is severe enough (splitting a table, say, or changing what a column means) unless you write that conversion yourself.
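To give a feel for what "write a migration tool" means, here's a minimal sketch of the part that actually matters: converting each old row into the new shape. Everything in it is made up for illustration - the old `full_name` column, the new `first_name`/`last_name` columns, and the `MigrationSketch` class itself. In a real tool the rows would probably come from a JDBC `ResultSet` and go back out through batched inserts; plain maps stand in for both here so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MigrationSketch {

    // Hypothetical old schema: id, full_name.
    // Hypothetical new schema: id, first_name, last_name.
    static Map<String, Object> migrateRow(Map<String, Object> oldRow) {
        Map<String, Object> newRow = new LinkedHashMap<>();
        newRow.put("id", oldRow.get("id"));

        // Treat a missing name as empty rather than failing the whole run.
        Object raw = oldRow.get("full_name");
        String fullName = raw == null ? "" : raw.toString().trim();

        // Assumption: everything after the last space is the last name.
        int split = fullName.lastIndexOf(' ');
        if (split < 0) {
            newRow.put("first_name", fullName);
            newRow.put("last_name", "");
        } else {
            newRow.put("first_name", fullName.substring(0, split));
            newRow.put("last_name", fullName.substring(split + 1));
        }
        return newRow;
    }

    // Convert every row; the old rows are left untouched.
    static List<Map<String, Object>> migrate(List<Map<String, Object>> oldRows) {
        List<Map<String, Object>> newRows = new ArrayList<>();
        for (Map<String, Object> oldRow : oldRows) {
            newRows.add(migrateRow(oldRow));
        }
        return newRows;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("full_name", "Ada Lovelace");

        List<Map<String, Object>> oldRows = new ArrayList<>();
        oldRows.add(row);

        System.out.println(migrate(oldRows));
    }
}
```

The conversion never modifies the old rows, so you can run it against a copy, check the output, and run it again if the rules turn out to be wrong. The name-splitting rule is the sort of guess that tends to be wrong for some of your data, which is a big part of why these tools are no fun to write.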

The problem here is that relational databases are structured. Once that structure is firmed up it's like fired clay: you really can't change its shape without breaking it to pieces and starting over. If you don't start over, your database starts to look either hopelessly generic (and slow, and non-informative) or like an elephant that's had fins and wings bolted on.

It's not pretty.

If you'll pardon the soapbox, there is a solution: JSR-170, the Content Repository for Java Technology API, better known as the Java Content Repository (JCR). JCR stores data without enforcing structure, unless you really want it to, and that structure is easily versionable.

Of course, JCR isn't perfect. It's not going to replace a data warehouse (not now, at least), and the unstructured nature means that you end up wanting at least a passing familiarity with XPath - but there's also a huge benefit in being able to truly version your data instead of maintaining an audit log somewhere in your database.
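If XPath is new to you, here's a rough taste of what querying loosely structured data with it looks like. To be clear about the assumptions: this is NOT the JCR API. It uses the XPath engine that ships with the JDK (`javax.xml.xpath`) against a tiny XML document invented for the example, and a content repository's XPath dialect has its own node types and functions. The `XPathSketch` class and the `article`/`note` nodes are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {

    // Hypothetical content tree: nodes carry whatever attributes they like,
    // and no schema says which ones are required.
    static final String CONTENT =
          "<root>"
        + "<article title='Migrations' author='joe'/>"
        + "<article title='JCR' author='ann' reviewed='true'/>"
        + "<note author='ann'/>"
        + "</root>";

    // Count the nodes in CONTENT that match the given XPath expression.
    static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(CONTENT)),
                    XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Articles written by ann.
        System.out.println(countMatches("//article[@author='ann']"));
        // Anything at all written by ann, whatever kind of node it is.
        System.out.println(countMatches("//*[@author='ann']"));
        // Any node that happens to have a 'reviewed' attribute.
        System.out.println(countMatches("//*[@reviewed]"));
    }
}
```

The last two queries are the interesting ones: they ask about properties without caring what kind of node carries them, which is the habit you end up needing once nothing enforces a schema for you.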

DBAs see this kind of statement as a pestilence upon the earth, because the idea of storing unstructured data efficiently basically means that their ivory towers aren't needed and they lose their little kingships.

This is not the last word on this subject. When I say there's "a solution," well... here's a secret: there are lots of solutions, nearly all of them causing DBAs heart attacks.

But that probably doesn't help you change your data structures in mid-project, if you've already got data you can't lose. Sorry. :(



