
Bayes for Java

Okay, fine, I'll write something about ci-bayes

I have a Java library for Bayesian analysis called ci-bayes. It's a working project, it's pretty fast, and I think it's better than the alternatives for Java.

In short, it's a classifier. You train it by teaching it that "X" looks like it should be classified as "Y," and "Z" should be classified as "A" -- and if you pass it a block of text, like "now is the time for all good men..." it will determine, to the best of its ability, which classification applies and tell you.
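To make the train-then-classify idea concrete, here's a minimal sketch. This is not ci-bayes' actual API: the TinyBayes class and its method names are made up for illustration, and it assumes a plain naive Bayes with add-one smoothing, which may not match what the library does internally.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a naive Bayes text classifier; NOT the ci-bayes API. */
class TinyBayes {
    // category -> (word -> number of training texts in that category containing the word)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // category -> number of training texts seen for that category
    private final Map<String, Integer> docCounts = new HashMap<>();
    private int totalDocs = 0;

    /** Splits text into a set of distinct lowercase words. */
    private static Set<String> features(String text) {
        Set<String> words = new HashSet<>();
        for (String w : text.toLowerCase().split("\\W+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    /** Teaches the classifier that this text belongs to this category. */
    void train(String text, String category) {
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String w : features(text)) {
            counts.merge(w, 1, Integer::sum);
        }
        docCounts.merge(category, 1, Integer::sum);
        totalDocs++;
    }

    /** log( P(category) * product of P(word | category) ), with add-one smoothing. */
    double logScore(String text, String category) {
        int docs = docCounts.getOrDefault(category, 0);
        Map<String, Integer> counts =
                wordCounts.getOrDefault(category, new HashMap<>());
        double score = Math.log((docs + 1.0) / (totalDocs + docCounts.size()));
        for (String w : features(text)) {
            int c = counts.getOrDefault(w, 0);
            score += Math.log((c + 1.0) / (docs + 2.0));
        }
        return score;
    }

    /** Returns the trained category with the highest score, or null if untrained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docCounts.keySet()) {
            double s = logScore(text, category);
            if (s > bestScore) {
                bestScore = s;
                best = category;
            }
        }
        return best;
    }
}
```

Train it with a handful of calls like `train("cheap casino money online", "spam")` and `train("java compiler generics and classes", "programming")`, then call `classify("cheap money now")` to get back the best-matching category. Nothing here limits it to two categories; you can train as many as you like.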

Where it differs from things like Classifier4J is that Classifier4J tells you if something is spam or ham -- i.e., it's an all-on, all-off classifier. ci-bayes is able to tell you what classification applies from a group of as many classifications as you like - and it can tell you how strong each classification was, too, so you can see how it graded everything.

(To be honest, though, the numbers are only meaningful in relation to each other - the internal grading is the result of a lot of math, and you tend to get very small numbers.)
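Here's a rough illustration of why the raw numbers come out so small and why only their ratios matter. It assumes the scores are products of many per-word probabilities, as in a textbook naive Bayes; that's an assumption for the sake of the example, and RelativeScores is a hypothetical helper, not something ci-bayes provides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, NOT part of ci-bayes: compares raw scores by their relative size. */
class RelativeScores {
    /** Multiplies a prior by each per-word probability; the product shrinks fast. */
    static double rawScore(double prior, double[] wordProbabilities) {
        double score = prior;
        for (double p : wordProbabilities) {
            score *= p;
        }
        return score;
    }

    /** Divides each raw score by the total, so the shares sum to 1 (all zeros if the total is 0). */
    static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double total = 0.0;
        for (double s : rawScores.values()) {
            total += s;
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            shares.put(e.getKey(), total == 0.0 ? 0.0 : e.getValue() / total);
        }
        return shares;
    }
}
```

Twenty words at a probability of 0.1 each already push a raw score below 1e-20, which looks alarming on its own. Normalizing a set of such scores against each other gives shares that are much easier to read, and it's the ordering and the gaps between them that carry the information.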

ci-bayes was originally ported directly from the Python code in Toby Segaran's book, "Programming Collective Intelligence," which is where the "ci" in the name came from. As I've worked with it, though, it's been tuned and changed to add features based on my needs, so it's no longer a direct port - and it's much faster than it was, too.


