Stochastic Discrimination Utilities
This page will eventually link to the various things I have released
to the public (usually under the GPL). For now it contains only my
implementation of Dr. Eugene Kleinberg's Stochastic Discrimination
algorithm ("SDUtils"), available here.
From Dr. Kleinberg's abstract to the paper linked below:
Stochastic discrimination is a general methodology for constructing classifiers appropriate for pattern recognition. It is
based on combining arbitrary numbers of very weak components, which are usually generated by some pseudorandom process, and it
has the property that the very complex and accurate classifiers produced in this way retain the ability, characteristic of their weak
component pieces, to generalize to new data. In fact, it is often observed, in practice, that classifier performance on test sets continues
to rise as more weak components are added, even after performance on training sets seems to have reached a maximum. This is
predicted by the underlying theory, for even though the formal error rate on the training set may have reached a minimum, more
sophisticated measures intrinsic to this method indicate that classifier performance on both training and test sets continues to improve
as complexity increases.
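As a rough illustration of the idea in the abstract, here is a toy two-class sketch in Python. It is not the SDUtils code and it is not a faithful rendering of Dr. Kleinberg's method: the choice of random axis-aligned boxes as weak components, the names make_weak_model, coverage, train, score and classify, and the min_gap threshold are all assumptions made for this example, and the uniformity requirement discussed in the papers is left out entirely. Each weak component is kept only if it covers class 1 training points at a higher rate than class 0 points, and a point's score is the average of a normalized "covered or not" value over all kept components.

```python
import random

def make_weak_model(dim, rng):
    """Draw a random axis-aligned box in [0, 1]^dim (hypothetical weak model)."""
    return [tuple(sorted((rng.random(), rng.random()))) for _ in range(dim)]

def covers(box, x):
    """True if point x lies inside the box."""
    return all(lo <= xi <= hi for (lo, hi), xi in zip(box, x))

def coverage(box, points):
    """Fraction of the given points that the box covers."""
    return sum(covers(box, p) for p in points) / len(points)

def train(class1, class0, n_models, rng, min_gap=0.05):
    """Collect boxes that cover class 1 noticeably more often than class 0.

    min_gap is an assumed enrichment threshold, chosen only for this sketch.
    """
    dim = len(class1[0])
    models = []
    while len(models) < n_models:
        box = make_weak_model(dim, rng)
        r1 = coverage(box, class1)
        r0 = coverage(box, class0)
        if r1 - r0 >= min_gap:
            models.append((box, r1, r0))
    return models

def score(models, x):
    """Average of (covered - r0) / (r1 - r0) over all weak models.

    By construction, each model's term averages to 1 over the class 1
    training points and to 0 over the class 0 training points.
    """
    total = sum((float(covers(box, x)) - r0) / (r1 - r0)
                for box, r1, r0 in models)
    return total / len(models)

def classify(models, x):
    """Assign class 1 when the averaged score exceeds one half."""
    return 1 if score(models, x) > 0.5 else 0

def make_toy_data(n, rng):
    """Two synthetic clusters in the unit square, for demonstration only."""
    class1 = [(rng.uniform(0.5, 1.0), rng.uniform(0.5, 1.0)) for _ in range(n)]
    class0 = [(rng.uniform(0.0, 0.5), rng.uniform(0.0, 0.5)) for _ in range(n)]
    return class1, class0

if __name__ == "__main__":
    rng = random.Random(0)
    class1, class0 = make_toy_data(100, rng)
    models = train(class1, class0, n_models=200, rng=rng)
    print(classify(models, (0.8, 0.8)), classify(models, (0.2, 0.2)))
```

Adding more weak components in this sketch only changes how many terms are averaged in score, which is the sense in which the classifier is built by "combining arbitrary numbers of very weak components". How well such a toy generalizes depends on details the sketch ignores.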
The SDUtils documentation makes no attempt to describe the algorithm
itself. Interested parties should start with the papers available on
Dr. Kleinberg's own stochastic discrimination site.
This is the paper that got me started.
If you wind up using this implementation for something cool, I'd be delighted to hear about it.
You can find my email address in the documentation. Good luck.
- David C. Lambert
11 September 2002