This library was originally implemented for the DIKU course Statistical Methods for Machine Learning. It contains the following algorithms:
Classification
- K nearest neighbours
- Linear discriminant analysis
- Multilayer perceptron (with backpropagation)
- Naive Bayes (also supports kernel density estimators, such as Epanechnikov and Gaussian kernels)
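To illustrate the simplest of the classifiers listed above, here is a minimal sketch of k-nearest-neighbour classification with majority voting. The class name `KnnSketch` and the method `classify` are hypothetical and chosen for illustration; they are not the library's API. The sketch assumes squared Euclidean distance and integer class labels.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; names are hypothetical, not the library's API. */
public class KnnSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the majority label among the k training points closest to query. */
    public static int classify(double[][] trainX, int[] trainY, double[] query, int k) {
        if (trainX.length == 0 || k < 1) {
            throw new IllegalArgumentException("need training data and k >= 1");
        }
        // Sort training indices by distance to the query point.
        Integer[] order = new Integer[trainX.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble(i -> squaredDistance(trainX[i], query)));

        // Count votes among the k nearest; on a tie, the label that
        // reached the winning count first (nearer neighbours) is kept.
        Map<Integer, Integer> votes = new HashMap<>();
        int best = trainY[order[0]];
        int bestVotes = 0;
        for (int j = 0; j < Math.min(k, order.length); j++) {
            int label = trainY[order[j]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestVotes) {
                bestVotes = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] trainX = {{0.0, 0.0}, {0.1, 0.2}, {5.0, 5.0}, {5.2, 4.9}};
        int[] trainY = {0, 0, 1, 1};
        System.out.println(classify(trainX, trainY, new double[]{4.8, 5.1}, 3));
    }
}
```

Sorting all training points costs O(n log n) per query, which is acceptable for small data sets; larger ones usually call for a spatial index.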
Preprocessing
- Data normalization (zero mean, unit variance)
- Data rescaling (data rescaled to lie between zero and one)
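The two preprocessing steps above can be sketched for a single feature column as follows. The class `PreprocessingSketch` and its methods are hypothetical names used for illustration, not the library's API. The sketch assumes the population variance (dividing by n) and maps a constant column to zeros to avoid division by zero.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical, not the library's API. */
public class PreprocessingSketch {

    /** Shifts and scales a column to zero mean and unit variance. */
    public static double[] normalize(double[] column) {
        double mean = Arrays.stream(column).average().orElse(0.0);
        // Population variance: mean of squared deviations.
        double variance = Arrays.stream(column)
                .map(v -> (v - mean) * (v - mean))
                .average().orElse(0.0);
        double std = Math.sqrt(variance);
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // A constant column has no spread; map it to 0 instead of dividing by zero.
            out[i] = std == 0.0 ? 0.0 : (column[i] - mean) / std;
        }
        return out;
    }

    /** Linearly maps a column so its minimum becomes 0 and its maximum becomes 1. */
    public static double[] rescale(double[] column) {
        double min = Arrays.stream(column).min().orElse(0.0);
        double max = Arrays.stream(column).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (column[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] feature = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(rescale(feature))); // prints [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(normalize(feature)));
    }
}
```

In practice the mean, standard deviation, minimum, and maximum should be computed on the training data only and then reused to transform the test data.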
Regression
- Linear regression
- Linear regression with basis functions
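As a rough illustration of linear regression with basis functions, the sketch below fits weights w so that the model y ≈ Σ_j w_j φ_j(x) minimizes the squared error, by solving the normal equations (ΦᵀΦ) w = Φᵀy. The library itself relies on JAMA for matrix operations; this sketch instead uses a small hand-written Gaussian elimination so that it stays self-contained. The class `BasisRegressionSketch` and its methods are hypothetical names, not the library's API.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Illustrative sketch only; names are hypothetical, not the library's API. */
public class BasisRegressionSketch {

    /** Fits least-squares weights for a one-dimensional input and the given basis functions. */
    public static double[] fit(double[] x, double[] y, DoubleUnaryOperator[] basis) {
        int m = basis.length;
        // Build the augmented normal equations [Phi^T Phi | Phi^T y].
        double[][] a = new double[m][m + 1];
        for (int i = 0; i < x.length; i++) {
            double[] phi = new double[m];
            for (int j = 0; j < m; j++) phi[j] = basis[j].applyAsDouble(x[i]);
            for (int r = 0; r < m; r++) {
                for (int c = 0; c < m; c++) a[r][c] += phi[r] * phi[c];
                a[r][m] += phi[r] * y[i];
            }
        }
        return solve(a);
    }

    /** Gaussian elimination with partial pivoting on an augmented m x (m+1) matrix. */
    static double[] solve(double[][] a) {
        int m = a.length;
        for (int col = 0; col < m; col++) {
            // Pick the row with the largest pivot for numerical stability.
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new ArithmeticException("normal equations are singular");
            }
            double[] tmp = a[col]; a[col] = a[pivot]; a[pivot] = tmp;
            for (int r = col + 1; r < m; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= m; c++) a[r][c] -= factor * a[col][c];
            }
        }
        // Back substitution.
        double[] w = new double[m];
        for (int r = m - 1; r >= 0; r--) {
            double sum = a[r][m];
            for (int c = r + 1; c < m; c++) sum -= a[r][c] * w[c];
            w[r] = sum / a[r][r];
        }
        return w;
    }

    public static void main(String[] args) {
        // Polynomial basis 1, x, x^2; the data follow y = 1 + x^2 exactly.
        DoubleUnaryOperator[] basis = { v -> 1.0, v -> v, v -> v * v };
        double[] x = {0, 1, 2, 3};
        double[] y = {1, 2, 5, 10};
        System.out.println(Arrays.toString(fit(x, y, basis)));
    }
}
```

Plain linear regression is the special case with basis functions 1 and x. For ill-conditioned bases, a QR or SVD solver is generally preferable to the normal equations.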
Testing
- k-fold cross validation
- Test set validation
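The splitting step behind k-fold cross validation can be sketched as below: the sample indices are divided into k folds, and each fold serves once as the test set while the remaining folds form the training set. The class `KFoldSketch` and the method `folds` are hypothetical names, not the library's API. The sketch assigns indices round-robin and does not shuffle, so it assumes the data are not ordered by class.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not the library's API. */
public class KFoldSketch {

    /** Splits indices 0..n-1 into k folds whose sizes differ by at most one. */
    public static List<List<Integer>> folds(int n, int k) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < k; f++) result.add(new ArrayList<>());
        // Round-robin assignment: index i goes to fold i mod k.
        for (int i = 0; i < n; i++) result.get(i % k).add(i);
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = folds(10, 3);
        for (int f = 0; f < folds.size(); f++) {
            List<Integer> test = folds.get(f);
            List<Integer> train = new ArrayList<>();
            for (int g = 0; g < folds.size(); g++) {
                if (g != f) train.addAll(folds.get(g));
            }
            // A real run would train on `train`, evaluate on `test`,
            // and average the k error estimates.
            System.out.println("fold " + f + ": train=" + train + " test=" + test);
        }
    }
}
```

Test set validation corresponds to a single such split, with one held-out portion used only for evaluation.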
The full Java source code is available on GitHub. Example usage for most classifiers is available here, and here for the multilayer perceptron.
Note that this library relies on JAMA for matrix operations.
Future plans
- Examples (and other documentation)
- K-means clustering
- Principal component analysis
Changelog
--- 2014/07/01: v0.1
Initial upload