ca.uottawa.balie
Class WekaLearner

java.lang.Object
  extended by ca.uottawa.balie.WekaLearner
All Implemented Interfaces:
java.io.Serializable

public class WekaLearner
extends java.lang.Object
implements java.io.Serializable

Methods to create, train and test a classification algorithm.

Author:
nadeaud
See Also:
Serialized Form

Constructor Summary
WekaLearner(weka.core.FastVector attrsMerged, java.lang.String[] attrlblMerged, java.lang.String[] classList, weka.core.Instances trainMerged)
           
WekaLearner(WekaAttribute[] pi_Attributes, java.lang.String[] pi_ClassAttributes)
          Creates a new classification algorithm.
 
Method Summary
 void AddTestInstance(java.lang.Object[] pi_Instance, java.lang.String pi_Class)
          Adds an instance to the test set.
 void AddTrainInstance(java.lang.Object[] pi_Instance, java.lang.String pi_Class)
          Adds an instance to the training set.
 void AddUnlabeledTrainInstance(java.lang.Object[] pi_Instance)
           
 double Classify(java.lang.Object[] pi_Instance)
          Classify an unseen instance using the learned classifier.
 double Cluster(java.lang.Object[] pi_Instance)
          Cluster an unseen instance using the learned clusterer.
 double[][] ConfusionMatrix()
          Gets the confusion matrix, from which precision and recall can be derived for each class.
 void CreateModel(weka.classifiers.Classifier pi_Classifier)
          Creates the classification model by learning from the training set.
 void CreateModel(weka.clusterers.Clusterer pi_Clusterer)
          Creates the cluster model by learning from the training set.
 weka.core.Instance CreateUnlabeledInstance(java.lang.Object[] pi_Instance)
           
 weka.classifiers.Evaluation EstimateConfidence()
          Approximates the training set error.
 weka.core.FastVector GetAttribute()
           
 java.lang.String[] GetAttributeList()
          Gets the list of attributes.
 java.lang.String[] GetClassList()
          Gets the list of classes.
 weka.core.Instances GetClusterCentroid()
           
 double[] GetDistribution(java.lang.Object[] pi_Instance)
          Classify an unseen instance using the learned classifier.
 weka.core.Instances GetTestingSet()
          Get the testing instances
 weka.core.Instances GetTrainingSet()
          Get the training instances
 int GetTrainingSetSize()
          Get the number of training instances
 double Likelihood(java.lang.Object[] pi_Instance, int pi_PositiveClassIndex)
          Classify an unseen instance using the learned classifier.
static void main(java.lang.String[] args)
          Test routine
 void SetDoubleOnly(boolean pi_Flag)
           
 void Shrink()
          Reduces the size of a classifier by deleting the corpora.
 java.lang.String TestModel()
          Test the learned model.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WekaLearner

public WekaLearner(WekaAttribute[] pi_Attributes,
                   java.lang.String[] pi_ClassAttributes)
Creates a new classification algorithm.

Parameters:
pi_Attributes - Array of attributes
pi_ClassAttributes - Class attribute

WekaLearner

public WekaLearner(weka.core.FastVector attrsMerged,
                   java.lang.String[] attrlblMerged,
                   java.lang.String[] classList,
                   weka.core.Instances trainMerged)

Method Detail

AddTrainInstance

public void AddTrainInstance(java.lang.Object[] pi_Instance,
                             java.lang.String pi_Class)
Adds an instance to the training set.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
pi_Class - The class of this instance
See Also:
WekaAttribute

AddUnlabeledTrainInstance

public void AddUnlabeledTrainInstance(java.lang.Object[] pi_Instance)

AddTestInstance

public void AddTestInstance(java.lang.Object[] pi_Instance,
                            java.lang.String pi_Class)
Adds an instance to the test set.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
pi_Class - The class of this instance
See Also:
WekaAttribute

SetDoubleOnly

public void SetDoubleOnly(boolean pi_Flag)

CreateUnlabeledInstance

public weka.core.Instance CreateUnlabeledInstance(java.lang.Object[] pi_Instance)

CreateModel

public void CreateModel(weka.classifiers.Classifier pi_Classifier)
Creates the classification model by learning from the training set.

Parameters:
pi_Classifier - The classification algorithm (from the Weka library, e.g. NaiveBayes, J48)

CreateModel

public void CreateModel(weka.clusterers.Clusterer pi_Clusterer)
Creates the cluster model by learning from the training set.

Parameters:
pi_Clusterer - The clustering algorithm (from the Weka library, e.g. k-means)

EstimateConfidence

public weka.classifiers.Evaluation EstimateConfidence()
Approximates the training set error.

Returns:
An evaluation object that exposes several error measures (e.g. mean absolute error)

TestModel

public java.lang.String TestModel()
Test the learned model.

Returns:
A summary string of the performance of the classifier

ConfusionMatrix

public double[][] ConfusionMatrix()
Gets the confusion matrix, from which precision and recall can be derived for each class.

Returns:
Two-dimensional array (X = ground truth, Y = guesses)
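
The following is a minimal sketch of how per-class precision and recall might be derived from the returned array. It assumes the first index is the ground-truth class and the second index is the guessed class, which is one reading of "X = ground truth, Y = guesses" and is not confirmed by this page. The class name ConfusionMatrixStats and the sample matrix are hypothetical and are not part of WekaLearner.

```java
public class ConfusionMatrixStats {

    // Recall for class c: correct guesses of c divided by all instances whose true class is c.
    // Assumption: matrix[truth][guess].
    public static double recall(double[][] matrix, int c) {
        double rowSum = 0.0;
        for (double v : matrix[c]) {
            rowSum += v;
        }
        return rowSum == 0.0 ? 0.0 : matrix[c][c] / rowSum;
    }

    // Precision for class c: correct guesses of c divided by all instances guessed as c.
    public static double precision(double[][] matrix, int c) {
        double colSum = 0.0;
        for (double[] row : matrix) {
            colSum += row[c];
        }
        return colSum == 0.0 ? 0.0 : matrix[c][c] / colSum;
    }

    public static void main(String[] args) {
        // Hypothetical two-class matrix; in practice it would come from ConfusionMatrix().
        double[][] m = { { 8, 2 }, { 1, 9 } };
        System.out.println("recall(0)    = " + recall(m, 0));
        System.out.println("precision(0) = " + precision(m, 0));
    }
}
```

If the array turns out to be indexed the other way around, the two methods simply swap roles.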

Classify

public double Classify(java.lang.Object[] pi_Instance)
Classify an unseen instance using the learned classifier. Output the most probable class.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
Returns:
The index of the most probable class
See Also:
WekaAttribute
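
Because the class index is returned as a double, a caller will usually want to turn it back into a class name. The sketch below shows one way to do that, assuming the index refers to a position in the array returned by GetClassList(); this page does not state that explicitly. The helper ClassIndexLookup and the class names used are hypothetical.

```java
public class ClassIndexLookup {

    // Converts a class index expressed as a double into the class name at that position.
    // Assumption: the index is a position in the array returned by GetClassList().
    public static String labelFor(double classIndex, String[] classList) {
        if (Double.isNaN(classIndex)) {
            throw new IllegalArgumentException("No class index available");
        }
        int i = (int) Math.round(classIndex);
        if (i < 0 || i >= classList.length) {
            throw new IllegalArgumentException("Class index out of range: " + classIndex);
        }
        return classList[i];
    }

    public static void main(String[] args) {
        // Hypothetical class list; in practice it would come from GetClassList().
        String[] classes = { "spam", "ham" };
        System.out.println(labelFor(1.0, classes));
    }
}
```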

Cluster

public double Cluster(java.lang.Object[] pi_Instance)
Cluster an unseen instance using the learned clusterer. Output the most probable cluster.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
Returns:
The index of the most probable cluster
See Also:
WekaAttribute

GetDistribution

public double[] GetDistribution(java.lang.Object[] pi_Instance)
Classify an unseen instance using the learned classifier. Output the likelihood of each class.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
Returns:
An array of probabilities, parallel to the possible classes
See Also:
WekaAttribute
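
Since the returned array is parallel to the possible classes, the most probable class can be recovered by taking the position of the largest entry. The sketch below illustrates this with a hypothetical helper, DistributionArgMax, and a made-up distribution; neither is part of WekaLearner.

```java
public class DistributionArgMax {

    // Returns the position of the largest probability; ties go to the lowest index.
    public static int mostProbable(double[] distribution) {
        if (distribution.length == 0) {
            throw new IllegalArgumentException("Empty distribution");
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical distribution; in practice it would come from GetDistribution().
        double[] distribution = { 0.1, 0.7, 0.2 };
        System.out.println("Most probable class index: " + mostProbable(distribution));
    }
}
```

Keeping the whole distribution, rather than only the winning index, lets a caller apply its own confidence threshold before accepting a classification.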

Likelihood

public double Likelihood(java.lang.Object[] pi_Instance,
                         int pi_PositiveClassIndex)
Classify an unseen instance using the learned classifier. Get the likelihood of a given class.

Parameters:
pi_Instance - The instance, an array of objects (can mix numeric and nominal attributes - see WekaAttribute)
pi_PositiveClassIndex - The index of a class
Returns:
the likelihood of the given class
See Also:
WekaAttribute

GetAttributeList

public java.lang.String[] GetAttributeList()
Gets the list of attributes.

Returns:
String array

GetAttribute

public weka.core.FastVector GetAttribute()

GetClassList

public java.lang.String[] GetClassList()
Gets the list of classes.

Returns:
String array

GetTrainingSet

public weka.core.Instances GetTrainingSet()
Get the training instances

Returns:
training instances

GetTestingSet

public weka.core.Instances GetTestingSet()
Get the testing instances

Returns:
testing instances

GetTrainingSetSize

public int GetTrainingSetSize()
Get the number of training instances

Returns:
training set size

GetClusterCentroid

public weka.core.Instances GetClusterCentroid()

Shrink

public void Shrink()
Reduces the size of a classifier by deleting the corpora. Only the learned model and the information on classes and attributes are kept.


main

public static void main(java.lang.String[] args)
Test routine

Parameters:
args -