ca.uottawa.balie
Class SentenceBoundariesRecognition

java.lang.Object
  extended by ca.uottawa.balie.SentenceBoundariesRecognition

public class SentenceBoundariesRecognition
extends java.lang.Object

Methods for training, testing and using sentence boundary recognition. This class needs to be redesigned to improve usability. In its current form, it requires the following manipulations to be used correctly.

Author:
nadeaud

Constructor Summary
SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific)
          Initialize SBD algorithm.
 
Method Summary
static WekaLearner GetModel()
          Gets the learned model for sentence boundary recognition (SBR).
 boolean IsSentenceBoundary(WekaLearner pi_Model, Token pi_SentenceBeginning, Token pi_LastToken, Token pi_CurrentToken, Token pi_NextToken)
          Checks whether a sentence break occurs after the "current" token of the given token sequence.
static void main(java.lang.String[] args)
           
 boolean SentenceIsAllCapitalized()
          Checks whether a sentence is all capitalized.
static void TrainSentenceBoundariesRecognition()
          Trains and tests the sentence boundary recognition model.
 void YieldSentenceBeginning()
          This method should be called each time a sentence break is found
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SentenceBoundariesRecognition

public SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific)
Initialize SBD algorithm. Can be used in training, testing, or usage mode.

Parameters:
pi_LanguageSpecific -

Method Detail

GetModel

public static WekaLearner GetModel()
Gets the learned model for sentence boundary recognition (SBR).

Returns:
the Model (WekaLearner)

SentenceIsAllCapitalized

public boolean SentenceIsAllCapitalized()
Checks whether a sentence is all capitalized.

Returns:
true if the sentence is all capitalized

IsSentenceBoundary

public boolean IsSentenceBoundary(WekaLearner pi_Model,
                                  Token pi_SentenceBeginning,
                                  Token pi_LastToken,
                                  Token pi_CurrentToken,
                                  Token pi_NextToken)
Checks whether a sentence break occurs after the "current" token of the given token sequence.

Parameters:
pi_Model - SBD model (use the GetModel() method)
pi_SentenceBeginning - The first token of the sentence (use the first token of the text for the first sentence)
pi_LastToken - The previous token (i-1)
pi_CurrentToken - The current token under examination (i)
pi_NextToken - The next token in the tokenlist (i+1)
Returns:
True if there is a sentence break after the "current" token

YieldSentenceBeginning

public void YieldSentenceBeginning()
This method should be called each time a sentence break is found
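
The calling pattern implied by IsSentenceBoundary and YieldSentenceBeginning can be sketched as a loop over a four-token window: the first token of the current sentence, the previous token (i-1), the current token (i), and the next token (i+1). The sketch below is only an illustration under stated assumptions. It does not use the Balie classes: BoundaryOracle, PunctuationOracle and split are hypothetical stand-ins, plain strings replace Token, the punctuation rule replaces the learned WekaLearner model, and passing null for a missing previous or next token is an assumption, since this page does not say how edge tokens are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SentenceWindowSketch {

    /** Hypothetical stand-in for the recognizer; this is NOT the Balie API. */
    interface BoundaryOracle {
        boolean isSentenceBoundary(String sentenceBeginning, String last,
                                   String current, String next);
        void yieldSentenceBeginning();
    }

    /** Toy rule in place of a learned model: a break follows ".", "!" or "?". */
    static class PunctuationOracle implements BoundaryOracle {
        int yields = 0; // counts how often a new sentence beginning was signalled

        public boolean isSentenceBoundary(String sentenceBeginning, String last,
                                          String current, String next) {
            return ".".equals(current) || "!".equals(current) || "?".equals(current);
        }

        public void yieldSentenceBeginning() {
            yields++;
        }
    }

    /** Walks the token list and groups tokens into sentences. */
    static List<List<String>> split(List<String> tokens, BoundaryOracle oracle) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> sentence = new ArrayList<>();
        int begin = 0; // index of the first token of the current sentence

        for (int i = 0; i < tokens.size(); i++) {
            // Assumption: null marks a missing neighbour at either end of the text.
            String last = i > 0 ? tokens.get(i - 1) : null;
            String next = i + 1 < tokens.size() ? tokens.get(i + 1) : null;
            String current = tokens.get(i);
            sentence.add(current);

            if (oracle.isSentenceBoundary(tokens.get(begin), last, current, next)) {
                // Signal the break, then start a new sentence at token i+1.
                oracle.yieldSentenceBeginning();
                sentences.add(sentence);
                sentence = new ArrayList<>();
                begin = i + 1;
            }
        }
        // Keep trailing tokens that were not closed by a break.
        if (!sentence.isEmpty()) {
            sentences.add(sentence);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens =
                Arrays.asList("Hello", "world", ".", "How", "are", "you", "?");
        System.out.println(split(tokens, new PunctuationOracle()));
    }
}
```

With the real class, the corresponding steps would presumably be to obtain the model once through GetModel(), pass it to IsSentenceBoundary for each token position, and call YieldSentenceBeginning() whenever the result is true, using the token after the break as the next pi_SentenceBeginning.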


TrainSentenceBoundariesRecognition

public static void TrainSentenceBoundariesRecognition()
Trains and Tests the sentence boundary recognition model


main

public static void main(java.lang.String[] args)