java.lang.Object
ca.uottawa.balie.SentenceBoundariesRecognition
public class SentenceBoundariesRecognition
Methods for training, testing and using sentence boundary recognition. This class needs to be redesigned to improve usability. In its current form, it must be used as follows:
- The model must be obtained with GetModel().
- IsSentenceBoundary() must be called on each token.
- YieldSentenceBeginning() must be called each time a sentence break is found.
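The calling sequence described above can be sketched as follows. This is only an illustration under stated assumptions, not Balie's implementation: `ToySentenceBoundaries` and `SbrUsageSketch` are hypothetical stand-ins that reuse the documented method names, tokens are plain strings instead of `Token` objects, the model is a placeholder object instead of a `WekaLearner`, and a toy punctuation rule replaces the learned model. Passing `null` for a missing previous or next token is also an assumption; this page does not say what the real class expects at the edges of the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors the documented method names only.
class ToySentenceBoundaries {
    private int breaksSeen = 0;

    // Stand-in for GetModel(); the real method returns a WekaLearner.
    static Object GetModel() {
        return new Object();
    }

    // Stand-in for IsSentenceBoundary(). Toy rule (an assumption): a break
    // follows the current token if it is ".", "!" or "?" and the next token
    // is missing or starts with an upper-case letter.
    boolean IsSentenceBoundary(Object model, String sentenceBeginning,
                               String last, String current, String next) {
        boolean stop = current.equals(".") || current.equals("!") || current.equals("?");
        boolean nextStartsSentence = next == null
                || (!next.isEmpty() && Character.isUpperCase(next.charAt(0)));
        return stop && nextStartsSentence;
    }

    // Stand-in for YieldSentenceBeginning(); here it only counts breaks.
    void YieldSentenceBeginning() {
        breaksSeen++;
    }
}

class SbrUsageSketch {
    // Splits a token list into sentences using the three documented steps.
    static List<List<String>> splitSentences(List<String> tokens) {
        List<List<String>> sentences = new ArrayList<>();
        if (tokens.isEmpty()) {
            return sentences;
        }
        ToySentenceBoundaries sbr = new ToySentenceBoundaries();
        Object model = ToySentenceBoundaries.GetModel();      // step 1: obtain the model
        String sentenceBeginning = tokens.get(0);             // first token of the text
        List<String> sentence = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String last = i > 0 ? tokens.get(i - 1) : null;                  // token i-1
            String current = tokens.get(i);                                  // token i
            String next = i + 1 < tokens.size() ? tokens.get(i + 1) : null;  // token i+1
            sentence.add(current);
            // step 2: test every token
            if (sbr.IsSentenceBoundary(model, sentenceBeginning, last, current, next)) {
                sbr.YieldSentenceBeginning();                 // step 3: signal the break
                sentences.add(sentence);
                sentence = new ArrayList<>();
                sentenceBeginning = next;                     // next sentence starts here
            }
        }
        if (!sentence.isEmpty()) {
            sentences.add(sentence);                          // trailing tokens without a break
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("See", "p", ".", "4", "now", ".", "Done", ".");
        System.out.println(splitSentences(tokens));
    }
}
```

With the real class, the same loop shape would apply, with `Token` objects taken from the token list and the `WekaLearner` returned by `GetModel()` passed as the first argument.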
Constructor Summary | |
---|---|
SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific) | Initialize SBD algorithm. |
Method Summary | |
---|---|
static WekaLearner | GetModel(): Gets the learned model for SBR. |
boolean | IsSentenceBoundary(WekaLearner pi_Model, Token pi_SentenceBeginning, Token pi_LastToken, Token pi_CurrentToken, Token pi_NextToken): Check if a given token sequence is a sentence break (located after the "current" token) |
static void | main(java.lang.String[] args) |
boolean | SentenceIsAllCapitalized(): Check if a sentence is all capitalized |
static void | TrainSentenceBoundariesRecognition(): Trains and Tests the sentence boundary recognition model |
void | YieldSentenceBeginning(): This method should be called each time a sentence break is found |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific)
Initialize SBD algorithm.
Parameters:
pi_LanguageSpecific

Method Detail |
---|
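public static WekaLearner GetModel()
Gets the learned model for SBR.
Returns:
the learned model (see WekaLearner)

public boolean SentenceIsAllCapitalized()
Check if a sentence is all capitalized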

public boolean IsSentenceBoundary(WekaLearner pi_Model, Token pi_SentenceBeginning, Token pi_LastToken, Token pi_CurrentToken, Token pi_NextToken)
Check if a given token sequence is a sentence break (located after the "current" token)
Parameters:
pi_Model - SBD model (use the GetModel() method)
pi_SentenceBeginning - The first token of the sentence (use the first token of the text for the first sentence)
pi_LastToken - The previous token (i-1)
pi_CurrentToken - The current token under examination (i)
pi_NextToken - The next token in the token list (i+1)
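
public void YieldSentenceBeginning()
This method should be called each time a sentence break is found

public static void TrainSentenceBoundariesRecognition()
Trains and Tests the sentence boundary recognition model

public static void main(java.lang.String[] args)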