SentenceBoundariesRecognition

ca.uottawa.balie
Class SentenceBoundariesRecognition

java.lang.Object
  ca.uottawa.balie.SentenceBoundariesRecognition

public class SentenceBoundariesRecognition
extends java.lang.Object

Methods for training, testing and using sentence boundary recognition. This class needs to be redesigned to improve usability. In its current form, it must be used as follows to work correctly.

It operates on a TokenList.
The calling module must first obtain the model with GetModel().
The caller must examine every token in order, keeping a lookahead of 1 token.
The method IsSentenceBoundary() must be called on each token.
When a sentence break is found, the method YieldSentenceBeginning() must be called.
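
The calling sequence above can be sketched as a loop. The sketch below does not use the Balie classes: BoundaryRecognizer, PunctuationRecognizer and split are hypothetical stand-ins, tokens are plain strings rather than Token objects, and the punctuation rule replaces the learned WekaLearner model. Passing null for a missing previous or next token is also an assumption; this page does not say how the real class treats the first and last tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SentenceLoopSketch {

    /** Hypothetical stand-in for the recognizer; NOT the Balie API. */
    interface BoundaryRecognizer {
        boolean isSentenceBoundary(String sentenceBeginning, String last,
                                   String current, String next);
        void yieldSentenceBeginning();
    }

    /** Toy rule used in place of the learned model: break after . ! or ? */
    static class PunctuationRecognizer implements BoundaryRecognizer {
        int breaksSeen = 0;

        public boolean isSentenceBoundary(String sentenceBeginning, String last,
                                          String current, String next) {
            return current.equals(".") || current.equals("!") || current.equals("?");
        }

        public void yieldSentenceBeginning() {
            breaksSeen++; // the real class presumably resets per-sentence state here
        }
    }

    /** Walks the tokens once, keeping a lookahead of one token. */
    static List<List<String>> split(List<String> tokens, BoundaryRecognizer recognizer) {
        List<List<String>> sentences = new ArrayList<>();
        if (tokens.isEmpty()) {
            return sentences;
        }
        List<String> sentence = new ArrayList<>();
        String sentenceBeginning = tokens.get(0); // first token of the text
        for (int i = 0; i < tokens.size(); i++) {
            String last = i > 0 ? tokens.get(i - 1) : null;                 // i-1
            String current = tokens.get(i);                                 // i
            String next = i + 1 < tokens.size() ? tokens.get(i + 1) : null; // i+1
            sentence.add(current);
            if (recognizer.isSentenceBoundary(sentenceBeginning, last, current, next)) {
                recognizer.yieldSentenceBeginning(); // required after every break
                sentences.add(sentence);
                sentence = new ArrayList<>();
                sentenceBeginning = next; // the next sentence starts at the lookahead token
            }
        }
        if (!sentence.isEmpty()) {
            sentences.add(sentence); // trailing tokens with no closing break
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("It", "works", ".", "Does", "it", "?");
        System.out.println(split(tokens, new PunctuationRecognizer()));
        // prints [[It, works, .], [Does, it, ?]]
    }
}
```

With the real class, the same loop would pass the model returned by GetModel() as the first argument of IsSentenceBoundary() and call YieldSentenceBeginning() on the recognizer instance after each detected break.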

Author:: nadeaud

Constructor Summary
`SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific)` Initialize SBD algorithm.

Method Summary
`static WekaLearner`	`GetModel()` Gets the learned model for SBR.
`boolean`	`IsSentenceBoundary(WekaLearner pi_Model, Token pi_SentenceBeginning, Token pi_LastToken, Token pi_CurrentToken, Token pi_NextToken)` Check if a given token sequence is a sentence break (located after the "current" token)
`static void`	`main(java.lang.String[] args)`
`boolean`	`SentenceIsAllCapitalized()` Check if a sentence is all capitalized
`static void`	`TrainSentenceBoundariesRecognition()` Trains and Tests the sentence boundary recognition model
`void`	`YieldSentenceBeginning()` This method should be called each time a sentence break is found

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

SentenceBoundariesRecognition

public SentenceBoundariesRecognition(LanguageSpecific pi_LanguageSpecific)

Initialize SBD algorithm. Can be used in training, testing or use modes.

Parameters:: pi_LanguageSpecific -

Method Detail

GetModel

public static WekaLearner GetModel()

Gets the learned model for SBR.

Returns:: the Model (WekaLearner)

SentenceIsAllCapitalized

public boolean SentenceIsAllCapitalized()

Check if a sentence is all capitalized

Returns:: true if the sentence is all capitalized

IsSentenceBoundary

public boolean IsSentenceBoundary(WekaLearner pi_Model,
                                  Token pi_SentenceBeginning,
                                  Token pi_LastToken,
                                  Token pi_CurrentToken,
                                  Token pi_NextToken)

Check if a given token sequence is a sentence break (located after the "current" token)

Parameters:: pi_Model - SBD model (use the GetModel() method); pi_SentenceBeginning - The first token of the sentence (use the first token of the text for the first sentence); pi_LastToken - The previous token (i-1); pi_CurrentToken - The current token under examination (i); pi_NextToken - The next token in the tokenlist (i+1)
Returns:: True if there is a sentence break after the "current" token

YieldSentenceBeginning

public void YieldSentenceBeginning()

This method should be called each time a sentence break is found

TrainSentenceBoundariesRecognition

public static void TrainSentenceBoundariesRecognition()

Trains and Tests the sentence boundary recognition model

main

public static void main(java.lang.String[] args)
