ca.uottawa.balie
Class LexiconOnDisk

java.lang.Object
  extended by ca.uottawa.balie.LexiconOnDisk
All Implemented Interfaces:
LexiconOnDiskI, java.io.Serializable

public class LexiconOnDisk
extends java.lang.Object
implements LexiconOnDiskI, java.io.Serializable

Lexicon loader for baseline NER

Author:
nadeaud
See Also:
Serialized Form

Nested Class Summary
static class LexiconOnDisk.Lexicon
          Choice of lexicons
 
Constructor Summary
LexiconOnDisk(LexiconOnDisk.Lexicon pi_LexisonSet)
          Read lexicons that reside in text files.
 
Method Summary
 NamedEntityType GetEntityType(java.lang.String pi_Word, boolean pi_bFuzzy)
          Get the entity type for this word.
 NamedEntityType GetEntityTypeForAllFuzzyVariants(java.lang.String pi_Word)
          Get the entity type for this word and all its fuzzy variants.
 boolean IsEntity(java.lang.String pi_Word, boolean pi_bFuzzy)
          Check if a given word is listed in a lexicon
 NamedEntityTypeEnumI[] Types()
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LexiconOnDisk

public LexiconOnDisk(LexiconOnDisk.Lexicon pi_LexisonSet)
Read lexicons that reside in text files.

Parameters:
pi_LexisonSet - the choice of lexicons to read (a LexiconOnDisk.Lexicon value)
Method Detail

IsEntity

public boolean IsEntity(java.lang.String pi_Word,
                        boolean pi_bFuzzy)
Check if a given word is listed in a lexicon

Specified by:
IsEntity in interface LexiconOnDiskI
Parameters:
pi_Word - a word (can be a phrase)
Returns:
true if the word belongs to one or more lexicons
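
The membership check described above can be pictured with a minimal in-memory sketch. It does not use or reproduce the LexiconOnDisk implementation: the class LexiconLookupSketch, its add and isEntity methods, and the lexicon name "city" are all hypothetical. The meaning of pi_bFuzzy is not documented on this page, so the sketch treats the fuzzy flag as case-insensitive matching purely as a placeholder assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a set of lexicons; not the balie implementation.
public class LexiconLookupSketch {
    // Maps a lexicon name to the entries (words or phrases) it lists.
    private final Map<String, Set<String>> lexicons = new HashMap<>();

    // Registers an entry under the given lexicon name.
    void add(String lexiconName, String entry) {
        lexicons.computeIfAbsent(lexiconName, k -> new HashSet<>()).add(entry);
    }

    // Returns true if the word is listed in at least one lexicon.
    // ASSUMPTION: "fuzzy" is modelled as case-insensitive matching only.
    boolean isEntity(String word, boolean fuzzy) {
        for (Set<String> entries : lexicons.values()) {
            if (entries.contains(word)) {
                return true;
            }
            if (fuzzy) {
                for (String entry : entries) {
                    if (entry.equalsIgnoreCase(word)) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LexiconLookupSketch lexicon = new LexiconLookupSketch();
        lexicon.add("city", "New York");
        System.out.println(lexicon.isEntity("New York", false));
        System.out.println(lexicon.isEntity("new york", true));
    }
}
```

A phrase such as "New York" is stored and looked up as a single entry, which matches the note that pi_Word can be a phrase.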

GetEntityType

public NamedEntityType GetEntityType(java.lang.String pi_Word,
                                     boolean pi_bFuzzy)
Get the entity type for this word. The entity type is a bit flag that combines one or more entity types (see TokenConsts for the list of types).

Specified by:
GetEntityType in interface LexiconOnDiskI
Parameters:
pi_Word - a word (can be a phrase)
Returns:
the type(s)
See Also:
TokenConsts
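
The bit-flag idea behind the returned type can be illustrated with a small self-contained sketch. The class EntityFlagSketch and the constants NOTHING, PERSON, LOCATION and ORGANIZATION are hypothetical; they are not the values defined in TokenConsts, and the sketch uses plain int flags rather than the NamedEntityType class.

```java
// Hypothetical illustration of combining several entity types into one bit flag.
public class EntityFlagSketch {
    // ASSUMPTION: illustrative flag values only; the real list lives in TokenConsts.
    static final int NOTHING = 0;
    static final int PERSON = 1 << 0;
    static final int LOCATION = 1 << 1;
    static final int ORGANIZATION = 1 << 2;

    // Combines any number of single-type flags into one value with bitwise OR.
    static int combine(int... types) {
        int flags = NOTHING;
        for (int type : types) {
            flags |= type;
        }
        return flags;
    }

    // Tests whether the combined value contains the given single type.
    static boolean has(int flags, int type) {
        return (flags & type) != 0;
    }

    public static void main(String[] args) {
        // A word listed in both a person lexicon and a location lexicon.
        int ambiguous = combine(PERSON, LOCATION);
        System.out.println(has(ambiguous, PERSON));
        System.out.println(has(ambiguous, ORGANIZATION));
    }
}
```

Under this encoding a word that appears in several lexicons carries all of its types in a single value, and a caller tests for one type with a bitwise AND.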

GetEntityTypeForAllFuzzyVariants

public NamedEntityType GetEntityTypeForAllFuzzyVariants(java.lang.String pi_Word)
Get the entity type for this word and all its fuzzy variants.

Parameters:
pi_Word - a word (can be a phrase)
Returns:
the type(s) of this word and all its variants

Types

public NamedEntityTypeEnumI[] Types()
Specified by:
Types in interface LexiconOnDiskI