ca.uottawa.balie
Class Balie

java.lang.Object
  extended by ca.uottawa.balie.Balie

public class Balie
extends java.lang.Object

Main entry point for training Balie. Most constants, including the verbose (debug) flags, are grouped in this class.
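
The debug flags are declared as static final boolean constants, so code can guard diagnostic output with a plain if statement on the flag. The following is a minimal sketch of that pattern, not Balie code: the class DebugFlagSketch, its tokenize method, the whitespace splitting and the flag value false are all assumptions made for illustration.

```java
// Stand-in for illustration only; the real constants live in ca.uottawa.balie.Balie.
public class DebugFlagSketch {

    // Assumed value; the actual value of Balie.DEBUG_TOKENIZER is not shown on this page.
    static final boolean DEBUG_TOKENIZER = false;

    // Hypothetical whitespace tokenizer (not Balie's): returns the tokens and
    // logs each one to stderr only when the flag is enabled.
    static String[] tokenize(String text) {
        String[] tokens = text.trim().split("\\s+");
        if (DEBUG_TOKENIZER) {
            for (String token : tokens) {
                System.err.println("token: " + token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the number of tokens found in the sample text.
        System.out.println(tokenize("Balie splits text").length);
    }
}
```

Because the flag is a compile-time constant, changing it means editing the constant and recompiling; it cannot be toggled at run time.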

Author:
nadeaud

Field Summary
static boolean ALLOW_FUZZY_MATCH
           
static boolean DEBUG_ABBREVIATION_LOOKUP
           
static boolean DEBUG_LANGUAGE_IDENTIFICATION
           
static boolean DEBUG_LIGATURE
           
static boolean DEBUG_NAMED_ENTITY_RECOGNITION
           
static boolean DEBUG_NAMED_ENTITY_RECOGNITION_FULL
           
static boolean DEBUG_PRINT_SBD_TEST_CORPUS
           
static boolean DEBUG_PUNCT_LOOKUP
           
static boolean DEBUG_TOKEN
           
static boolean DEBUG_TOKENIZER
           
static boolean DEBUG_UNBREAKABLE_LOOKUP
           
static java.lang.String ENCODING_DEFAULT
           
static java.lang.String ENCODING_LITTLE_INDIAN
           
static java.lang.String ENCODING_UTF8
           
static java.lang.String ENGLISH_TOKEN_LIST_ON_DISK
           
static java.lang.String FRENCH_TOKEN_LIST_ON_DISK
           
static java.lang.String GERMAN_TOKEN_LIST_ON_DISK
           
static java.lang.String ITALIAN_TOKEN_LIST_ON_DISK
           
static java.lang.String LANGUAGE_ENGLISH
           
static java.lang.String LANGUAGE_FRENCH
           
static java.lang.String LANGUAGE_GERMAN
           
static java.lang.String LANGUAGE_ID_MODEL
           
static java.lang.String LANGUAGE_ID_TESTING_CORPUS
           
static java.lang.String LANGUAGE_ID_TRAINING_CORPUS
           
static java.lang.String LANGUAGE_ITALIAN
           
static java.lang.String LANGUAGE_ROMANIAN
           
static java.lang.String LANGUAGE_SPANISH
           
static java.lang.String LANGUAGE_UNKNOWN
           
static int MAX_NUMBER_FUZZY_VARIANTS
           
static int MAX_TOKEN_PER_ENTITY
           
static int MIN_SIZE_FOR_FUZZY_MATCH
           
static java.lang.String OUT_LI_TEST_MODEL
           
static java.lang.String OUT_LI_TRAIN_MODEL
           
static java.lang.String OUT_SBD_TEST_MODEL
           
static java.lang.String OUT_SBD_TRAIN_MODEL
           
static java.lang.String ROMANIAN_TOKEN_LIST_ON_DISK
           
static java.lang.String SBR_MODEL
           
static java.lang.String SBR_TESTING_CORPUS_PC
           
static java.lang.String SBR_TRAINING_CORPUS_PC
           
static java.lang.String SPANISH_TOKEN_LIST_ON_DISK
           
static java.lang.String UNBREAK_TOKEN_LIST_ON_DISK
           
 
Constructor Summary
Balie()
           
 
Method Summary
static void main(java.lang.String[] args)
          Runs Balie and requests training of an individual module or of all modules.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LANGUAGE_UNKNOWN

public static final java.lang.String LANGUAGE_UNKNOWN
See Also:
Constant Field Values

LANGUAGE_ENGLISH

public static final java.lang.String LANGUAGE_ENGLISH
See Also:
Constant Field Values

LANGUAGE_FRENCH

public static final java.lang.String LANGUAGE_FRENCH
See Also:
Constant Field Values

LANGUAGE_SPANISH

public static final java.lang.String LANGUAGE_SPANISH
See Also:
Constant Field Values

LANGUAGE_GERMAN

public static final java.lang.String LANGUAGE_GERMAN
See Also:
Constant Field Values

LANGUAGE_ROMANIAN

public static final java.lang.String LANGUAGE_ROMANIAN
See Also:
Constant Field Values

LANGUAGE_ITALIAN

public static final java.lang.String LANGUAGE_ITALIAN
See Also:
Constant Field Values

LANGUAGE_ID_TRAINING_CORPUS

public static final java.lang.String LANGUAGE_ID_TRAINING_CORPUS
See Also:
Constant Field Values

LANGUAGE_ID_TESTING_CORPUS

public static final java.lang.String LANGUAGE_ID_TESTING_CORPUS
See Also:
Constant Field Values

OUT_LI_TRAIN_MODEL

public static final java.lang.String OUT_LI_TRAIN_MODEL
See Also:
Constant Field Values

OUT_LI_TEST_MODEL

public static final java.lang.String OUT_LI_TEST_MODEL
See Also:
Constant Field Values

LANGUAGE_ID_MODEL

public static final java.lang.String LANGUAGE_ID_MODEL
See Also:
Constant Field Values

SBR_TRAINING_CORPUS_PC

public static final java.lang.String SBR_TRAINING_CORPUS_PC
See Also:
Constant Field Values

SBR_TESTING_CORPUS_PC

public static final java.lang.String SBR_TESTING_CORPUS_PC
See Also:
Constant Field Values

OUT_SBD_TRAIN_MODEL

public static final java.lang.String OUT_SBD_TRAIN_MODEL
See Also:
Constant Field Values

OUT_SBD_TEST_MODEL

public static final java.lang.String OUT_SBD_TEST_MODEL
See Also:
Constant Field Values

SBR_MODEL

public static final java.lang.String SBR_MODEL
See Also:
Constant Field Values

ENGLISH_TOKEN_LIST_ON_DISK

public static final java.lang.String ENGLISH_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

GERMAN_TOKEN_LIST_ON_DISK

public static final java.lang.String GERMAN_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

FRENCH_TOKEN_LIST_ON_DISK

public static final java.lang.String FRENCH_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

SPANISH_TOKEN_LIST_ON_DISK

public static final java.lang.String SPANISH_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

ROMANIAN_TOKEN_LIST_ON_DISK

public static final java.lang.String ROMANIAN_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

ITALIAN_TOKEN_LIST_ON_DISK

public static final java.lang.String ITALIAN_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

UNBREAK_TOKEN_LIST_ON_DISK

public static final java.lang.String UNBREAK_TOKEN_LIST_ON_DISK
See Also:
Constant Field Values

ENCODING_DEFAULT

public static final java.lang.String ENCODING_DEFAULT
See Also:
Constant Field Values

ENCODING_UTF8

public static final java.lang.String ENCODING_UTF8
See Also:
Constant Field Values

ENCODING_LITTLE_INDIAN

public static final java.lang.String ENCODING_LITTLE_INDIAN
See Also:
Constant Field Values

ALLOW_FUZZY_MATCH

public static final boolean ALLOW_FUZZY_MATCH
See Also:
Constant Field Values

MIN_SIZE_FOR_FUZZY_MATCH

public static final int MIN_SIZE_FOR_FUZZY_MATCH
See Also:
Constant Field Values

MAX_NUMBER_FUZZY_VARIANTS

public static final int MAX_NUMBER_FUZZY_VARIANTS
See Also:
Constant Field Values

MAX_TOKEN_PER_ENTITY

public static final int MAX_TOKEN_PER_ENTITY
See Also:
Constant Field Values

DEBUG_TOKENIZER

public static final boolean DEBUG_TOKENIZER
See Also:
Constant Field Values

DEBUG_TOKEN

public static final boolean DEBUG_TOKEN
See Also:
Constant Field Values

DEBUG_LANGUAGE_IDENTIFICATION

public static final boolean DEBUG_LANGUAGE_IDENTIFICATION
See Also:
Constant Field Values

DEBUG_PRINT_SBD_TEST_CORPUS

public static final boolean DEBUG_PRINT_SBD_TEST_CORPUS
See Also:
Constant Field Values

DEBUG_PUNCT_LOOKUP

public static final boolean DEBUG_PUNCT_LOOKUP
See Also:
Constant Field Values

DEBUG_UNBREAKABLE_LOOKUP

public static final boolean DEBUG_UNBREAKABLE_LOOKUP
See Also:
Constant Field Values

DEBUG_ABBREVIATION_LOOKUP

public static final boolean DEBUG_ABBREVIATION_LOOKUP
See Also:
Constant Field Values

DEBUG_LIGATURE

public static final boolean DEBUG_LIGATURE
See Also:
Constant Field Values

DEBUG_NAMED_ENTITY_RECOGNITION

public static final boolean DEBUG_NAMED_ENTITY_RECOGNITION
See Also:
Constant Field Values

DEBUG_NAMED_ENTITY_RECOGNITION_FULL

public static final boolean DEBUG_NAMED_ENTITY_RECOGNITION_FULL
See Also:
Constant Field Values
Constructor Detail

Balie

public Balie()
Method Detail

main

public static void main(java.lang.String[] args)
Runs Balie and requests training of an individual module or of all modules.

Parameters:
args - One of -trainlangid, -trainsbd or -trainall
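
As a rough sketch of how the three documented flags could be dispatched, the stand-alone class below maps each flag to the modules it requests. Only the flag strings come from this page. The class TrainingArgsSketch, the Module enum and the modulesFor method are hypothetical, and the sketch assumes that -trainall covers exactly the two modules named by the other flags, which may not match Balie's actual behaviour.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of Balie: it only illustrates the documented flags.
public class TrainingArgsSketch {

    // Assumed module names; Balie's internal names may differ.
    enum Module { LANGUAGE_ID, SENTENCE_BOUNDARY }

    // Maps one documented flag to the set of modules it requests.
    static Set<Module> modulesFor(String flag) {
        switch (flag) {
            case "-trainlangid":
                return EnumSet.of(Module.LANGUAGE_ID);
            case "-trainsbd":
                return EnumSet.of(Module.SENTENCE_BOUNDARY);
            case "-trainall":
                // Assumption: "all" means every module listed in this sketch.
                return EnumSet.allOf(Module.class);
            default:
                throw new IllegalArgumentException("Unknown flag: " + flag);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: -trainlangid | -trainsbd | -trainall");
            return;
        }
        // Prints the modules that the given flag would train.
        System.out.println(modulesFor(args[0]));
    }
}
```

With the Balie classes on the classpath, the real entry point would typically be launched as java ca.uottawa.balie.Balie -trainall, passing the chosen flag as the single argument.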