Filter out acronyms and abbreviations (known to the lexicon) from the output

  • Short Description: Filter out acronyms and abbreviations (known to the lexicon) from the output.

  • Full Description:

    This is a useful filter to apply before other morphological mutations. Applying a morphological mutation to an acronym or abbreviation is probably unwise, because the result is likely to be a wrong, misleading, or spurious term.

    At most one output record is produced for each input term.

    This flow has no effect on the -m option: "none" is added at the end of the output.

  • Difference: None

  • Features:
    1. Filter out the input term if it is a known acronym or abbreviation.
    2. Punctuation and case in the input term are ignored.


  • Symbol: fa

  • Examples:
    
    shell> lvg -f:fa -n
    WACTA
    WACTA|WACTA|2047|16777215|fa|1|
    
    ABBA
    ABBA|-No Output-
    
    AIDS
    AIDS|-No Output-
    
    COLD
    COLD|-No Output-
    

  • Implementation Logic:
    1. Strip punctuations from the input term.
    2. Lowercase the input term.
    3. Perform an SQL query to check whether the normalized term is a known acronym or abbreviation.
    4. If it is, filter out the input term; otherwise, output the input term unchanged.
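
    The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the code in ToFilterAcronym.java: the class and method names are hypothetical, an in-memory set (seeded from the terms filtered in the examples above) stands in for the SQL lookup, and punctuation is assumed to be removed rather than replaced by spaces.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of the filter logic; not the actual ToFilterAcronym class.
class FilterAcronymSketch {
    // Assumption: stand-in for the lexicon's acronym/abbreviation table,
    // which the real tool consults through an SQL query.
    private static final Set<String> KNOWN_ACRONYMS = Set.of("abba", "aids", "cold");

    // Steps 1 and 2: strip ASCII punctuation, then lowercase.
    static String normalize(String term) {
        return term.replaceAll("\\p{Punct}", "").toLowerCase(Locale.ROOT);
    }

    // Steps 3 and 4: look up the normalized term; an empty result means
    // the term was filtered out, otherwise the original term is returned.
    static Optional<String> filterAcronym(String term) {
        if (KNOWN_ACRONYMS.contains(normalize(term))) {
            return Optional.empty();
        }
        return Optional.of(term);
    }

    public static void main(String[] args) {
        // "A.I.D.S." is an added illustration of punctuation being ignored.
        for (String term : new String[] {"WACTA", "ABBA", "AIDS", "COLD", "A.I.D.S."}) {
            System.out.println(term + "|" + filterAcronym(term).orElse("-No Output-"));
        }
    }
}
```

    In this sketch the original term, not the normalized form, is returned when the term passes the filter, which matches the WACTA example above, where the input appears unchanged in the output record.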

  • Source Code: ToFilterAcronym.java

  • Hierarchy: Object -> Transformation -> ToFilterAcronym