Generate Uninflected Spelling Variants

  • Short Description: Generate known uninflected form spelling variants.

  • Full Description:

    This flow component returns the uninflected spelling variants (base forms).

    The results are sorted twice: first by spelling variant, then by uninflected term.

    The -m option returns the EUI for the uninflected form in the specific category.

  • Difference:
    1. The database table has changed, so the results differ. The main difference is that the uninflected form is now separated from the infinitive, present, positive, and so on.

    2. The C version merges categories when the output terms are the same, so a single output may include several categories. As a result, the EUI is not unique when the -m option is used.

    3. The -m option may fail to find the EUI because GetEui matches the uninflected term case-sensitively (the DB needs one more field for the uninflected term).
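
    To make item 2 concrete, the sketch below decodes a combined category value. It assumes the category field is a sum of bit values (noun = 128 and verb = 1024, as in the examples on this page) and that the remaining bit assignments follow the usual lvg ordering. The class and method names are hypothetical and are not part of lvg.

```java
import java.util.ArrayList;
import java.util.List;

class CategoryBits {
    // Assumed bit order: index i corresponds to the value 1 << i.
    static final String[] NAMES = {
        "adj", "adv", "aux", "compl", "conj", "det",
        "modal", "noun", "prep", "pron", "verb"
    };

    // Returns the names of all categories whose bit is set in the value.
    static List<String> decode(int category) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < NAMES.length; i++) {
            if ((category & (1 << i)) != 0) {
                out.add(NAMES[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(decode(128));        // [noun]
        System.out.println(decode(1024));       // [verb]
        System.out.println(decode(128 + 1024)); // [noun, verb]
    }
}
```

    With a merged value such as 1152, a single row for "resume" would stand for both the noun record (E0053099) and the verb record (E0053098), which is why one EUI cannot identify it.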


  • Features:
    1. Generate the uninflected form for all spelling variants.


  • Symbol: e

  • Examples:
    
    shell> lvg -f:e -m
    coloring
    coloring|color|1024|1|e|1|E0017903|
    coloring|colour|1024|1|e|1|E0017903|
    
    resume
    resume|resume|128|1|e|1|E0053099|
    resume|resume|1024|1|e|1|E0053098|
    resume|resumé|128|1|e|1|E0053099|
    resume|résumé|128|1|e|1|E0053099|
    
    ozena
    ozena|ozaena|128|1|e|1|E0044939|
    ozena|ozena|128|1|e|1|E0044939|
    ozena|ozoena|128|1|e|1|E0044939|
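
    The output rows above can be split on the "|" separator. The sketch below is one way to do that, assuming the fields are, in order: input term, output term, category, inflection, flow history, flow number, and (with -m) the EUI. The class and field names are hypothetical and are not part of lvg.

```java
class LvgOutputLine {
    final String input;
    final String output;
    final int category;
    final int inflection;
    final String flowHistory;
    final int flowNumber;
    final String eui; // empty when the row carries no EUI field

    LvgOutputLine(String[] f) {
        input = f[0];
        output = f[1];
        category = Integer.parseInt(f[2]);
        inflection = Integer.parseInt(f[3]);
        flowHistory = f[4];
        flowNumber = Integer.parseInt(f[5]);
        eui = f.length > 6 ? f[6] : "";
    }

    // Splits one row; the -1 limit keeps the empty field after a trailing "|".
    static LvgOutputLine parse(String line) {
        String[] f = line.split("\\|", -1);
        if (f.length < 6) {
            throw new IllegalArgumentException("too few fields: " + line);
        }
        return new LvgOutputLine(f);
    }

    public static void main(String[] args) {
        LvgOutputLine r = parse("coloring|colour|1024|1|e|1|E0017903|");
        System.out.println(r.output + " " + r.category + " " + r.eui);
    }
}
```

    The parser accepts rows with or without the trailing separator.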
    

  • Implementation Logic:
    1. Generate spelling variants (first sort).
    2. Uninflect all spelling variants (second sort).
    3. Retrieve EUI for each uninflected form of all spelling variants.
    4. Remove duplicated output LexItems.
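
    The four steps can be sketched with in-memory tables. This is a minimal illustration, not the actual source: the two maps are toy stand-ins for the lexicon database, their layout is an assumption, and all names are hypothetical. The fixed "1|e|1" fields simply mirror the examples on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class BaseSpellingVariantsSketch {
    // Toy table: inflected term -> all known spellings of that term.
    static final Map<String, List<String>> SPELLING_VARIANTS = Map.of(
        "coloring", List.of("colouring", "coloring"));

    // Toy table: inflected spelling -> "base|category|EUI" entries.
    static final Map<String, List<String>> BASE_FORMS = Map.of(
        "coloring", List.of("color|1024|E0017903"),
        "colouring", List.of("colour|1024|E0017903"));

    static List<String> mutate(String input) {
        // Step 1: generate spelling variants, sorted (first sort).
        Set<String> variants = new TreeSet<>(
            SPELLING_VARIANTS.getOrDefault(input, List.of(input)));

        // Step 4: a LinkedHashSet drops duplicate rows and keeps their order.
        Set<String> rows = new LinkedHashSet<>();
        for (String variant : variants) {
            // Step 2: uninflect each variant, sorted (second sort).
            Set<String> bases = new TreeSet<>(
                BASE_FORMS.getOrDefault(variant, List.of()));
            for (String base : bases) {
                // Step 3: the EUI comes with the base-form entry here.
                String[] p = base.split("\\|");
                rows.add(input + "|" + p[0] + "|" + p[1] + "|1|e|1|" + p[2] + "|");
            }
        }
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        mutate("coloring").forEach(System.out::println);
    }
}
```

    In this toy version the EUI is stored next to the base form, so step 3 needs no separate lookup. A real implementation would query the database for it.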

  • Source Code: ToBaseSpellingVariants.java

  • Hierarchy: Object -> Transformation -> ToBaseSpellingVariants