Normalize Unicode with Synonym Option
- Short Description: Normalize Unicode characters of the input term to pure ASCII with synonym option.
- Full Description:
- Difference:
- Features:
- Get Unicode synonyms
- Unicode core norm, recursively perform:
- Map Unicode symbols and punctuation to ASCII
- Map Unicode to ASCII
- Split ligatures
- Strip diacritics
- Get Unicode symbol name if the character is not ASCII
- Symbol: q6
- Examples:
This flow normalizes the characters of the input term to pure ASCII with the synonym option. It first gets Unicode synonyms, then applies Unicode core norm, and finally gets the Unicode symbol names for any characters that are still not ASCII. This flow is equivalent to the combined flow options -f:q4:q7:q3. Please refer to the design documents of Normalize Unicode characters to ASCII with synonym option for details.
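The three stages described above can be sketched in Python as follows. This is only an illustration of the order of operations, not the tool's implementation: the mapping tables are tiny, hypothetical stand-ins for the real data, the function names are invented for this sketch, and the `![NAME]!` wrapper simply mimics the example output shown on this page.

```python
import unicodedata

# Illustrative tables only (assumptions); the real tool ships its own data.
SYNONYMS = {"\u2122": "TM"}                                # hypothetical synonym entry
SYMBOL_MAP = {"\u2013": "-", "\u201c": '"', "\u201d": '"'}  # symbols and punctuation
CHAR_MAP = {"\u00d8": "O", "\u00f8": "o"}                   # other Unicode to ASCII
LIGATURES = {"\u00e6": "ae", "\u00c6": "AE", "\ufb01": "fi"}


def get_synonyms(term):
    """Stage 1 (in the spirit of -f:q4): replace characters that have a synonym."""
    return "".join(SYNONYMS.get(ch, ch) for ch in term)


def strip_diacritics(ch):
    """Decompose one character (NFD) and drop its combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def core_norm_char(ch):
    """Stage 2 (in the spirit of -f:q7): normalize one character recursively."""
    if ord(ch) < 128:
        return ch
    for table in (SYMBOL_MAP, CHAR_MAP, LIGATURES):
        if ch in table:
            return "".join(core_norm_char(c) for c in table[ch])
    stripped = strip_diacritics(ch)
    if stripped and stripped != ch:
        return "".join(core_norm_char(c) for c in stripped)
    return ch  # nothing applies; leave it for stage 3


def symbol_names(term):
    """Stage 3 (in the spirit of -f:q3): name whatever is still not ASCII."""
    out = []
    for ch in term:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append("![" + unicodedata.name(ch, "UNKNOWN") + "]!")
    return "".join(out)


def normalize_q6(term):
    """Chain the three stages in the order q4, q7, q3."""
    term = get_synonyms(term)
    term = "".join(core_norm_char(ch) for ch in term)
    return symbol_names(term)
```

With these toy tables, `normalize_q6("Østland")` and `normalize_q6("Déjà ©1999")` reproduce the output terms in the example on this page. Inputs that depend on entries missing from the tables will differ from what the real tool returns.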
This flow has no effect on the -m option; "none" is added at the end of the output.
Uses the recursive algorithm of Unicode core norm (-f:q7) instead of the combined flows of stripping diacritics (-f:q) and splitting ligatures (-f:q2) used in previous versions.
Normalize Unicode characters of the input term to pure ASCII with synonym option:
shell> lvg -f:q6
Østland
Østland|Ostland|2047|16777215|q6|1|
Déjà ©1999
Déjà ©1999|Deja ![COPYRIGHT SIGN]!1999|2047|16777215|q6|1|
μ
μ|![MICRO SIGN]!|2047|16777215|q6|1|
More examples:
- Get Unicode synonym
- Utilize Unicode core norm
- Get Unicode symbol name if the character is not ASCII
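The output records in the example above are pipe-delimited, so they are easy to post-process. The sketch below splits one record into named fields. The field names (`input_term`, `output_term`, `categories`, `inflections`, `flow_history`, `flow_number`) are labels chosen for this sketch, assumed from the default layout of the records shown on this page; they are not identifiers taken from the tool.

```python
from typing import NamedTuple


class LvgRecord(NamedTuple):
    """One output record; field names are assumptions made for this sketch."""
    input_term: str
    output_term: str
    categories: int
    inflections: int
    flow_history: str
    flow_number: int


def parse_lvg_line(line):
    """Split a pipe-delimited record such as 'in|out|2047|16777215|q6|1|'."""
    fields = line.rstrip("\n").split("|")
    if len(fields) < 6:
        raise ValueError("expected at least 6 pipe-delimited fields: %r" % line)
    return LvgRecord(
        input_term=fields[0],
        output_term=fields[1],
        categories=int(fields[2]),
        inflections=int(fields[3]),
        flow_history=fields[4],
        flow_number=int(fields[5]),
    )
```

For example, `parse_lvg_line("Østland|Ostland|2047|16777215|q6|1|").output_term` gives `"Ostland"`. The values 2047 and 16777215 equal 2^11 - 1 and 2^24 - 1, which suggests bit masks with every bit set; the sketch keeps them as plain integers rather than interpreting them.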