Spelling Variants Model
Introduction
From the results of the non-lead-end words filters, we observed that if a term has one or more associated spelling variants in the same corpus (the MEDLINE n-Gram set), it is likely a valid multiword. We therefore developed spelling variant pattern algorithms to identify groups of spelling variants among the n-Grams. These groups can be used as matchers to retrieve valid multiwords from the n-Grams.
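The matcher idea can be illustrated with a minimal sketch. It is not the SpVar pattern algorithm itself: the `RULES` list, the function names, and the sample n-grams are all hypothetical, and the rules are just a few well-known British/American suffix alternations chosen for illustration. Naive suffix rules like these over-generate (for example, "four" would be rewritten to "for"), so a real implementation would need more careful patterns.

```python
from collections import defaultdict

# Hypothetical British -> American suffix rules, for illustration only.
# These are NOT the patterns used by the actual SpVar algorithm.
RULES = [
    ("aemia", "emia"),
    ("oedema", "edema"),
    ("our", "or"),
    ("isation", "ization"),
    ("tre", "ter"),
]


def normalize(term):
    """Map a term to a canonical key by applying the first matching rule to each word."""
    words = []
    for word in term.lower().split():
        for british, american in RULES:
            if word.endswith(british):
                word = word[: -len(british)] + american
                break
        words.append(word)
    return " ".join(words)


def spvar_groups(ngrams):
    """Group n-grams sharing a canonical key; keep only groups with 2+ members."""
    groups = defaultdict(set)
    for term in ngrams:
        groups[normalize(term)].add(term)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def match_multiwords(ngrams):
    """Return multiword n-grams that have a spelling variant in the same set."""
    return sorted(
        term
        for group in spvar_groups(ngrams)
        for term in group
        if " " in term  # single words are not multiwords
    )


# Made-up sample n-grams, not taken from the MEDLINE n-Gram set.
sample = [
    "tumour necrosis factor",
    "tumor necrosis factor",
    "iron deficiency anaemia",
    "iron deficiency anemia",
    "of the patient",
    "colour",
    "color",
]
matched = match_multiwords(sample)
```

In this sketch, "of the patient" is dropped because no variant of it occurs in the sample, and "colour"/"color" form a variant group but are excluded from the result because they are single words.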
Definition and types
By definition, spelling variants (spVars) are terms that have the same meaning, category (POS), pronunciation, and syntax, but different spellings. They are the differing spellings found in American and British English, for example "color" and "colour".
SpVar Pattern Algorithm
SpVar Pattern Test
- SpVar Test on LRSPL (Lexicon SpVar)
- SpVar Test on inflVars (Lexicon)
- GoldStd from Lexicon
- For Amia.2016 Initial submission
- For Amia.2016 Final submission
- Phonetic algorithm tests
- GoldStd from Lexicon
- Remove parent terms (TBD)
Applications