SPT - Synonyms Source Design

I. Introduction
Synonyms are read from one or more files and kept in the following two data structures:

  • Vector<Synonym> (synonymList_) to keep all synonym pairs
  • Hashtable<String, Vector<Integer>> (synonymIndex_) to map each word to the indexes of its entries in synonymList_

Synonyms are treated as symmetric (bi-directional), so users need to supply only one direction of each pair. The system generates the symmetric pair automatically, and duplicate synonyms are ignored. The pairs are stored in a Vector of Synonym objects, laid out as follows:

  • Even index: the original synonym
  • Odd index: the symmetrical synonym
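
The even/odd layout can be illustrated with a minimal Java sketch. The class PairLayoutSketch and its methods addPair and mirrorIndex are hypothetical names used only for this illustration, and a String[] stands in for the Synonym object. The point is that, when every original pair is immediately followed by its mirrored pair, the two entries sit at adjacent indexes that differ only in the lowest bit.

```java
import java.util.Vector;

// Hypothetical illustration of the even/odd layout; not the actual SPT source.
class PairLayoutSketch {
    // Appends the original pair and then its mirrored pair, so the original
    // lands on an even index and the symmetric pair on the next odd index.
    static void addPair(Vector<String[]> list, String word, String synonym) {
        list.add(new String[] {word, synonym});   // even index: original
        list.add(new String[] {synonym, word});   // odd index: symmetric
    }

    // Index of the mirrored entry: flip the lowest bit (0<->1, 2<->3, ...).
    static int mirrorIndex(int index) {
        return index ^ 1;
    }

    public static void main(String[] args) {
        Vector<String[]> list = new Vector<>();
        addPair(list, "dog", "canine");
        addPair(list, "cat", "feline");
        String[] mirrored = list.get(mirrorIndex(2));
        System.out.println(mirrored[0] + "|" + mirrored[1]);  // prints feline|cat
    }
}
```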

II. Algorithm

  • Read each line (a synonym pair in the form word=synonym) from the synonym file
  • Check whether the synonym pair already exists:
    • Check whether word exists in synonymIndex_
      • If not => the pair does not exist
      • If yes, check whether synonym appears in any synonymList_ entry at the indexes stored for word in synonymIndex_
        • If not => the pair does not exist
        • If yes => the pair exists
  • If the synonym pair does not exist:
    • Add word|synonym to synonymList_
    • Add synonym|word to synonymList_
    • Add the index of the new word|synonym entry to the entry for word in synonymIndex_
    • Add the index of the new synonym|word entry to the entry for synonym in synonymIndex_
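
The steps above can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the actual Synonyms.java: the class names SynonymPair and SynonymsSketch and the methods exists, addLine, and addEntry are hypothetical, only synonymList_ and synonymIndex_ come from this design, and the sketch takes already-read lines rather than opening a file. Skipping lines without "=" is also an assumption.

```java
import java.util.Hashtable;
import java.util.Vector;

// Minimal stand-in for the synonym pair object (word|synonym).
class SynonymPair {
    final String word;
    final String synonym;

    SynonymPair(String word, String synonym) {
        this.word = word;
        this.synonym = synonym;
    }
}

// Hypothetical sketch of the loading steps; not the actual SPT source.
class SynonymsSketch {
    final Vector<SynonymPair> synonymList_ = new Vector<>();
    final Hashtable<String, Vector<Integer>> synonymIndex_ = new Hashtable<>();

    // Returns true if word|synonym is already stored.
    boolean exists(String word, String synonym) {
        Vector<Integer> indexes = synonymIndex_.get(word);
        if (indexes == null) {
            return false;  // word is not indexed => pair does not exist
        }
        for (int i : indexes) {
            if (synonymList_.get(i).synonym.equals(synonym)) {
                return true;  // word is indexed and synonym matches
            }
        }
        return false;
    }

    // Processes one "word=synonym" line; returns false if the line is ignored.
    boolean addLine(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            return false;  // assumption: lines without '=' are skipped
        }
        String word = line.substring(0, eq).trim();
        String synonym = line.substring(eq + 1).trim();
        if (exists(word, synonym)) {
            return false;  // duplicate pair is ignored
        }
        addEntry(word, synonym);  // original pair
        addEntry(synonym, word);  // symmetric pair
        return true;
    }

    // Appends one entry to the list and records its index under its word.
    private void addEntry(String word, String synonym) {
        synonymList_.add(new SynonymPair(word, synonym));
        int index = synonymList_.size() - 1;
        synonymIndex_.computeIfAbsent(word, k -> new Vector<>()).add(index);
    }

    public static void main(String[] args) {
        SynonymsSketch s = new SynonymsSketch();
        s.addLine("dog=canine");
        s.addLine("cat=feline");
        s.addLine("canine=mutt");
        System.out.println(s.synonymIndex_.get("canine"));  // prints [1, 4]
    }
}
```

Because the symmetric pair is stored alongside the original, a later line such as canine=dog is detected as a duplicate of dog=canine by the same existence check.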

III. Java Classes

  • Synonym.java: a Java object class for a synonym pair: word|synonym
  • Synonyms.java: a Java class that loads synonyms from file(s) and keeps them in synonymList_ and synonymIndex_

IV. Example

Inputs:

  • dog=canine
  • cat=feline
  • canine=mutt

Vector<Synonym> synonymList_

  index   word     synonym
  -----   ------   -------
  0       dog      canine
  1       canine   dog
  2       cat      feline
  3       feline   cat
  4       canine   mutt
  5       mutt     canine

Hashtable<String, Vector<Integer>> synonymIndex_

  key (word)   values (Vector<Integer>)
  ----------   ------------------------
  dog          [0]
  canine       [1, 4]
  cat          [2]
  feline       [3]
  mutt         [5]
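
One way these two tables could be used together is to retrieve all synonyms of a word. This design does not describe lookups, so the following Java sketch is only an assumption about how the index might be read. The class SynonymLookupSketch and its methods exampleList, buildIndex, and synonymsOf are hypothetical, and a String[] of {word, synonym} stands in for the Synonym object.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import java.util.Vector;

// Hypothetical lookup over the example tables; not part of the described classes.
class SynonymLookupSketch {
    // The six example entries, in the order shown for synonymList_.
    static Vector<String[]> exampleList() {
        String[][] rows = {
            {"dog", "canine"}, {"canine", "dog"},
            {"cat", "feline"}, {"feline", "cat"},
            {"canine", "mutt"}, {"mutt", "canine"}
        };
        return new Vector<>(Arrays.asList(rows));
    }

    // Rebuilds the word -> indexes table from the list.
    static Hashtable<String, Vector<Integer>> buildIndex(Vector<String[]> list) {
        Hashtable<String, Vector<Integer>> index = new Hashtable<>();
        for (int i = 0; i < list.size(); i++) {
            index.computeIfAbsent(list.get(i)[0], k -> new Vector<>()).add(i);
        }
        return index;
    }

    // Collects the synonym of every list entry indexed under the given word.
    static List<String> synonymsOf(String word, Vector<String[]> list,
                                   Hashtable<String, Vector<Integer>> index) {
        Vector<String> result = new Vector<>();
        Vector<Integer> indexes = index.get(word);
        if (indexes != null) {
            for (int i : indexes) {
                result.add(list.get(i)[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Vector<String[]> list = exampleList();
        Hashtable<String, Vector<Integer>> index = buildIndex(list);
        System.out.println(synonymsOf("canine", list, index));  // prints [dog, mutt]
    }
}
```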