WSD Test Results

In this section, we try to improve precision on the WSD test by refining the StWsd algorithm. The approach and test results are discussed below.

I. Test Suite Setup
In the previous section, we found a good set of St-Documents by applying weighted frequency, prioritizing ST-Groups, and applying STRI filter rules. In this test, we use the best three sets of St-Documents as the test data:

  • frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1
  • frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2
  • frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3

The test suite setup is the same as before, with these additional options:
  • Added an option to use ambiguous sentences
  • Added an option to use the WC score
  • Added an option to use the ES score

II. Approach

  • Ambiguous Sentences Option
    First, we tried the ambiguous sentences option for the WSD test. As expected, the precision on the target sentence is the same with or without the ambiguous sentences option. The reason is that a target sentence:
    • is one sentence
    • contains at least one ambiguous variant

    Accordingly, the input text does not change when the ambiguous sentences option is applied to the target sentence, which leads to the same precision on the WSD test. However, the average precision of the WSD test on the entire citation improves from 78.10% to 78.64% when the ambiguous sentences option is applied. With this option, the average precision for the target sentence and for the entire citation are both 78.64%, while the best single precision reaches 78.84% on the entire citation. From these observations, we conclude:
    • On average, precision does not improve when the input contains more sentences (both average precisions are 78.64%). In other words, only the target sentence should be used as the input for WSD applications.
    • A higher precision (78.84%) can be reached when more sentences are used, but this depends on the input (test data).
    • The ambiguous sentences option should be used to improve precision when the input is the entire citation (a paragraph or multiple sentences).

                                                            Target Sentence                                     Entire Citation
    St-Document \ Score                                     DC, Original Input        DC, Ambiguous Sentences   DC, Original Input        DC, Ambiguous Sentences
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1  78.60%                    78.60%                    78.06%                    78.84%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2  78.60%                    78.60%                    78.17%                    78.59%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3  78.71%                    78.71%                    78.08%                    78.50%
    Avg. precision of above three St-Documents              78.64%                    78.64%                    78.10%                    78.64%
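
    As a minimal sketch of the ambiguous sentences option, assuming it keeps only the sentences that contain at least one ambiguous variant (which is consistent with the target sentence being left unchanged), the filtering step might look like the following. The function names, the naive sentence splitter, and the single-word matching are illustrative assumptions, not the StWsd implementation.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ambiguous_sentences(text, ambiguous_variants):
    """Keep only the sentences that mention at least one ambiguous variant.

    Assumption: variants are single words matched case-insensitively.
    """
    variants = {v.lower() for v in ambiguous_variants}
    kept = []
    for sentence in split_sentences(text):
        tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if tokens & variants:
            kept.append(sentence)
    return " ".join(kept)


# Hypothetical input: a three-sentence citation and its one-sentence target.
citation = "The patient had a cold. The weather was mild. Cold storage was used."
target = "The patient had a cold."

filtered_citation = ambiguous_sentences(citation, ["cold"])
filtered_target = ambiguous_sentences(target, ["cold"])
```

    With this sketch, a one-sentence target passes through unchanged, while the citation loses the sentence that mentions no ambiguous variant.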

  • STI or STRI
    As discussed before, both STI and STRI can be used for StWsd. Which is better? We ran tests on both and found that STI always performs better, with higher precision. The test results are shown in the following table:

                                                            Target Sentence     Entire Citation     Ambiguous Sentences (Entire Citation)
    St-Document \ Score                                     STI-DC    STRI-DC   STI-DC    STRI-DC   STI-DC    STRI-DC
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1  78.60%    77.92%    78.06%    77.38%    78.84%    77.99%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2  78.60%    77.78%    78.17%    77.74%    78.59%    77.95%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3  78.71%    77.89%    78.08%    77.54%    78.50%    77.89%
    Avg. precision of above three St-Documents              78.64%    77.86%    78.10%    77.55%    78.64%    77.94%

    We also ran many other tests comparing STI and STRI with different St-Documents and score systems; all results show that STI is a better method than STRI for WSD applications.

  • DC or WC
    The other issue we need to address is which score system to use: the Document Count (DC) score or the Word Count (WC) score. Again, we ran the WSD tests with STI on the best three St-Documents. The results, shown in the following table, indicate:
    • WC has higher precision than DC in all nine comparisons, although the differences are small (0.05 to 0.55 percentage points)
    • The best precision (79.05%) is reached with the WC score when the ambiguous sentences option is used on the entire citation

                                                            Target Sentence     Entire Citation     Ambiguous Sentences (Entire Citation)
    St-Document \ Score                                     DC        WC        DC        WC        DC        WC
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1  78.60%    78.65%    78.06%    78.58%    78.84%    78.89%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2  78.60%    78.78%    78.17%    78.44%    78.59%    78.86%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3  78.71%    78.88%    78.08%    78.50%    78.50%    79.05%
    Avg. precision of above three St-Documents              78.64%    78.77%    78.10%    78.51%    78.64%    78.93%

  • Expert Score (ES) System
    From the above observation, should we conclude that WC should always be used over DC to get better precision? We looked into this question, and our observations are:
    • The average precisions of DC and WC are all above 78% and below 79%. In other words, both perform well and differ very little.
    • We checked each instance of each ambiguous word and found that WC and DC pick the same ST (sense) most of the time. This is why their precisions are high and very close.
    • For instances where WC and DC pick different STs (senses), the correct answer usually corresponds to the one with the higher relative score. A relative score is defined as the difference between the scores of St-Candidates. Please note that both WC and DC scores are numbers between 0.0 and 1.0 (the result of a cosine coefficient).
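
    To illustrate why such scores fall between 0.0 and 1.0, here is a minimal sketch of a cosine coefficient over two non-negative count vectors. The source only states that the scores result from a cosine coefficient; the vectors and the function name below are hypothetical.

```python
import math


def cosine_coefficient(a, b):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros. For non-negative
    count vectors the result always lies in [0.0, 1.0].
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical count vectors over the same four context words.
input_counts = [1, 0, 2, 1]
candidate_counts = [2, 0, 1, 0]
score = cosine_coefficient(input_counts, candidate_counts)
print(f"score = {score:.4f}")
```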

    From these observations, we derive an algorithm for a new score system, ES, as follows:

    • Use the best St-Candidate when DC and WC pick the same best St-Candidate
    • Use the ST with the higher relative score when DC and WC pick different best St-Candidates
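
    The two rules above can be sketched as follows. This assumes that the relative score of a score system is the gap between its best and second-best St-Candidate scores, which is one reading of the definition given earlier; the function names, candidate labels, and tie-breaking choice are hypothetical.

```python
def best_candidate(scores):
    """Return the St-Candidate with the highest score."""
    return max(scores, key=scores.get)


def relative_score(scores):
    """Gap between the best and second-best scores (assumed definition)."""
    ranked = sorted(scores.values(), reverse=True)
    if not ranked:
        return 0.0
    if len(ranked) == 1:
        return ranked[0]
    return ranked[0] - ranked[1]


def expert_score_pick(dc_scores, wc_scores):
    """Pick a sense from DC and WC scores using the two ES rules."""
    dc_best = best_candidate(dc_scores)
    wc_best = best_candidate(wc_scores)
    # Rule 1: both score systems agree on the best St-Candidate.
    if dc_best == wc_best:
        return dc_best
    # Rule 2: they disagree, so trust the system with the larger gap.
    # Ties go to WC here; that choice is arbitrary in this sketch.
    if relative_score(dc_scores) > relative_score(wc_scores):
        return dc_best
    return wc_best


# Hypothetical scores for two St-Candidates of one ambiguous word.
dc = {"ST_A": 0.40, "ST_B": 0.38}  # DC prefers ST_A by a narrow gap
wc = {"ST_A": 0.30, "ST_B": 0.55}  # WC prefers ST_B by a wide gap
picked = expert_score_pick(dc, wc)
print(picked)
```

    In this example DC and WC disagree, and the WC gap (0.25) is larger than the DC gap (0.02), so the sketch follows WC.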

    We tested the ES score on the best three sets of St-Documents and found:
    • The best precision, 79.08%, is reached by the ES score on the target sentence with St-Documents: frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2
    • The average precision of ES on the target sentence (79.06%) is the best of all configurations
    • For ES, the average precision with the ambiguous sentences option (78.79%) is higher than without it (78.35%) when the input is a paragraph (entire citation)

                                                            Target Sentence               Entire Citation               Ambiguous Sentences (Entire Citation)
    St-Document \ Score                                     DC        WC        ES        DC        WC        ES        DC        WC        ES
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1  78.60%    78.65%    79.06%    78.06%    78.58%    78.32%    78.84%    78.89%    78.71%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2  78.60%    78.78%    79.08%    78.17%    78.44%    78.50%    78.59%    78.86%    78.85%
    frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3  78.71%    78.88%    79.05%    78.08%    78.50%    78.22%    78.50%    79.05%    78.82%
    Avg. precision of above three St-Documents              78.64%    78.77%    79.06%    78.10%    78.51%    78.35%    78.64%    78.93%    78.79%
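
    As a quick arithmetic check, the averages in the last row can be recomputed from the three St-Document rows. The sketch below uses only the values in the table; the variable names are illustrative.

```python
# Precision (%) per configuration, copied from the table above.
columns = [
    "Target Sentence / DC", "Target Sentence / WC", "Target Sentence / ES",
    "Entire Citation / DC", "Entire Citation / WC", "Entire Citation / ES",
    "Ambiguous Sentences / DC", "Ambiguous Sentences / WC", "Ambiguous Sentences / ES",
]
rows = [
    [78.60, 78.65, 79.06, 78.06, 78.58, 78.32, 78.84, 78.89, 78.71],  # mStGroups: Top 1
    [78.60, 78.78, 79.08, 78.17, 78.44, 78.50, 78.59, 78.86, 78.85],  # mStGroups: Top 2
    [78.71, 78.88, 79.05, 78.08, 78.50, 78.22, 78.50, 79.05, 78.82],  # mStGroups: Top 3
]

# Average each column over the three St-Documents, rounded to two decimals.
averages = {
    name: round(sum(row[i] for row in rows) / len(rows), 2)
    for i, name in enumerate(columns)
}
best_config = max(averages, key=averages.get)

for name, value in averages.items():
    print(f"{name}: {value:.2f}%")
print("Best average:", best_config)
```

    The recomputed averages match the last row of the table, and the highest average is the ES score on the target sentence.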

III. Conclusion

We conclude that, to get the best precision from StWsd, users should:

  • Use the target sentence with the ES score option
  • Use the ambiguous sentences option when the input is a paragraph. Both the WC and ES scores are good choices.
  • Use STI instead of STRI