In this section, we try to improve precision on the WSD test by refining the StWsd algorithm. The approach and test results are discussed below.
I. Test Suite Setup
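In the previous section, we found a good set of St-Documents by applying weighted frequency, prioritizing ST-Groups, and applying STRI filter rules. In this test, we used the three best sets of St-Documents as the test data. As the row labels of the tables in this section show, these three sets share the settings "frequency, 1StGroup: StdDev & Top 15" and differ only in the mStGroups setting (Top 1, Top 2, and Top 3):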
II. Approach
St-Document\Score | Target Sentence: DC Original Input | Target Sentence: DC Ambiguous Sentences | Entire Citation: DC Original Input | Entire Citation: DC Ambiguous Sentences |
---|---|---|---|---|
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1 | 78.60% | 78.60% | 78.06% | 78.84% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2 | 78.60% | 78.60% | 78.17% | 78.59% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3 | 78.71% | 78.71% | 78.08% | 78.50% |
Avg. precision of above three St-Documents | 78.64% | 78.64% | 78.10% | 78.64% |
St-Document\Score | Target Sentence: STI-DC | Target Sentence: STRI-DC | Entire Citation: STI-DC | Entire Citation: STRI-DC | Ambiguous Sentences (Entire Citation): STI-DC | Ambiguous Sentences (Entire Citation): STRI-DC |
---|---|---|---|---|---|---|
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1 | 78.60% | 77.92% | 78.06% | 77.38% | 78.84% | 77.99% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2 | 78.60% | 77.78% | 78.17% | 77.74% | 78.59% | 77.95% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3 | 78.71% | 77.89% | 78.08% | 77.54% | 78.50% | 77.89% |
Avg. precision of above three St-Documents | 78.64% | 77.86% | 78.10% | 77.55% | 78.64% | 77.94% |
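The comparison in the table above can be checked mechanically. The following is a minimal sketch, not part of StWsd: the names `PAIRS`, `sti_always_higher`, and `smallest_gap` are hypothetical, and the numbers are copied from the table rows. It tests whether STI-DC scores higher than STRI-DC in every row and reports the smallest gap per context.

```python
# Precision (%) copied from the STI-DC / STRI-DC table above.
# Each pair is (STI-DC, STRI-DC); rows are mStGroups Top 1, Top 2, Top 3.
# All names in this sketch are illustrative assumptions, not StWsd identifiers.
PAIRS = {
    "Target Sentence": [(78.60, 77.92), (78.60, 77.78), (78.71, 77.89)],
    "Entire Citation": [(78.06, 77.38), (78.17, 77.74), (78.08, 77.54)],
    "Ambiguous Sentences (Entire Citation)": [
        (78.84, 77.99), (78.59, 77.95), (78.50, 77.89),
    ],
}


def sti_always_higher(pairs):
    """Return True if STI-DC beats STRI-DC in every row of every context."""
    return all(sti > stri for rows in pairs.values() for sti, stri in rows)


def smallest_gap(pairs):
    """Return the smallest STI-DC minus STRI-DC gap (percentage points) per context."""
    return {
        ctx: round(min(sti - stri for sti, stri in rows), 2)
        for ctx, rows in pairs.items()
    }


if __name__ == "__main__":
    print("STI-DC higher in every row:", sti_always_higher(PAIRS))
    for ctx, gap in smallest_gap(PAIRS).items():
        print(f"{ctx}: smallest gap = {gap:.2f} points")
```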
St-Document\Score | Target Sentence: DC | Target Sentence: WC | Entire Citation: DC | Entire Citation: WC | Ambiguous Sentences (Entire Citation): DC | Ambiguous Sentences (Entire Citation): WC |
---|---|---|---|---|---|---|
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1 | 78.60% | 78.65% | 78.06% | 78.58% | 78.84% | 78.89% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2 | 78.60% | 78.78% | 78.17% | 78.44% | 78.59% | 78.86% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3 | 78.71% | 78.88% | 78.08% | 78.50% | 78.50% | 79.05% |
Avg. precision of above three St-Documents | 78.64% | 78.77% | 78.10% | 78.51% | 78.64% | 78.93% |
From this observation, we derive an algorithm for a new score system, ES, as follows:
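St-Document\Score | Target Sentence: DC | Target Sentence: WC | Target Sentence: ES | Entire Citation: DC | Entire Citation: WC | Entire Citation: ES | Ambiguous Sentences (Entire Citation): DC | Ambiguous Sentences (Entire Citation): WC | Ambiguous Sentences (Entire Citation): ES |
---|---|---|---|---|---|---|---|---|---|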
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 1 | 78.60% | 78.65% | 79.06% | 78.06% | 78.58% | 78.32% | 78.84% | 78.89% | 78.71% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 2 | 78.60% | 78.78% | 79.08% | 78.17% | 78.44% | 78.50% | 78.59% | 78.86% | 78.85% |
frequency, 1StGroup: StdDev & Top 15; mStGroups: Top 3 | 78.71% | 78.88% | 79.05% | 78.08% | 78.50% | 78.22% | 78.50% | 79.05% | 78.82% |
Avg. precision of above three St-Documents | 78.64% | 78.77% | 79.06% | 78.10% | 78.51% | 78.35% | 78.64% | 78.93% | 78.79% |
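The "Avg. precision" row of the table above can be reproduced from the three St-Document rows, and the averages can then be ranked per context. The sketch below is only an illustration under stated assumptions: the names `PRECISION`, `average_precision`, and `best_score` are hypothetical, the averaging is assumed to be a plain arithmetic mean rounded to two decimals, and the numbers are copied from the table.

```python
from statistics import mean

# Precision (%) copied from the DC / WC / ES table above.
# Each list holds the rows for mStGroups Top 1, Top 2, Top 3.
# All names in this sketch are illustrative assumptions, not StWsd identifiers.
PRECISION = {
    "Target Sentence": {
        "DC": [78.60, 78.60, 78.71],
        "WC": [78.65, 78.78, 78.88],
        "ES": [79.06, 79.08, 79.05],
    },
    "Entire Citation": {
        "DC": [78.06, 78.17, 78.08],
        "WC": [78.58, 78.44, 78.50],
        "ES": [78.32, 78.50, 78.22],
    },
    "Ambiguous Sentences (Entire Citation)": {
        "DC": [78.84, 78.59, 78.50],
        "WC": [78.89, 78.86, 79.05],
        "ES": [78.71, 78.85, 78.82],
    },
}


def average_precision(table):
    """Average the three St-Document rows for each context and score."""
    return {
        ctx: {score: round(mean(values), 2) for score, values in scores.items()}
        for ctx, scores in table.items()
    }


def best_score(averages):
    """Return the score with the highest average precision in each context."""
    return {ctx: max(scores, key=scores.get) for ctx, scores in averages.items()}


AVERAGES = average_precision(PRECISION)
BEST = best_score(AVERAGES)

if __name__ == "__main__":
    for ctx, scores in AVERAGES.items():
        print(f"{ctx}: {scores} -> best: {BEST[ctx]}")
```

Under these assumptions, the recomputed averages agree with the last row of the table, and the ranking makes it easy to see which score leads in each context.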
III. Conclusion
We conclude that to get the best precision through StWsd, users should use: