Sti Java

Introduction

The Sti tool uses the JDI methodology to calculate word-St scores. The calculation uses words in the MEDLINE training set and ST (semantic type) documents; an ST document (stDocument) is the set of one-word UMLS Metathesaurus strings belonging to an ST. The word-St scores for a word are calculated by comparing the JDI of the word with the JDI of each ST document. The pre-calculated word-St scores are loaded into a database. Sti takes text phrases as input and applies filters such as word extraction algorithms, stopword removal, and minimum word length. Sti then calculates the average word-St scores for all inputs and sends the ranked STs with their scores to the output.
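The filter, average, and rank steps described above can be sketched roughly as follows. This is a minimal illustration, not the Sti source code: the class name StiSketch, the stopword list, the minimum word length, and the in-memory score table (standing in for the database) are all hypothetical. It also assumes that the average is taken over the input words that have pre-calculated scores, which the description above does not specify.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and settings are assumptions, not Sti's API.
class StiSketch {
    // Hypothetical filter settings (the real stopword list and minimum length may differ).
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("the", "of", "and", "in"));
    static final int MIN_WORD_LENGTH = 2;

    // Word extraction: lowercase, split on non-alphanumerics, drop stopwords and short words.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.length() >= MIN_WORD_LENGTH && !STOPWORDS.contains(token)) {
                words.add(token);
            }
        }
        return words;
    }

    // Average the per-ST scores over the words that have an entry in the table,
    // then return the STs sorted by average score, highest first.
    static List<Map.Entry<String, Double>> rank(String text,
                                                Map<String, Map<String, Double>> wordStScores) {
        Map<String, Double> sums = new HashMap<>();
        int scoredWords = 0;
        for (String word : extractWords(text)) {
            Map<String, Double> scores = wordStScores.get(word);
            if (scores == null) {
                continue; // assumption: words without pre-calculated scores are skipped
            }
            scoredWords++;
            for (Map.Entry<String, Double> e : scores.entrySet()) {
                sums.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : sums.entrySet()) {
            ranked.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue() / scoredWords));
        }
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Made-up scores for two words and two ST abbreviations, for illustration only.
        Map<String, Map<String, Double>> table = new HashMap<>();
        Map<String, Double> aspirin = new HashMap<>();
        aspirin.put("phsu", 0.8);
        aspirin.put("dsyn", 0.1);
        Map<String, Double> headache = new HashMap<>();
        headache.put("phsu", 0.2);
        headache.put("dsyn", 0.7);
        table.put("aspirin", aspirin);
        table.put("headache", headache);

        for (Map.Entry<String, Double> e : rank("Aspirin for the headache", table)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In the real tool the score table is read from the database of pre-calculated word-St scores rather than built in memory, and the filters are configurable; the sketch only shows how filtering, averaging, and ranking fit together.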

SetUp

Follow the installation instructions to install the Text Categorization tools and run the sti program. Check the following items only if you did not use the provided script to install the Text Categorization tools.

TestRun

Input

Sti takes text as input:

Output

Sti calculates the average ST scores of the input text for both word counts and document counts and sends the top-ranked ST to the output. If the detail flag, -d, is used, the results include the rank and ST scores in the following format:

Rank | ST Scores | ST abbreviation | ST name

Sti Options

Please refer to the design document.