What is new
The 2009 version of the Text Categorization tool is the 3rd official public release. It was developed in pure Java and is capable of handling UTF-8. Below are some specifications of this tool.
System
- Upgraded to Java 1.6.0.13
- Upgraded to HSqlDb 1.8.0.10
- Provides scripts for command line tools
Data
- Used MEDLINE.2009 for citations created in 2006, 2007, and 2008
- Used Metathesaurus.2008AB
- Used lsi2009.xml
- Used a newly designed algorithm to generate optimum stDocuments
- Used the latest data set for JDI, STI, and STRI
- Updated the default value of Mac. normalized count
- Provides complete data sets of tcData.2007 and tcData.2008 to run on TC.2009
Java APIs/Tools/Classes
- New tool: StWsd
- New Java APIs for StWsd
Tool Options
- New options to run a specified data set (TC.2007 or TC.2008) in Jdi, Sti, Stri, and StWsd
- New options for specifying score types in StWsd
- New options for specifying STI or STRI in StWsd
- New options for using ambiguous sentences in StWsd
- New options for showing details of ST scores in StWsd