What is new?

The Text Categorization tool 2011 version is the 5th official public release. It is developed in pure Java and can handle UTF-8. Below are some specifications of this tool.


  • Upgraded to Java
  • Upgraded to HSqlDb 2.0.0
  • Added scripts for command line tools


  • Used MEDLINE.2011 for citations created in 2008, 2009, and 2010
  • Used Metathesaurus.2010AB
  • Used lsi2011.xml
  • Used the latest data set for JDI, STI, and STRI
  • Updated the default value of Mac. normalized count
  • Compatible with the following data sets:
    • tcData.2010
    • tcData.2009
    • tcData.2008
    • tcData.2007


  • Added new features in StWsd to accept ST abbreviations and TUIs as ST candidates