mlt
The mlt tool tokenizes fields from MEDLINE citations. The Title, Abstract, and MH (starred MHs and SHs only) fields, and combinations of these, are routinely tokenized and extracted from a MEDLINE citation. Other fields may be specified for tokenization as well.
Follow the installation instructions to install the Text Categorization tools and run the mlt program. Check the following items only if you did not use the provided script to install the Text Categorization tools.
- CLASSPATH:
  - Include the Text Categorization tools distribution jar file, ${TC_DIR}/lib/tc2011dist.jar, in your CLASSPATH.
  - Include the TC top directory in your CLASSPATH.
- Configuration File: assign the full path of the top directory of tc2011 to a variable named ROOT_DIR in the configuration file, data/Config/tc.properties.
- Run the Java program
Enter the command:
> mlt -h
Synopsis:
  mlt [options]
Description:
  mlt is a program to tokenize MEDLINE citations by specifying field tags
Options:
  -ci       Show configuration information
  -h        Print program help information (this is it)
  -i:STR    Specify input file (must specify)
  -pmid     Preserve PMID in the first field
  -s        Sort output by PMID
  -o:STR    Specify output file (must specify)
  -t:STR    Specify MEDLINE field tag:TI|AB|TIAB|MHs|TIABMHs|ALL (must specify)
  -v        Print the current version of mlt
  -x:STR    Specify an alternate configuration file
where:
- mlt: the script that runs the Mlt Java class
- -h: the mlt option that displays help information
Three inputs must be specified when running mlt:
- input File: MEDLINE citations
- output File: where the results go
- field tag:
  - TI: title
  - AB: abstract
  - TIAB: title and abstract
  - MHs: starred MeSHs
  - TIABMHs: title, abstract, and starred MeSHs
  - ALL: all fields (the output should be the same as the input)
  - Any legal tag in MEDLINE citations
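As an illustration of the manual setup and the three required inputs, a minimal sketch of a typical run might look like the following. The file names and the default location of TC_DIR are hypothetical, the colon path separator assumes a Unix-like shell, and the script assumes that the mlt script is on your PATH. Only options listed in the help output above are used.

```shell
# Assumption: TC_DIR is the top directory of the tc2011 installation (hypothetical default path).
TC_DIR="${TC_DIR:-$HOME/tc2011}"

# Manual setup only: put the distribution jar and the TC top directory on the CLASSPATH.
export CLASSPATH="${TC_DIR}/lib/tc2011dist.jar:${TC_DIR}${CLASSPATH:+:${CLASSPATH}}"

# Hypothetical input and output file names.
INPUT="citations.txt"
OUTPUT="citations.tiab.txt"

# The three required options: input file (-i), output file (-o), and field tag (-t).
MLT_CMD="mlt -i:${INPUT} -o:${OUTPUT} -t:TIAB"
echo "${MLT_CMD}"

# Run the command only if the mlt script is actually on the PATH.
if command -v mlt >/dev/null 2>&1; then
  ${MLT_CMD}
fi
```

Replacing TIAB with another tag from the list above, such as TI or MHs, selects a different set of fields. Adding -pmid or -s should preserve the PMID in the first field or sort the output by PMID, as described in the help text.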
Each field is written to the output, and fields are separated by a line separator.
Please refer to the design document.