Word Tokenizer Rules (requirements)

The Word Tokenizer tokenizes the TI and AB fields of citations and filters out unwanted words and characters. The following rules/requirements were derived from the training set of 2004 data.