The MEDLINE N-gram Set
The MEDLINE n-gram set contains n-grams (n = 1 to 5) retrieved from the annual MEDLINE/PubMed Baseline. The data cover n-grams from MEDLINE titles and abstracts, together with their associated word counts. A distilled n-gram set, with invalid words and multiwords filtered out, is also included. The MEDLINE N-gram Set is freely available subject to these terms and conditions.
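
To make the idea of n-grams with counts concrete, the following is a minimal sketch of how 1- to 5-grams could be counted over a handful of title or abstract strings. It is illustrative only and is not the procedure used to build the MEDLINE N-gram Set: the function name `count_ngrams`, the lowercasing, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter


def count_ngrams(texts, max_n=5):
    """Count word n-grams (n = 1..max_n) across an iterable of strings.

    Hypothetical helper for illustration; the tokenization here
    (lowercase, split on whitespace) is an assumption and may differ
    from how the actual n-gram set was produced.
    """
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        for n in range(1, max_n + 1):
            # Slide a window of n tokens across the text.
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    sample = ["Heart attack risk", "heart attack"]
    for ngram, freq in sorted(count_ngrams(sample).items()):
        print(f"{freq}\t{ngram}")
```

A single `Counter` keyed by the n-gram string keeps the sketch short; the length of each n-gram can be recovered by counting its space-separated tokens.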