Pre-Process: Jid-Ta-Jds
- Description:
This file lists the Journal ID (JID), Journal title (TA), and associated Journal Descriptors (JDs) for each journal in the List of Serials Indexed file, lsi${YEAR}.xml. For the 2004 training set, the file was maintained manually by NLM and Susanne; it was static and provided by Susanne as "jid-ta-jd.im.20031201.mod.fixed.l". Starting with the Java 2007 release, the file is derived from the List of Serials Indexed file: lsi2006.xml for the 2007 release and lsi2007.xml for the 2008 release.
- Input:
- By NLM:
- ftp://ftp.nlm.nih.gov/online/journals/lsi2007.xml
- Java File & Algorithm:
- GenerateJidTaJdsFromLsi.java
- parse lsi.xml file
- Find xml tag <NlmUniqueID> for Journal ID, JID
- Find xml tag <MedlineTA> for Journal Title, TA
- Find xml tag <BroadJournalHeading> for Journal Descriptors, JDs
- Find xml tag <BroadJournalHeadingList> for the beginning of JDs
- print out information in the new format to file: jidTaJds.out
- perform unique sort on jidTaJds.out to get jidTaJds.txt (sort -u jidTaJds.out > jidTaJds.txt)
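The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual GenerateJidTaJdsFromLsi.java source. The class name JidTaJdsSketch, the "|" field separator, the sample journal data, and the assumption that <NlmUniqueID> is a direct child of each journal record element are all hypothetical. A TreeSet stands in for the separate "sort -u" step, so its ordering may differ from a locale-aware sort.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class JidTaJdsSketch {
    // Assumption: fields are joined with '|'; the real separator is not stated here.
    public static final String SEP = "|";

    /** Returns the trimmed text of the first descendant with the given tag, or "" if absent. */
    public static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    /** Builds sorted, de-duplicated lines of the form JID, TA, JD 1, JD 2, ... */
    public static Set<String> extract(InputStream lsiXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(lsiXml);
        Set<String> lines = new TreeSet<>(); // plays the role of "sort -u"
        NodeList ids = doc.getElementsByTagName("NlmUniqueID");
        for (int i = 0; i < ids.getLength(); i++) {
            // Assumption: the parent of <NlmUniqueID> is the journal record element.
            Element record = (Element) ids.item(i).getParentNode();
            StringBuilder line = new StringBuilder(ids.item(i).getTextContent().trim());
            line.append(SEP).append(firstText(record, "MedlineTA"));
            // Every <BroadJournalHeading> under the record is one Journal Descriptor.
            NodeList jds = record.getElementsByTagName("BroadJournalHeading");
            for (int j = 0; j < jds.getLength(); j++) {
                line.append(SEP).append(jds.item(j).getTextContent().trim());
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample record, for illustration only.
        String sample =
            "<SerialsSet><Serial>"
            + "<NlmUniqueID>0000001</NlmUniqueID>"
            + "<MedlineTA>Example J</MedlineTA>"
            + "<BroadJournalHeadingList>"
            + "<BroadJournalHeading>Medicine</BroadJournalHeading>"
            + "</BroadJournalHeadingList>"
            + "</Serial></SerialsSet>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        for (String line : extract(in)) {
            System.out.println(line);
        }
    }
}
```

To run the sketch against a real file, the InputStream passed to extract could be replaced with a FileInputStream opened on the downloaded lsi file.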
- Output File:
- jidTaJds.txt, used in TC.MLT
JID TA JD 1 JD 2 ...
- Notes:
- None