Pre-Process: Jid-Ta-Jds
- Description:
This file contains the Journal ID (JID), Journal Title (TA), and associated Journal Descriptors (JDs) taken from the List of Serials Indexed file, lsi${YEAR}.xml. It was originally maintained manually by NLM and Susanne for the 2004 training set, as a static file that Susanne provided as "jid-ta-jd.im.20031201.mod.fixed.l".
In the Java 2007 release, we derived this file from the List of Serials Indexed file, lsi2006.xml. We use lsi2007.xml for the 2008 release.
- Input:
- lsi${YEAR}.xml (List of Serials Indexed file)
- Java File & Algorithm:
- GenerateJidTaJdsFromLsi.java
- parse lsi.xml file
- Find XML tag <NlmUniqueID> for Journal ID, JID
- Find XML tag <MedlineTA> for Journal Title, TA
- Find XML tag <BroadJournalHeading> for Journal Descriptors, JDs
- Find XML tag <BroadJournalHeadingList> for the beginning of JDs
- print out information in the new format to file: jidTaJds.out
- perform unique sort on jidTaJds.out to get jidTaJds.txt
(sort -u jidTaJds.out > jidTaJds.txt)
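The steps above can be sketched roughly as follows. This is an illustration only, not the actual GenerateJidTaJdsFromLsi.java: the class name JidTaJdsSketch, the method extract, and the "|" field delimiter are hypothetical, since the real output format is not described here. The sketch also assumes that <NlmUniqueID> and <MedlineTA> appear before <BroadJournalHeadingList> within each journal record, and it uses a TreeSet in place of the separate "sort -u" step (Java string ordering may differ from the sort command's locale-dependent ordering). Records without a <BroadJournalHeadingList> are skipped.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class JidTaJdsSketch {

    // Reads lsi.xml content and returns unique, sorted "JID|TA|JD1|JD2..." lines.
    // The "|" delimiter is an assumption made for this sketch.
    public static TreeSet<String> extract(Reader in) {
        TreeSet<String> lines = new TreeSet<>();   // stands in for "sort -u"
        List<String> jds = new ArrayList<>();
        String jid = "";
        String ta = "";
        try {
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = xml.getLocalName();
                    if (tag.equals("NlmUniqueID")) {
                        jid = xml.getElementText().trim();       // Journal ID
                    } else if (tag.equals("MedlineTA")) {
                        ta = xml.getElementText().trim();        // Journal Title
                    } else if (tag.equals("BroadJournalHeadingList")) {
                        jds.clear();                             // beginning of the JDs
                    } else if (tag.equals("BroadJournalHeading")) {
                        jds.add(xml.getElementText().trim());    // one Journal Descriptor
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && xml.getLocalName().equals("BroadJournalHeadingList")) {
                    // End of the JD list: emit one line for this journal.
                    List<String> fields = new ArrayList<>();
                    fields.add(jid);
                    fields.add(ta);
                    fields.addAll(jds);
                    lines.add(String.join("|", fields));
                }
            }
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse the XML input", e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JidTaJdsSketch lsi2007.xml > jidTaJds.txt
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            for (String line : extract(in)) {
                System.out.println(line);
            }
        }
    }
}
```

Because the lines are collected in a sorted set, duplicates collapse as they are added, so the intermediate jidTaJds.out file is not needed in this variant.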
- Output File:
- jidTaJds.txt
- Notes: