UMLS-Core: Process

I. SMT Package files
Use smt to get the CUI mapping results. Make sure the following files are up to date; see STMT preProcess for details:

  • Metathesaurus
    • nonSuppressCui.data (for nonSuppress|CUI mapping)
    • normTermCui.data (for normTerm|CUI mapping)
    • preferredTerms.data (for preferred term mapping)
  • UMLS-Core Synonym files
    • normTermSynonyms.data (corpus)

II. Input file

  • Convert the original input file to the following location and format:
    • Location: ${UMLS_CORE_DIR}/data/${DATA_YEAR}/${DATA_NAME}/inputs/inTerms.data
    • Format:
      Field 1 | Field 2
      Term ID | Term

    • Methods:
      • Convert the file to pipe ("|") separated fields if the original file is *.csv
        shell> cd ${UMLS_CORE_DIR}/bin
        shell> ConvertCsvToPipe
        ...
      • Retrieve the term ID and term
        flds inTerm.allFields > inTerm.data.1.4
      • Manually remove the banner (first line) of the file
        inTerm.data
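
The retrieval and banner-removal steps above can be sketched with standard tools. This is only an illustration, not the flds-based procedure itself: it assumes the file is already pipe-separated, that the term ID is in field 1 and the term in field 4 (suggested by the ".1.4" suffix, but not confirmed), and that the banner is exactly one line. The function names extract_terms and check_terms are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions not stated in the original process:
#   - the input is already pipe-separated,
#   - field 1 holds the term ID and field 4 holds the term,
#   - the first line is the banner.

# extract_terms FILE: drop the banner line, keep fields 1 and 4.
extract_terms() {
    tail -n +2 "$1" | cut -d'|' -f1,4
}

# check_terms FILE: succeed only if every line has exactly two
# pipe-separated, non-empty fields (Term ID|Term).
check_terms() {
    awk -F'|' 'NF != 2 || $1 == "" || $2 == "" { bad = 1 } END { exit bad + 0 }' "$1"
}
```

For example, `extract_terms inTerm.allFields > inTerm.data` followed by `check_terms inTerm.data` would produce the two-field file and confirm that it matches the Term ID|Term format before it is used as input.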

III. Process: Run UMLS-Core CUI mapping
shell> cd ${UMLS_CORE_DIR}/bin
shell> UmlsCore
...
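
Before running the mapping, it may help to confirm that the input file is in place. The sketch below builds the input path from the location given in section II and checks that the file exists and is not empty. The helper names input_path and input_ready are hypothetical and are not part of UMLS-Core.

```shell
#!/bin/sh
# Sketch only: input_path and input_ready are illustrative helpers,
# not part of the UMLS-Core package.

# input_path ROOT YEAR NAME: print the expected input file location.
input_path() {
    printf '%s/data/%s/%s/inputs/inTerms.data\n' "$1" "$2" "$3"
}

# input_ready ROOT YEAR NAME: succeed only if the input file exists
# and is not empty.
input_ready() {
    [ -s "$(input_path "$1" "$2" "$3")" ]
}
```

A call such as `input_ready "${UMLS_CORE_DIR}" "${DATA_YEAR}" "${DATA_NAME}"` returns a non-zero status when the file is missing or empty, so it can guard the UmlsCore run.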

IV. Output files

  • Location: ${UMLS_CORE_DIR}/data/${DATA_YEAR}/${DATA_NAME}/outputs/
    • log.${DATA_NAME}
      Detailed log and statistics for all synonym substitution mappings
    • out.${DATA_NAME}
      Final results in the following format:
      Term ID | Term | Norm Term | CUI | Preferred Term | Synonym substitution flag
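
A quick way to summarize the final results is to tally the sixth field (the synonym substitution flag). The sketch below assumes the six output fields are pipe-separated, which is not stated above, and makes no assumption about which flag values occur. The function name count_flags is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes out.${DATA_NAME} is pipe-separated with the
# synonym substitution flag in field 6.

# count_flags FILE: print "flag|count" for each distinct flag value,
# sorted by flag.
count_flags() {
    awk -F'|' '{ n[$6]++ } END { for (f in n) print f "|" n[f] }' "$1" | sort
}
```

Running `count_flags out.${DATA_NAME}` prints one line per flag value, which can be compared against the statistics in log.${DATA_NAME}.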