Standardization on Derivational Pairs

All dPairs (derivational pairs) in Lexical Tools are bidirectional, regardless of which member is the source. In other words, if D-1 is a derivation of D-2, then D-2 is also a derivation of D-1, where D-1 is derivation 1 and D-2 is derivation 2. Note that both D-1 and D-2 must be base (uninflected) forms. It follows that the following two dPairs are identical:

  • D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2
  • D-2|CAT-2|EUI-2|D-1|CAT-1|EUI-1
Both of these dPairs might exist in the derivation file (database table), which would produce duplicates when derivations are generated. A standardized form is defined as follows to resolve this issue.
  • zeroD & suffixD:
    A standardized dPair D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2 is defined as:
    • D-1 < D-2, alphabetically
    • CAT-1 < CAT-2, alphabetically (if D-1 == D-2)
    • EUI-1 < EUI-2, alphabetically (if D-1 == D-2 and CAT-1 == CAT-2)
  • prefixD:
    A standardized dPair D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2 is defined as:
    • D-1 = prefix + base
    • D-2 = base
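
The rules above can be sketched in Python as follows. This is a minimal illustration, not the Lexical Tools implementation: the names DPair, parse, swap, standardize_zero_suffix and standardize_prefix are hypothetical, "alphabetically" is assumed to mean Python's default string ordering, and the prefix is assumed to be known to the caller.

```python
from typing import NamedTuple


class DPair(NamedTuple):
    """One derivational pair: D-1|CAT-1|EUI-1|D-2|CAT-2|EUI-2."""
    d1: str
    cat1: str
    eui1: str
    d2: str
    cat2: str
    eui2: str


def parse(line: str) -> DPair:
    """Split a pipe-delimited dPair line into its six fields."""
    return DPair(*line.strip().split("|"))


def swap(p: DPair) -> DPair:
    """Return the same dPair written in the opposite direction."""
    return DPair(p.d2, p.cat2, p.eui2, p.d1, p.cat1, p.eui1)


def standardize_zero_suffix(p: DPair) -> DPair:
    """zeroD and suffixD: order by D, then CAT, then EUI."""
    # Tuple comparison applies the tie-breakers in the listed order.
    # Assumption: "alphabetically" is Python's default string order.
    left = (p.d1, p.cat1, p.eui1)
    right = (p.d2, p.cat2, p.eui2)
    return p if left <= right else swap(p)


def standardize_prefix(p: DPair, prefix: str) -> DPair:
    """prefixD: D-1 = prefix + base, D-2 = base."""
    if p.d1 == prefix + p.d2:
        return p
    if p.d2 == prefix + p.d1:
        return swap(p)
    raise ValueError(f"not a prefix derivation for {prefix!r}: {p}")
```

Under these assumptions, both directions of a pair such as happy/happiness normalize to the same record, so a duplicate can be detected by simple equality after standardization.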

Below are some notes on the process regarding this issue:

  • Standardization is applied when the raw dPair file is generated (std-raw)
  • Manually tagged dPairs may be in either direction (tag)
  • All tagged dPairs are standardized when they are added to the meta file (meta)
  • All result dPairs are standardized (*.${YEAR})
  • Derivations in both directions are generated at retrieval time from the Lexical Tools database
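
As a rough sketch of the last note, a lookup over standardized dPairs can return derivations in both directions by checking each side of every stored pair. The function derivations_of and the in-memory list of tuples are hypothetical stand-ins for the actual database retrieval, which is not described here.

```python
from typing import Iterable, List, Tuple

Entry = Tuple[str, str, str]                  # (term, category, EUI)
StdPair = Tuple[str, str, str, str, str, str]  # D-1, CAT-1, EUI-1, D-2, CAT-2, EUI-2


def derivations_of(term: str, cat: str, std_pairs: Iterable[StdPair]) -> List[Entry]:
    """Return the other member of every pair that contains (term, cat)."""
    results: List[Entry] = []
    for d1, cat1, eui1, d2, cat2, eui2 in std_pairs:
        if (d1, cat1) == (term, cat):
            # Stored direction: D-1 -> D-2.
            results.append((d2, cat2, eui2))
        elif (d2, cat2) == (term, cat):
            # Reverse direction, generated at lookup time: D-2 -> D-1.
            results.append((d1, cat1, eui1))
    return results
```

Because each pair is stored only once in standardized form, the reverse direction costs no extra storage; it is recovered by matching on the second member.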