Using Other Database - MySql

This page provides details of how to integrate and configure lexical tools with a pre-existing MySql database. An open source database, MySql, is illustrated as an example. This page can also be used as general guideline for using other database. A step by step guideline is detailed as bellows. Users may skip this page if they use the installation script to install their Lexical tools with HSqlDb.

  1. Install Lexical Tools
  2. Database Setup
  3. Load data
  4. Configure in Lexical Tools
  5. Issues


1. Install Lexical Tools

Follow the instructions on Lexical Tools Installation Instructions to complete the installation of lexical tools.

Installation

2. Database Setup

2.1 Requirements

A running database with following specifications is needed to run lexical tools with this option. Please consult to your database administrator if you have questions.

  • A existing database with enough space
  • Root account and password of database
  • Hostname and port number of database
  • JDBC driver for the database

2.2 Procedures

  • Create a user account for lvg
  • Create a database for lvg2012

Consult with your DBA if you have any questions

2.3 Example: MySql

MySql is one of the most popular open source databases. We use it as an example for this option. Bellows are required information:

  • DB: MySql V.4.1.12
    Please refer to the MySql web site to download software and follow the documents to install MySql
  • JDBC: Download mysql-connector-java-3.1.10-bin.jar from web and put it to ${LVG_DIR}/lib/jdbcDrivers/MySql/
  • root account/password: root/root_password
  • Hostname/port number: localhost/3306(default)

We also create the user account and database for lvg:

Create a user account:

  • username: lvg
  • password: lvg

Create a database:

  • database name: lvg2012

2.4 Sample Script:

You might need your DBA to help you on setting up user account and database for lvg. However, there is a shell script provided with lvg package to ease this step.

  • Edit file ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lexAccess/loadDb/MySql/db.cfg and modify values of
    • DBA_USERNAME
    • DBA_PASSWORD
    • LVG_DIR
  • Edit file ${LVG_DIR}/loadDb/bin/1.SetupMySql and modify values of
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell>1.SetupMySql <year>

Up to this point, users should have installed the MySql database, created a database in it called lvg2012, and opened an account with lvg as both username and password.

3. Load Data

3.1 DB tables and data files

There are 9 major tables to be created and loaded into database. Each of the tables below needs to be created and loaded. The directory for these data files is: ${LVG_DIR}/data/tables/*

Table NameSource Data FileTotal RecordsCreate and Load TableCheck data
Inflection${LVG_DIR}/data/tables/inflection.data1,620,148LoadInflectionCheckInflection
Synonym${LVG_DIR}/data/tables/synonyms.data5,196LoadSynonymCheckSynonym
Acronym${LVG_DIR}/data/tables/acronym.data69,384LoadAcronymCheckAcronym
ProperNoun${LVG_DIR}/data/tables/properNoun.data13,867LoadProperNounCheckProperNoun
Derivation${LVG_DIR}/data/tables/derivation.data89,950LoadDerivationCheckDerivation
Nominalization${LVG_DIR}/data/tables/nominalization.data14,445LoadNominalizationCheckNominalization
Canonical${LVG_DIR}/data/tables/canonical.data1,395,720LoadCanonCheckCanon
Fruitful${LVG_DIR}/data/tables/fruitful.data7,805,672LoadFruitfulCheckFruitful
AntiNorm${LVG_DIR}/data/tables/antiNorm.data2,079,435LoadAntiNormCheckAntiNorm

Please notes that the last two columns are Java programs used to create DB tables, load data into database, and check loaded table. User may find source codes for loading and checking data in MySql are under ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql. These Java source codes are derived from what lvg used for HSqlDb with minor modifications. Java programs of HSqlDB can be found under ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/HSqlDb. Users may write their own code for different database by modifying Java codes from either of above two directories.
Both ${LVG_DIR}/loadDb/lib/lvgLoadDb.jar and ${LVG_DIR}/lib/lvg2012dist.jar contain all Java classes for these files.

3.2 Procedures

  • Download JDBC driver and put it to ${LVG_DIR}/lib/jdbcDrivers/NameOfDatabase

  • Create tables and load data:
    As mentioned above, users need to create 9 Db tables and load data into these tables. Please refer to:
    • Lexical Tools database pages for details on all DB tables
    • ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql for Java codes of creating Db tables and loading data

  • Verify database (optional):
    It is a good idea for users to verify database after data are loaded into these DB tables.

3.3 Example: MySql

This section describes procedures of loading data into 9 DB tables in MySql database as follows:

  • Download MySql JDBC driver: mysql-connector-java-3.1.10-bin.jar and put it to ${LVG_DIR}/lib/jdbcDrivers/MySql
  • Run scripts, LoadXxx, to
    • Delete table Xxx if the table already existed in the database
    • Create a new table named Xxx in the database
    • Load data from a flat file (Xxx.data) into table Xxx
    • Checks the total number of inserted records
  • Run scripts, CheckXxx, to
    • Verify database tables (optional)

This paragraph describes steps to run java programs of LoadXxx and CheckXxx.

  • Set all variables in the database config file (${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql/db.cfg).
  • Add ${LVG_DIR}/loadDb/lib/lvgLoadDb.jar to classpath.
  • Add ${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.10-bin.jar to classpath.
  • cd ${LVG_DIR}/loadDb/sources/gov/nih/nlm/nls/lvg/loadDb/MySql/
  • java -classpath $CLASSPATH gov.nih.nlm.nls.lvg.loadDb.MySql.LoadXXX

Users may run CheckXxx (similar setup steps as above) to check and perform queries to table Xxx to make sure all tables are created and loaded correctly.

3.4 Sample Scripts

There are two scripts are provided to ease the processes of loading and checking data in the database:

  • Edit file ${LVG_DIR}/loadDb/bin/2.LoadDb and modify values of
    • LVG_DIR
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell> 2.LoadDb <year>
    2
    3
    

  • Edit file ${LVG_DIR}/loadDb/bin/3.TestDb and modify values of
    • LVG_DIR
    • JDBC
  • shell>cd ${LVG_DIR}/loadDb/bin/
  • shell> 3.TestDb
    
    

4. Configure Lexical Tools

4.1 Configuration file

The default configuration file is at ${LVG_DIR}/data/config/lvg.properties. You will need to modify the configuration file to use

  • different database
  • different database port number
  • different JDBC connector
Please refer to Configuration setup page for details.

4.2 JDBC Driver

Add your JDBC driver to CLASSPATH

4.3 Example: MySql

Use MySql as example, there are two way to set up in the configuration, as described bellows:

  1. Method 1:
    • DB_TYPE=MYSQL
    • DB_DRIVER=com.mysql.jdbc.Driver
    • DB_NAME=lvg2012
    • DB_HOST=localhost
    • DB_USERNAME=lvg
    • DB_PASSWORD=lvg

    or

  2. Method 2:
    • DB_TYPE=OTHER
    • DB_DRIVER=com.mysql.jdbc.Driver
    • JDBC_URL=jdbc:mysql://localhost/lvg2012
    • DB_USERNAME=lvg
    • DB_PASSWORD=lvg

In addition, users need to add JDBC driver to classpath. For examples, add ${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.8-bin.jar to ${LVG_DIR}/bin/norm if you want to run norm.

Now, Lexical Tools are ready to run with MySql database

4.4 JDBC & ports:

  • Use different port number in MySql:
    The default port of MySql Db is 3306. In case that different port number ( for example, 3307) is used in your MySql database, you will need to add the port number after the database host (localhost:3307).

  • Use different JDBC connector:
    Users may use any workable JDBC driver for their database. Bellows are the quick summary of using different JDBC connector.
    • Download JDBC jar file (mysql-connector-java-3.1.10-bin.jar)
    • Add the JDBC jar file in the CLASSPATH (${LVG_DIR}/lib/jdbcDrivers/MySql/mysql-connector-java-3.1.10-bin.jar)
    • Change the Driver class name (com.mysql.jdbc.Driver) in the lvg configuration file.

5. Issues

  • MySql & UTF-8
    In our test, MySql 4.1 along with MySql JDBC 3.1.8 are not 100% compatible with UTF-8. Some problems caused by prepareStatement and table creation. For the same reason, we expect similar problems might happen in other database server. Accordingly, we suggest users use Lexical tools with its default DB, HyperSonic DB.

  • MySql in PC/Win platform
    In past experience, it took more than 20 hours to load lvg data into MySql DB in PC-win platform (4.1). It only takes less than 30 min. to load lvg data into MySql DB in Linux and Unix. However, the performance on lvg operation speed is at the same level of magnitude.