BulkLoader Class

DEPRECATED

com.bea.p13n.content.document.ref.loader

public class BulkLoader

    extends Object
    implements FilenameFilter

The reference document repository bulk loader application.

This class is designed mainly to run as a command-line application, via a "java com.bea.p13n.content.document.ref.loader.BulkLoader" command line. To see usage information, pass the -h flag or read Usage.txt in this package.

Additionally, BulkLoader objects can be created and used to provide the functionality in other places. The lifecycle of a BulkLoader is as follows:

The parseArgs(), validateArgs(), and initialize() methods must be called first, in that order; otherwise the BulkLoader object will most likely fail ungracefully. Once those methods have been invoked, the utility methods (such as go(), doLoad(), printSchema(), deleteDoc(), and insertDoc()) can be invoked as needed. The commit() and rollback() methods control the BulkLoader's transaction with the database. When finished, be sure to call the shutdown() method to release internal resources.
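
The call order described above can be sketched as follows. BulkLoader itself is not reproduced here, so the sketch uses a hypothetical LoaderLifecycle interface whose method names mirror the ones documented on this page. The LifecycleSketch.run helper, the exit-code convention, and the decision to call shutdown() after a failed initialize() are illustrative assumptions, not the class's actual behavior.

```java
import java.io.IOException;
import java.sql.SQLException;

// Hypothetical stand-in for BulkLoader; only the method names come from this page.
interface LoaderLifecycle {
    void parseArgs(String[] args);
    void validateArgs();
    void initialize() throws SQLException;
    void go() throws SQLException, IOException;
    void commit();
    void rollback();
    void shutdown();
}

final class LifecycleSketch {
    private LifecycleSketch() {}

    // Drives a loader through the documented call order.
    // Returns 0 on success and 1 on failure (an illustrative convention).
    static int run(LoaderLifecycle loader, String[] args) {
        boolean initialized = false;
        try {
            loader.parseArgs(args);
            loader.validateArgs();
            loader.initialize();
            initialized = true;
            loader.go();
            loader.commit();
            return 0;
        } catch (SQLException | IOException | RuntimeException e) {
            // Only roll back once initialization has succeeded.
            if (initialized) {
                loader.rollback();
            }
            return 1;
        } finally {
            // Assumption: shutdown() is safe to call even if initialize() failed.
            loader.shutdown();
        }
    }
}
```

Placing shutdown() in a finally block is one way to make sure internal resources are released on every path.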

If you construct and use a BulkLoader object manually, be certain to synchronize all access to the object. Because the class was designed for the single-threaded command-line program, BulkLoader objects are not thread-safe.

To load the default LoaderFilters, the BulkLoader looks for the com/bea/p13n/content/document/loader/loader.properties file in the CLASSPATH. From that file it reads the list of default LoaderFilter class names in the "loader.defFilters" property. To disable all of the default filters, specify +filters in the command-line arguments.
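
As a rough illustration of reading the filter list, the sketch below loads a properties file and splits the "loader.defFilters" value into class names. The separator is not specified on this page, so the sketch assumes commas and/or whitespace. DefaultFilterNamesSketch and readDefaultFilterNames are hypothetical names, not part of BulkLoader.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class DefaultFilterNamesSketch {
    static final String DEF_FILTERS_KEY = "loader.defFilters";

    private DefaultFilterNamesSketch() {}

    // Reads the properties text and returns the configured filter class names.
    // Assumption: names are separated by commas and/or whitespace.
    static List<String> readDefaultFilterNames(Reader propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(propertiesText);
        List<String> names = new ArrayList<>();
        String raw = props.getProperty(DEF_FILTERS_KEY, "");
        for (String name : raw.split("[,\\s]+")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

In a real program the Reader would typically wrap a stream obtained from ClassLoader.getResourceAsStream() for the properties path named above.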

Related Topics

MetaParser
FileCache


Hierarchy
Object
  BulkLoader
All Implemented Interfaces

FilenameFilter

Nested Class Summary

public static class com.bea.p13n.content.document.ref.loader.BulkLoader.ShowUsageException
           Quick inner exception thrown by parseArgs() to signal that we should just print a usage report.

Field Summary

public Map
addlColumnMap
The map of additional DOCUMENT table column name to collection of property names that map onto the column name.
public Collection
addlColumnNames
The list of additional DOCUMENT table column names.
public int
commitAfter
The number of documents loaded after which this should commit (0 or less to commit only at the end).
protected Connection
connection
The connection this loader uses.
public String
conPoolName
The connection pool name.
public static final String
DEF_MD_FILE_EXT
The default file extension for metadata property files.
public static final String
DEF_MIME_TYPE
The default MIME type.
public static final String
DEF_SCHEMA_NAME
The default schema name.
public static final String
DEF_SCHEMA_PATH
The default path for the schema file.
public static final String
DEF_WLS_PROPS_PATH
The default weblogic properties file path.
public static final String
deleteDocSql
The preparable SQL to remove a document from the database.
protected PreparedStatement
deleteDocStmt
The delete document statement.
public static final String
deleteMDSql
The preparable SQL to remove a document's implicit metadata from the database.
protected PreparedStatement
deleteMDStmt
The delete metadata statement.
public static String
DOC_MD_TABLE
The document_metadata table name.
public static String
DOC_TABLE
The document table name.
public String
docBase
The docBase.
public boolean
doCleanUp
Are we supposed to do a cleanup.
public boolean
doDelete
Are we deleting entries.
public boolean
doMetaParse
Are we supposed to parse '*.htm' and '*.html' files for META tags.
public static final Collection
explicitAttrNames
The set of explicit document metadata attribute names.
public static final Collection
explicitColumnNames
The default set of DOCUMENT table column names.
public String
fileEncoding
The file encoding (null for VM default).
public List
fileList
The list of files/directories to scan over.
public List
htmlMatchList
The list of patterns that represent HTML file names.
public boolean
ignoreErrors
Do we ignore errors.
public List
ignoreList
The list of file name patterns to ignore.
public boolean
includeHidden
Are we supposed to include hidden files and directories.
public boolean
inheritProps
Are we supposed to inherit metadata properties when recursing directories?
public static final String
insertDocSql
The default preparable SQL to insert a document into the database.
protected PreparedStatement
insertDocStmt
The insert document statement.
public static final String
insertMDSql
The preparable SQL to insert a document's metadata into the database.
protected PreparedStatement
insertMDStmt
The insert document metadata statement.
protected Properties
jdbcProps
The JDBC connection properties.
protected String
jdbcUrl
The JDBC connection url.
public List
loaderFilters
The list of LoaderFilters to try.
public List
matchList
The list of file name patterns to include.
public String
mdFileExt
The file extension of metadata property files.
protected Collection
metadataNames
The metadata properties we find along the way.
protected String
myInsertDocSql
The SQL statement used to insert into the DOCUMENT table.
protected String
myUpdateDocSql
The SQL statement used to update the DOCUMENT table.
protected long
numDocsLoaded
The number of documents we've loaded so far.
public boolean
recurse
Do we recurse over directories?
public String
schemaName
The name of the schema in the schema file.
public boolean
schemaOnly
Are we doing only the schema file generation.
public String
schemaPath
The path to the schema file to output.
public boolean
testOnly
Are we running in test mode?
public boolean
truncate
Do we try to truncate.
public static final String
updateDocSql
The default preparable SQL to update a document in the database.
protected PreparedStatement
updateDocStmt
The update document statement.
public boolean
verbose
Do we spew out messages.
public String
wlsPropsPath
The weblogic.properties file path.
 

Constructor Summary

BulkLoader()

Construct a BulkLoader without command-line arguments.
BulkLoader(String[] args)

Construct a BulkLoader from the given command-line arguments.
 

Method Summary

public boolean
accept(File dir, String name)
Implement the FilenameFilter interface method to use our match and ignore lists.
public static void
close(Object o)
Close an object which has a close() method, ignoring any exceptions.
public void
commit()
Commit the transaction.
public void
debug(String mesg)
Output a debug message.
public int
deleteDoc(String path)
This will remove the document with id of path from the database, including all of its implicit metadata.
public void
deleteDocMetadata(String path)
This will remove the implicit metadata for the specified document.
public void
doLoad()
Do the actual bulk load logic on the file list.
public void
doLoad(File baseDir, String path, Properties dirProps)
Load the given path into the database.
public void
error(String mesg, Throwable ex)
Output an error message.
public void
error(String mesg)
Output an error message.
protected void
finalize()
Called when this object is about to be finalized.
public String
fixPath(String path)
Fix up a path to be forward-slash style and to not have empty path parts.
public String
fixString(String in, String tableName, String colName)
Fix up strings by checking for, and potentially truncating, values that are too large.
public static String
fixString(String in)
Fix empty strings to be nulls.
public Properties
getLoaderFilterProperties(File f, Properties p)
Get the properties from the BulkLoader's LoaderFilters for the given file.
public Properties
getMetadataProperties(File base, Properties p)
Get the metadata properties for the given file or directory.
public String
getPropertyForColumn(String colName, Properties p)
Get the property value for the specified additional column name.
public void
go()
Let the bulkloader go.
public void
initialize()
Initialize the bulkloader from the current state.
public int
insertDoc(String path, File f, Properties p, String mimeType)
Update or insert a document and metadata into the database.
public static boolean
isHidden(File f)
Check if the specified file is a hidden file.
public boolean
isHtmlFile(String name)
Tell if the specified file name is an HTML file to the loader.
public static boolean
isReadableDirectory(String name)
Check if the specified file name is a directory that we can get into.
public void
loadPropertyColumnInfo(Properties p)
Load the set of jdbc.column.<columnName>=propName,... properties into our property to column information.
public static int
main(BulkLoader loader, String[] args)
The main method invoked on a BulkLoader instance.
public static void
main(String[] args)
Command-line entry point.
public void
parseArgs(String[] args)
Parse the given input arguments.
public void
printSchema()
Print the schema XML for all the metadata we've loaded so far.
public void
printSchema(PrintWriter out, String enc)
Print the schema XML for all the metadata we've loaded so far to the given output stream.
public void
rollback()
Rollback the transaction.
protected void
setJDBCInfo()
Set the JDBC connection URL and properties from the current weblogic.properties file.
public boolean
shouldIgnore(String name)
Tell if the loader should ignore the specified file name.
public boolean
shouldInclude(String name)
Tell if the loader should include the specified file name.
public void
shutdown()
Shutdown this bulk loader.
public static List
split(String str, String on)
Split a delimited list into a List.
public static Properties
splitToProperties(String str, String on)
Split a WLS connection pool style string into a Properties object of the name=value pairs.
public void
usage(PrintStream out)
Print the usage of the application.
public void
usage(PrintWriter out)
Print the usage of the application.
public void
validateArgs()
Validate that we have been passed correct arguments.
public void
warning(String mesg, Throwable ex)
Output a warning message.
public void
warning(String mesg)
Output a warning message.
 
Methods from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
   
Methods from interface java.io.FilenameFilter
accept
 

Field Detail

addlColumnMap

public Map addlColumnMap
The map of additional DOCUMENT table column name to collection of property names that map onto the column name.


addlColumnNames

public Collection addlColumnNames
The list of additional DOCUMENT table column names.


commitAfter

public int commitAfter
The number of documents loaded after which this should commit (0 or less to commit only at the end).
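
One way to read the commitAfter rule is as a periodic commit: commit each time another commitAfter documents have been loaded, and never commit mid-run when the value is 0 or less. The sketch below encodes that reading. CommitPolicySketch is a hypothetical helper, and the periodic interpretation is an assumption rather than documented behavior.

```java
// Decides when a periodic commit is due, under the assumed reading of commitAfter.
final class CommitPolicySketch {
    private final int commitAfter;
    private long numDocsLoaded;

    CommitPolicySketch(int commitAfter) {
        this.commitAfter = commitAfter;
    }

    // Records one loaded document and reports whether a commit is due now.
    boolean documentLoaded() {
        numDocsLoaded++;
        // 0 or less means "commit only at the end", so never signal a commit here.
        return commitAfter > 0 && numDocsLoaded % commitAfter == 0;
    }
}
```

A caller would invoke commit() whenever documentLoaded() returns true, and once more after the last document.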


connection

protected Connection connection
The connection this loader uses.


conPoolName

public String conPoolName
The connection pool name.


DEF_MD_FILE_EXT

public static final String DEF_MD_FILE_EXT
The default file extension for metadata property files.


DEF_MIME_TYPE

public static final String DEF_MIME_TYPE
The default MIME type.


DEF_SCHEMA_NAME

public static final String DEF_SCHEMA_NAME
The default schema name.


DEF_SCHEMA_PATH

public static final String DEF_SCHEMA_PATH
The default path for the schema file.


DEF_WLS_PROPS_PATH

public static final String DEF_WLS_PROPS_PATH
The default weblogic properties file path.


deleteDocSql

public static final String deleteDocSql
The preparable SQL to remove a document from the database.


deleteDocStmt

protected PreparedStatement deleteDocStmt
The delete document statement.


deleteMDSql

public static final String deleteMDSql
The preparable SQL to remove a document's implicit metadata from the database.


deleteMDStmt

protected PreparedStatement deleteMDStmt
The delete metadata statement.


DOC_MD_TABLE

public static String DOC_MD_TABLE
The document_metadata table name.


DOC_TABLE

public static String DOC_TABLE
The document table name.


docBase

public String docBase
The docBase.


doCleanUp

public boolean doCleanUp
Are we supposed to do a cleanup.


doDelete

public boolean doDelete
Are we deleting entries.


doMetaParse

public boolean doMetaParse
Are we supposed to parse '*.htm' and '*.html' files for META tags.


explicitAttrNames

public static final Collection explicitAttrNames
The set of explicit document metadata attribute names.


explicitColumnNames

public static final Collection explicitColumnNames
The default set of DOCUMENT table column names.


fileEncoding

public String fileEncoding
The file encoding (null for VM default).


fileList

public List fileList
The list of files/directories to scan over.


htmlMatchList

public List htmlMatchList
The list of patterns that represent HTML file names.


ignoreErrors

public boolean ignoreErrors
Do we ignore errors.


ignoreList

public List ignoreList
The list of file name patterns to ignore.


includeHidden

public boolean includeHidden
Are we supposed to include hidden files and directories.


inheritProps

public boolean inheritProps
Are we supposed to inherit metadata properties when recursing directories?


insertDocSql

public static final String insertDocSql
The default preparable SQL to insert a document into the database.


insertDocStmt

protected PreparedStatement insertDocStmt
The insert document statement.


insertMDSql

public static final String insertMDSql
The preparable SQL to insert a document's metadata into the database.


insertMDStmt

protected PreparedStatement insertMDStmt
The insert document metadata statement.


jdbcProps

protected Properties jdbcProps
The JDBC connection properties.


jdbcUrl

protected String jdbcUrl
The JDBC connection url.


loaderFilters

public List loaderFilters
The list of LoaderFilters to try.


matchList

public List matchList
The list of file name patterns to include.

Empty to include all.


mdFileExt

public String mdFileExt
The file extension of metadata property files.

This should start with a ".".


metadataNames

protected Collection metadataNames
The metadata properties we find along the way.


myInsertDocSql

protected String myInsertDocSql
The SQL statement used to insert into the DOCUMENT table.

This will be constructed in initialize(), along with any additional column information.


myUpdateDocSql

protected String myUpdateDocSql
The SQL statement used to update the DOCUMENT table.

This will be constructed in initialize(), along with any additional column information.


numDocsLoaded

protected long numDocsLoaded
The number of documents we've loaded so far.


recurse

public boolean recurse
Do we recurse over directories?


schemaName

public String schemaName
The name of the schema in the schema file.


schemaOnly

public boolean schemaOnly
Are we doing only the schema file generation.


schemaPath

public String schemaPath
The path to the schema file to output.


testOnly

public boolean testOnly
Are we running in test mode?


truncate

public boolean truncate
Do we try to truncate.


updateDocSql

public static final String updateDocSql
The default preparable SQL to update a document in the database.


updateDocStmt

protected PreparedStatement updateDocStmt
The update document statement.


verbose

public boolean verbose
Do we spew out messages.


wlsPropsPath

public String wlsPropsPath
The weblogic.properties file path.

 

Constructor Detail

BulkLoader

public BulkLoader()
Construct a BulkLoader without command-line arguments.

BulkLoader

public BulkLoader(String[] args)
Construct a BulkLoader from the given command-line arguments.

Related Topics

BulkLoader.parseArgs(String[])

 

Method Detail

accept(File, String) Method

public boolean accept(File dir, 
                      String name)
Implement the FilenameFilter interface method to use our match and ignore lists.
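
A minimal sketch of a FilenameFilter driven by match and ignore lists follows. The pattern syntax is not specified on this page, so the sketch assumes simple globs ('*' for any run of characters, '?' for one character) and assumes that the ignore list takes precedence. An empty match list includes everything, as noted for the matchList field. MatchIgnoreFilterSketch is a hypothetical class.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.List;
import java.util.regex.Pattern;

final class MatchIgnoreFilterSketch implements FilenameFilter {
    private final List<String> matchList;
    private final List<String> ignoreList;

    MatchIgnoreFilterSketch(List<String> matchList, List<String> ignoreList) {
        this.matchList = matchList;
        this.ignoreList = ignoreList;
    }

    // Converts a glob such as "*.html" to a regex (assumed pattern syntax).
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    private static boolean matchesAny(List<String> globs, String name) {
        for (String glob : globs) {
            if (globToRegex(glob).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    boolean shouldIgnore(String name) {
        return matchesAny(ignoreList, name);
    }

    boolean shouldInclude(String name) {
        // An empty match list means "include all".
        return matchList.isEmpty() || matchesAny(matchList, name);
    }

    @Override
    public boolean accept(File dir, String name) {
        // Assumption: the ignore list wins over the match list.
        return !shouldIgnore(name) && shouldInclude(name);
    }
}
```

An instance can be passed to File.list(FilenameFilter) to filter a directory listing.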


close(Object) Method

public static void close(Object o)
Close an object which has a close() method, ignoring any exceptions.
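
The idea of closing an arbitrary object quietly can be sketched as below. This is not the BulkLoader implementation: QuietCloseSketch is a hypothetical class, and the use of AutoCloseable with a reflective fallback is an assumption about how such a helper might be written.

```java
import java.lang.reflect.Method;

final class QuietCloseSketch {
    private QuietCloseSketch() {}

    // Calls close() on the object if it has one, swallowing any exception.
    static void close(Object o) {
        if (o == null) {
            return;
        }
        try {
            if (o instanceof AutoCloseable) {
                ((AutoCloseable) o).close();
            } else {
                // Fallback for classes that expose a public no-arg close()
                // without implementing AutoCloseable.
                Method closeMethod = o.getClass().getMethod("close");
                closeMethod.invoke(o);
            }
        } catch (Exception ignored) {
            // Deliberately ignored: a missing close() method, an inaccessible
            // class, or a failure inside close() must not disturb the caller.
        }
    }
}
```

Swallowing exceptions is only reasonable in cleanup paths, where a close failure should not mask an earlier error.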


commit() Method

public void commit()
Commit the transaction.


debug(String) Method

public void debug(String mesg)
Output a debug message.

Subclasses can override this method to change where messages go.


deleteDoc(String) Method

public int deleteDoc(String path)
throws SQLException
This will remove the document with id of path from the database, including all of its implicit metadata.

Parameters

path
the document path

Returns

number of documents deleted

Exceptions

SQLException
thrown on a database error.

Related Topics

BulkLoader.deleteDocMetadata(String)


deleteDocMetadata(String) Method

public void deleteDocMetadata(String path)
throws SQLException
This will remove the implicit metadata for the specified document.

Parameters

path
the document path

Exceptions

SQLException
thrown on a database error.

Related Topics

BulkLoader.deleteDocMetadata(String)


doLoad() Method

public void doLoad()
throws SQLException
Do the actual bulk load logic on the file list.

Exceptions

SQLException

doLoad(File, String, Properties) Method

public void doLoad(File baseDir, 
                   String path, 
                   Properties dirProps)
throws SQLException
Load the given path into the database.

If path is a directory, all files underneath it that match our patterns will be included. If path is a file, it will be loaded.

Parameters

baseDir
the base directory (can be used to get absolute file paths).
path
the path to the file or directory (this can be multi-part, not just name).
dirProps
the base metadata properties for the file (this should be a clone that this method can modify as needed).

Exceptions

SQLException
thrown on a database error.

error(String, Throwable) Method

public void error(String mesg, 
                  Throwable ex)
Output an error message.

Subclasses can override this method to change where messages go.


error(String) Method

public void error(String mesg)
Output an error message.


finalize() Method

protected void finalize()
throws Throwable
Called when this object is about to be finalized.

Overrides
Object.finalize()

Exceptions

Throwable

fixPath(String) Method

public String fixPath(String path)
Fix up a path to be forward-slash style and to not have empty path parts.
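
The normalization described above might look like the following sketch. Whether a leading slash is preserved is not stated on this page; the sketch assumes it is. FixPathSketch is a hypothetical class.

```java
final class FixPathSketch {
    private FixPathSketch() {}

    // Converts backslashes to forward slashes and drops empty path parts.
    static String fixPath(String path) {
        String normalized = path.replace('\\', '/');
        StringBuilder out = new StringBuilder();
        // Assumption: a leading slash is kept.
        if (normalized.startsWith("/")) {
            out.append('/');
        }
        for (String part : normalized.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the gaps produced by "//" or a leading "/"
            }
            if (out.length() > 0 && out.charAt(out.length() - 1) != '/') {
                out.append('/');
            }
            out.append(part);
        }
        return out.toString();
    }
}
```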


fixString(String, String, String) Method

public String fixString(String in, 
                        String tableName, 
                        String colName)
Fix up strings by checking for, and potentially truncating, values that are too large.


fixString(String) Method

public static String fixString(String in)
Fix empty strings to be nulls.
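
The two fixString overloads above can be illustrated together. In this sketch the column limits live in a hypothetical map keyed by "TABLE.COLUMN", lengths are counted in characters, the three-argument form also applies the empty-to-null rule, and an oversized value is returned unchanged when truncation is off. All of those details are assumptions; FixStringSketch is not part of BulkLoader.

```java
import java.util.HashMap;
import java.util.Map;

final class FixStringSketch {
    // Hypothetical registry of known column size limits, keyed by "TABLE.COLUMN".
    private final Map<String, Integer> maxLengths = new HashMap<>();
    private final boolean truncate;

    FixStringSketch(boolean truncate) {
        this.truncate = truncate;
    }

    void setMaxLength(String tableName, String colName, int maxLength) {
        maxLengths.put(tableName + "." + colName, maxLength);
    }

    // One-argument form: empty strings become null.
    static String fixString(String in) {
        return (in == null || in.isEmpty()) ? null : in;
    }

    // Three-argument form: additionally truncates values that exceed the column limit.
    String fixString(String in, String tableName, String colName) {
        String fixed = fixString(in);
        if (fixed == null) {
            return null;
        }
        Integer max = maxLengths.get(tableName + "." + colName);
        if (max == null || fixed.length() <= max) {
            return fixed; // no known limit, or the value already fits
        }
        // Assumption: without truncation the value is passed through untouched.
        return truncate ? fixed.substring(0, max) : fixed;
    }
}
```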


getLoaderFilterProperties(File, Properties) Method

public Properties getLoaderFilterProperties(File f, 
                                            Properties p)
Get the properties from the BulkLoader's LoaderFilters for the given file.

Parameters

f
the file.
p
the properties object to add to (null to create new one).

Returns

p.

getMetadataProperties(File, Properties) Method

public Properties getMetadataProperties(File base, 
                                        Properties p)
throws IOException
Get the metadata properties for the given file or directory.

This does not do a META data parse.

Parameters

base
the file or directory base path.
p
the properties to load into (null to create new).

Returns

the properties (p if p was not null).

Exceptions

IOException
on an error reading the properties file.

getPropertyForColumn(String, Properties) Method

public String getPropertyForColumn(String colName, 
                                   Properties p)
Get the property value for the specified additional column name.

This will loop over the property names mapped to the column name and return the first found value in the properties.
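
The lookup loop described above can be sketched as a static helper that takes the column map explicitly. The map shape (column name to collection of property names) follows the addlColumnMap field description; ColumnPropertyLookupSketch and the explicit map parameter are illustrative assumptions.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Properties;

final class ColumnPropertyLookupSketch {
    private ColumnPropertyLookupSketch() {}

    // Returns the first property value found for the column, or null if none is set.
    static String getPropertyForColumn(Map<String, ? extends Collection<String>> addlColumnMap,
                                       String colName,
                                       Properties p) {
        Collection<String> propNames = addlColumnMap.get(colName);
        if (propNames == null) {
            return null; // the column has no mapped property names
        }
        for (String propName : propNames) {
            String value = p.getProperty(propName);
            if (value != null) {
                return value; // first match wins
            }
        }
        return null;
    }
}
```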


go() Method

public void go()
throws SQLException, IOException
Let the bulkloader go.

Exceptions

SQLException
if the load/cleanup fails.
IOException
if printing the schema fails.

initialize() Method

public void initialize()
throws SQLException, IllegalStateException
Initialize the bulkloader from the current state.

This calls setJDBCInfo() and then creates a JDBC connection. It additionally configures any known column size limitations and constructs any SQL statements it needs.

Exceptions

SQLException
thrown on an error connecting to the database.
IllegalStateException
thrown on an initialization error.

Related Topics

BulkLoader.setJDBCInfo()


insertDoc(String, File, Properties, String) Method

public int insertDoc(String path, 
                     File f, 
                     Properties p, 
                     String mimeType)
throws SQLException
Update or insert a document and metadata into the database.

Parameters

path
the document path id.
f
the file of the document (can be null).
p
the implicit properties.
mimeType
the preferred mime type of the document (null for default).

Returns

number of documents inserted

Exceptions

SQLException
thrown on a database error.

isHidden(File) Method

public static boolean isHidden(File f)
Check if the specified file is a hidden file.

Under UNIX, File.isHidden() reports that "/weblogicCommerce/dmsBase/." is a hidden file, which it is not. This method works around that problem by getting the canonical path for a directory before calling isHidden().
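
The workaround described above can be sketched with standard java.io.File calls. HiddenCheckSketch is a hypothetical class, and falling back to the original path when canonicalization fails is an assumption.

```java
import java.io.File;
import java.io.IOException;

final class HiddenCheckSketch {
    private HiddenCheckSketch() {}

    // Checks for a hidden file, resolving directories to their canonical form first
    // so that a trailing "." path part is not mistaken for a dot-file name.
    static boolean isHidden(File f) {
        File target = f;
        if (f.isDirectory()) {
            try {
                target = f.getCanonicalFile();
            } catch (IOException e) {
                // Assumption: fall back to the path as given.
            }
        }
        return target.isHidden();
    }
}
```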


isHtmlFile(String) Method

public boolean isHtmlFile(String name)
Tell if the specified file name is an HTML file to the loader.


isReadableDirectory(String) Method

public static boolean isReadableDirectory(String name)
Check if the specified file name is a directory that we can get into.


loadPropertyColumnInfo(Properties) Method

public void loadPropertyColumnInfo(Properties p)
Load the set of jdbc.column.<columnName>=propName,... properties into our property to column information.
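
A sketch of parsing jdbc.column.<columnName>=propName,... entries into a column-to-property-names map follows. It assumes the property names are comma-separated and trims surrounding whitespace. ColumnInfoSketch is a hypothetical class, and returning a new map (rather than updating fields) is an illustrative choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class ColumnInfoSketch {
    static final String PREFIX = "jdbc.column.";

    private ColumnInfoSketch() {}

    // Builds a map of column name to the property names that feed that column.
    static Map<String, List<String>> loadPropertyColumnInfo(Properties p) {
        Map<String, List<String>> columnMap = new TreeMap<>();
        for (String key : p.stringPropertyNames()) {
            if (!key.startsWith(PREFIX) || key.length() == PREFIX.length()) {
                continue; // not a column mapping, or no column name given
            }
            String column = key.substring(PREFIX.length());
            List<String> propNames = new ArrayList<>();
            for (String name : p.getProperty(key).split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    propNames.add(trimmed);
                }
            }
            columnMap.put(column, propNames);
        }
        return columnMap;
    }
}
```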


main(BulkLoader, String[]) Method

public static int main(BulkLoader loader, 
                       String[] args)
The main method invoked on a BulkLoader instance.

This will take a BulkLoader through the bulk loading steps. Output will be sent via the BulkLoader's debug(), warning(), and error() methods.

This will not call System.exit().

Parameters

loader
the BulkLoader instance to run.
args
the command-line args.

Returns

the exit code (0 for success, non-zero for failure).

Related Topics

BulkLoader.parseArgs(String[])
BulkLoader.validateArgs()
BulkLoader.initialize()
BulkLoader.go()
BulkLoader.commit()


main(String[]) Method

public static void main(String[] args)
Command-line entry point.

This will call System.exit() on invalid args or error. To invoke a bulk load from your own code, create and manipulate a BulkLoader object. You can use the other main method, which does not exit.

Parameters

args
the command-line args.

Related Topics

BulkLoader.main(BulkLoader, String[])


parseArgs(String[]) Method

public void parseArgs(String[] args)
throws IllegalArgumentException
Parse the given input arguments.

Parameters

args
the input arguments.

Exceptions

IllegalArgumentException
thrown on bad arguments.

printSchema() Method

public void printSchema()
throws IOException
Print the schema XML for all the metadata we've loaded so far.

Exceptions

IOException
thrown on an error outputting the schema XML.

printSchema(PrintWriter, String) Method

public void printSchema(PrintWriter out, 
                        String enc)
throws IOException
Print the schema XML for all the metadata we've loaded so far to the given output stream.

Parameters

out
the output stream.
enc
the file encoding (will go in the XML header if not null).

Exceptions

IOException
thrown on an I/O error.
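
The schema format itself is not shown on this page, but the handling of the enc parameter can be illustrated on its own. The sketch below writes an XML declaration that includes the encoding only when one is given. XmlHeaderSketch and printXmlDeclaration are hypothetical names, and the exact header BulkLoader emits may differ.

```java
import java.io.PrintWriter;

final class XmlHeaderSketch {
    private XmlHeaderSketch() {}

    // Writes the XML declaration, adding the encoding attribute only if enc is not null.
    static void printXmlDeclaration(PrintWriter out, String enc) {
        out.print("<?xml version=\"1.0\"");
        if (enc != null) {
            out.print(" encoding=\"" + enc + "\"");
        }
        out.println("?>");
    }
}
```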

rollback() Method

public void rollback()
Rollback the transaction.


setJDBCInfo() Method

protected void setJDBCInfo()
throws IllegalStateException
Set the JDBC connection URL and properties from the current weblogic.properties file.

Exceptions

IllegalStateException
thrown if we cannot get the information.

shouldIgnore(String) Method

public boolean shouldIgnore(String name)
Tell if the loader should ignore the specified file name.


shouldInclude(String) Method

public boolean shouldInclude(String name)
Tell if the loader should include the specified file name.


shutdown() Method

public void shutdown()
Shutdown this bulk loader.


split(String, String) Method

public static List split(String str, 
                         String on)
Split a delimited list into a List.
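
A minimal sketch of splitting a delimited string into a List is shown below. It treats the delimiter as a literal string, trims each element, and drops empty elements; those choices are assumptions, since this page does not specify them. SplitSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SplitSketch {
    private SplitSketch() {}

    // Splits str on the literal delimiter, trimming parts and skipping empties.
    static List<String> split(String str, String on) {
        List<String> parts = new ArrayList<>();
        if (str == null || str.isEmpty()) {
            return parts;
        }
        // Pattern.quote keeps delimiters such as "|" or "." from being read as regex.
        for (String part : str.split(Pattern.quote(on))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }
}
```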


splitToProperties(String, String) Method

public static Properties splitToProperties(String str, 
                                           String on)
Split a WLS connection pool style string into a Properties object of the name=value pairs.
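
As an illustration, a string such as "user=scott;password=tiger" split on ";" would yield two properties. The sketch below assumes that each pair is split at its first '=' and that entries without a name are skipped; SplitToPropertiesSketch is a hypothetical class, and the sample string is made up for the example.

```java
import java.util.Properties;
import java.util.regex.Pattern;

final class SplitToPropertiesSketch {
    private SplitToPropertiesSketch() {}

    // Parses "name=value" pairs separated by the literal delimiter into a Properties object.
    static Properties splitToProperties(String str, String on) {
        Properties props = new Properties();
        if (str == null) {
            return props;
        }
        for (String pair : str.split(Pattern.quote(on))) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // no '=' or an empty name: skip the entry
            }
            // Split at the first '=' so values may themselves contain '='.
            props.setProperty(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return props;
    }
}
```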


usage(PrintStream) Method

public void usage(PrintStream out)
Print the usage of the application.


usage(PrintWriter) Method

public void usage(PrintWriter out)
Print the usage of the application.


validateArgs() Method

public void validateArgs()
throws IllegalStateException
Validate that we have been passed correct arguments.

This does not validate that the arguments are valid. That will be done in initialize().

Exceptions

IllegalStateException

warning(String, Throwable) Method

public void warning(String mesg, 
                    Throwable ex)
Output a warning message.

Subclasses can override this method to change where messages go.


warning(String) Method

public void warning(String mesg)
Output a warning message.