5 Creating Import Processor Scripts
You can develop scripts for the Import Processor to perform a wide variety of functions. Some common tasks include:
- Skipping the importing of certain image files
- Changing Capture batch properties
- Skipping the importing of a batch
- Adding page level metadata values during importing
- After importing, moving images to a different folder
Capture enables you to create Import Processor scripts to customize the importing process. For more information, see Managing Oracle WebCenter Enterprise Capture.
This chapter contains the following sections:
5.1 Import Processor Events
Import Processor scripts are JavaScript modules that enable you to customize the behavior of certain Import Processor events.
This section describes the following Import Processor events:
5.1.1 preProcess
The preProcess event occurs prior to the pre-processing of the import source. You can run initialization code here. To cancel the processing, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
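As a minimal sketch of a preProcess handler, the following script cancels an import run that starts during a maintenance window. The window hours and the isMaintenanceHour helper are hypothetical. The getLogger() and setCancel() accessors are assumptions inferred from the logger and cancel properties of ImportProcessorContext, so verify them against your Capture installation.

```javascript
// Hypothetical maintenance window: 01:00 (inclusive) to 03:00 (exclusive), server time.
var MAINTENANCE_START_HOUR = 1;
var MAINTENANCE_END_HOUR = 3;

// Returns true when the given hour (0-23) falls inside the window.
function isMaintenanceHour(hour) {
    return hour >= MAINTENANCE_START_HOUR && hour < MAINTENANCE_END_HOUR;
}

function preProcess(ctx) { // ctx: ImportProcessorContext
    if (isMaintenanceHour(new Date().getHours())) {
        // Assumed accessors; they are not confirmed by this chapter.
        ctx.getLogger().info("Import skipped: maintenance window is active.");
        ctx.setCancel(true);
    }
}
```

Keeping the time check in a separate helper makes the rule easy to verify outside Capture.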
5.1.2 process
The process event signals the start of the import process.
Syntax | Parameter |
---|---|
|
5.1.3 postProcess
The postProcess event occurs after the import source has been processed.
Syntax | Parameter |
---|---|
|
5.1.4 preCreateBatch
The preCreateBatch event occurs prior to a new batch being created. The batch creation can be canceled by setting the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
5.1.5 postCreateBatch
The postCreateBatch event occurs immediately after a batch is created, but before any documents have been created.
Syntax | Parameter |
---|---|
|
5.1.6 preCreateDocument
The preCreateDocument event occurs prior to a new document being created. The document creation can be canceled by setting the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
5.1.7 postCreateDocument
The postCreateDocument event occurs after a new document has been created.
Syntax | Parameter |
---|---|
|
5.1.8 preImportFile
The preImportFile event occurs prior to a file being imported. The importing of files can be canceled by setting the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
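The following sketch shows one way to skip certain files in preImportFile. The extension list and the hasSkippedExtension helper are hypothetical. getImportSourceFile() appears in the sample scripts of this chapter, while setCancel() is an assumed setter for the cancel property.

```javascript
// Hypothetical list of extensions that should never be imported.
var SKIPPED_EXTENSIONS = [".tmp", ".db", ".ini"];

// Returns true when fileName ends with one of the skipped extensions (case-insensitive).
function hasSkippedExtension(fileName) {
    // String() also converts a java.lang.String into a JavaScript string.
    var lower = String(fileName).toLowerCase();
    for (var i = 0; i < SKIPPED_EXTENSIONS.length; i++) {
        var ext = SKIPPED_EXTENSIONS[i];
        if (lower.slice(-ext.length) === ext) {
            return true;
        }
    }
    return false;
}

function preImportFile(ctx) { // ctx: ImportProcessorContext
    if (hasSkippedExtension(ctx.getImportSourceFile())) {
        ctx.setCancel(true); // assumed setter for the cancel property
    }
}
```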
5.1.9 postImportFile
The postImportFile event occurs after a file is imported.
Syntax | Parameter |
---|---|
|
5.1.10 preRelease
The preRelease event occurs prior to a batch being released. The releasing of a batch can be canceled by setting the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
5.1.11 postRelease
The postRelease event occurs after a batch has been released.
Syntax | Parameter |
---|---|
|
5.1.12 preDatabaseSearch
The preDatabaseSearch event occurs prior to a database lookup. A database search can be canceled by setting the cancelDBSearch property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
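A preDatabaseSearch handler might look like the following sketch, which skips the lookup when the file name carries no numeric key. The rule and both helpers are hypothetical. The sketch also assumes that importSourceFile is already populated when this event fires and that the cancelDBSearch property has a setCancelDBSearch() setter.

```javascript
// Returns the last path segment, accepting both "/" and "\" separators.
function fileNameOf(path) {
    var parts = String(path).split(/[\\\/]/);
    return parts[parts.length - 1];
}

// Hypothetical rule: a lookup is only useful when the file name contains digits.
function hasLookupKey(path) {
    return /\d+/.test(fileNameOf(path));
}

function preDatabaseSearch(ctx) { // ctx: ImportProcessorContext
    if (!hasLookupKey(ctx.getImportSourceFile())) {
        ctx.setCancelDBSearch(true); // assumed setter for the cancelDBSearch property
    }
}
```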
5.2 Email Source Events
This section describes the following email source events:
Note:
If you select Microsoft Exchange Web Service as the import source for emails, invoke the corresponding getExchange methods in the script. See the ImportProcessorContext and EmailSourceContext classes for information on the new methods that have been introduced.
5.2.1 deleteMessage
The deleteMessage event occurs in the email message post-processing step when an email message is about to be deleted. To prevent the email message from being deleted, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
EmailSourceContext emailCtx |
5.2.2 moveMessage
The moveMessage event occurs in the email message post-processing step when an email message is about to be moved to an email folder. To prevent the email message from being moved, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
EmailSourceContext emailCtx |
5.2.3 newAttachment
The newAttachment event occurs when a new email attachment is about to be processed. To prevent the attachment from being imported, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
EmailSourceContext emailCtx |
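The sketch below illustrates a newAttachment handler that skips inline signature images. The file name pattern is hypothetical, the parameter order (ctx, emailCtx) is an assumption, and getAttachmentFilename() and setCancel() are accessors inferred from the attachmentFilename and cancel properties.

```javascript
// Hypothetical pattern for inline signature images such as "image001.png".
var INLINE_IMAGE_PATTERN = /^image\d+\.(png|gif|jpe?g)$/i;

// Returns true when the attachment name looks like an inline image.
function isInlineImage(attachmentName) {
    return INLINE_IMAGE_PATTERN.test(String(attachmentName));
}

function newAttachment(ctx, emailCtx) { // assumed parameter order
    if (isInlineImage(emailCtx.getAttachmentFilename())) {
        ctx.setCancel(true); // assumed setter for the cancel property
    }
}
```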
5.2.4 newMessage
The newMessage event occurs when a new email message is about to be processed. To prevent the email message from being imported, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
EmailSourceContext emailCtx |
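As a sketch of a newMessage handler that works with both connection types, the following script reads the message subject through the EWS accessor or the IMAP accessor and skips automatic replies. The auto-reply rule is hypothetical. The isExchangeMail() and getMessage() method names and the parameter order are assumptions based on the property names in the class tables; getSubject() exists on both the JavaMail Message class and the EWS EmailMessage class.

```javascript
// Returns the subject of the current message for either connection type.
function currentSubject(ctx, emailCtx) {
    var message = ctx.isExchangeMail()
        ? emailCtx.getExchangeMessage() // EWS connection
        : emailCtx.getMessage();        // standard IMAP connection (assumed getter)
    var subject = message.getSubject();
    return subject == null ? "" : String(subject);
}

// Hypothetical rule: treat these subject prefixes as automatic replies.
function isAutoReply(subject) {
    return /^(automatic reply|out of office)\b/i.test(subject);
}

function newMessage(ctx, emailCtx) { // assumed parameter order
    if (isAutoReply(currentSubject(ctx, emailCtx))) {
        ctx.setCancel(true); // assumed setter for the cancel property
    }
}
```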
5.3 Folder Source Events
This section describes the following folder source events:
5.3.1 deleteDocumentFile
The deleteDocumentFile event occurs in the folder post-processing step when a file from the folder is about to be deleted. To prevent the document file from being deleted, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
FolderSourceContext folderCtx |
5.3.2 newFolder
The newFolder event occurs when a new folder is about to be processed. To exclude this folder from being processed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
FolderSourceContext folderCtx |
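A newFolder handler for a folder source could exclude folders by name, as in the following sketch. The excluded names and both helpers are hypothetical. The parameter order and the getFolderName() and setCancel() accessors are assumptions inferred from the folderName and cancel properties, and the sketch allows for folderName being either a bare name or a full path.

```javascript
// Hypothetical folder names that should not be processed.
var EXCLUDED_FOLDERS = ["archive", "processed"];

// Returns the last non-empty path segment, accepting "/" and "\" separators.
function lastSegment(path) {
    var parts = String(path).split(/[\\\/]/);
    var i = parts.length - 1;
    // Step over an empty segment left by a trailing separator.
    while (i > 0 && parts[i] === "") {
        i--;
    }
    return parts[i];
}

// Returns true when the folder's own name is in the exclusion list (case-insensitive).
function isExcludedFolder(folderName) {
    return EXCLUDED_FOLDERS.indexOf(lastSegment(folderName).toLowerCase()) !== -1;
}

function newFolder(ctx, folderCtx) { // assumed parameter order
    if (isExcludedFolder(folderCtx.getFolderName())) {
        ctx.setCancel(true); // assumed setter for the cancel property
    }
}
```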
5.3.3 renameDocumentFile
The renameDocumentFile event occurs in the folder post-processing step when a file from the folder is about to be renamed. To prevent the document file from being renamed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
FolderSourceContext folderCtx |
5.4 List File Source Events
This section describes the following list file source events:
5.4.1 deleteListFile
The deleteListFile event occurs in the list file post-processing step when a list file is about to be deleted. To prevent the list file from being deleted, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
ListFileSourceContext listFileCtx |
5.4.2 newFolder
The newFolder event occurs when a new folder containing list files is about to be processed. To exclude the folder from being processed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
ListFileSourceContext listFileCtx |
5.4.3 newListFile
The newListFile event occurs when a new list file is about to be processed. To prevent the list file from being processed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
ListFileSourceContext listFileCtx |
5.4.4 newListFileLine
The newListFileLine event occurs when a new line in the list file is about to be processed. To prevent the list file line from being processed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
ListFileSourceContext listFileCtx |
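The following sketch shows a newListFileLine handler that skips blank lines and comment lines. Treating a leading "#" as a comment marker is a hypothetical convention, and the parameter order and the getListFileLine() and setCancel() accessors are assumptions inferred from the listFileLine and cancel properties.

```javascript
// Returns true for lines that are empty or start with the hypothetical "#" comment marker.
function isIgnorableLine(line) {
    var trimmed = String(line).replace(/^\s+|\s+$/g, "");
    return trimmed === "" || trimmed.charAt(0) === "#";
}

function newListFileLine(ctx, listFileCtx) { // assumed parameter order
    if (isIgnorableLine(listFileCtx.getListFileLine())) {
        ctx.setCancel(true); // assumed setter for the cancel property
    }
}
```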
5.4.5 renameListFile
The renameListFile event occurs in the list file post-processing step when a list file is about to be renamed. To prevent the list file from being renamed, set the cancel property to True in the ctx parameter.
Syntax | Parameter |
---|---|
|
ListFileSourceContext listFileCtx |
5.5 Import Processor Classes
This section describes the following Import Processor classes:
In addition to the following event classes that can be used to design Import Processor scripts, there are some common classes that pertain to the Recognition Processor and the Import Processor. For more information on the common classes, see Common Capture Classes.
5.5.1 EmailSourceContext
The EmailSourceContext class contains the properties used in the processing of an email source.
Property | Type | Description |
---|---|---|
account | String | Name of the email account currently being processed. |
attachmentFilename | String | File name of the email message attachment currently being processed. |
Standard IMAP connection uses the following properties: | | |
folder | Folder | Email folder currently being processed. |
message | Message | Email message currently being processed. |
EWS connection uses the following properties: | | |
getExchangeMessage | microsoft.exchange.webservices.data.core.service.item.EmailMessage | Email message currently being processed. |
getExchangeFolder | microsoft.exchange.webservices.data.core.service.folder.Folder | Email folder currently being processed. |
For more information on the Folder and Message class definitions, see the JavaMail API documentation.
5.5.2 FolderSourceContext
The FolderSourceContext class contains the properties used in the processing of a folder source.
Property | Type | Description |
---|---|---|
folderName | String | Name of the directory currently being processed. |
documentFilename | String | Name of the file currently being processed. |
renamedDocumentFilename | String | If the post-processing step indicates the file should have a prefix added to it or the extension changed, this property indicates the changed file name. |
5.5.3 ImportJob
Import jobs are configured within a Capture Workspace to import batches from import sources such as a file system folder, a delimited list file, or an inbox/folder of an email server.
Property | Type | Description |
---|---|---|
jobID | String | A value that uniquely identifies the job in the system. |
workspaceID | String | Identifier of the workspace to which the job belongs. |
jobName | String | A human-readable name for the job. |
dbSearchID | String | Identifier of the database search to use when processing the job. |
dbSearchFieldID | String | Identifier of the database search field to use when processing the job. |
imageDownsample | Integer | Determines how to sample an image: |
jpegQuality | Integer | The JPEG quality ratio 0 to 99. |
batchPrefix | String | Batch prefix to use when creating batch names. |
defaultBatchStatusID | String | Identifier of the batch status to associate with batches created by this job. |
defaultPriority | Integer | Default priority assigned to batches ranging from 0 to 10. |
defaultDocumentTypeID | String | Default document profile for documents created by this job. |
searchResultOption | Integer | Determines how to handle database lookups that return more than one result. |
scriptID | String | Unique identifier of a script to use for this job. |
importFrequency | Integer | A value, specified in seconds, that determines how often a job should be polled for work to process. The following values are possible: |
hour | Integer | If the importFrequency is set to Daily, this specifies the hour of the day. |
minute | Integer | If the importFrequency is set to Daily, this specifies the minute of the day. |
lastCheck | Date | Date or time the job was last checked for processing. This will be updated by the Import Job Scheduler after a job is polled for work to process. |
fieldMappings | Map<String, FieldMappingInfo> | A set of values that map Capture fields to import source metadata fields. |
importSourceClassName | String | Name of the Java class that provides the implementation of the import source for this job. |
batchProcessorClassName | String | Name of the class that will be used to process the batch when it is released. If this value is null, the batch lock will be discarded and the batch will be put in a READY state. |
batchProcessorJobID | String | A unique identifier for a batch processor job. If this value is null, either the processor does not support jobs or the batch is going to be put in a READY state. |
imageFailureAction | Integer | Specifies the action to be taken if an invalid image is encountered: |
locale | Locale | Specifies the locale of the list file source. |
defaultDateFormat | String | Specifies the default date format of dates in the list file source. |
description | String | Description of this job. |
encoding | String | Specifies the file encoding of the list file source. |
isJobOnline | Boolean | Indicates whether this job should be processed. |
preserveImageFiles | Boolean | If True, prevents image files from being altered during import. |
5.5.4 ImportProcessorContext
The ImportProcessorContext class contains properties relevant to the job being processed. An instance of this class is created before processing is started and is passed to an import source at various stages throughout processing.
Property | Type | Description |
---|---|---|
cancel | Boolean | When this boolean value is set to True, it will cancel the operation being performed. |
cancelDBSearch | Boolean | When this boolean value is set to True, it will cancel the database lookup. |
dBSearchResults | | Contains the results from a database lookup. |
sourceName | String | Name of the import source that the current Import Job is configured to use. |
logger | Logger | An instance of java.util.logging.Logger that can be used to log additional entries. |
importCancelAction | Integer | This property specifies the action to be taken if a script sets the cancel property to True in the preImportFile event. The value may be set to one of the following constants: |
importJob | ImportJob | Current Import Job being processed. |
batchLock | | Contains the batch lock entity for the batch, after a new batch has been created. |
importSourceFile | String | Name of the file currently being processed. |
documentEntity | DocumentEntity | Document entity associated with the file currently being processed. |
documentPageEntity | | Document page entity associated with the file currently being processed. |
lastMultiPageTiffNumber | Integer | Contains the current page number of a multi-page TIFF file being processed. |
workspaceEntity | | Workspace entity associated with the current batch. |
batchManager | | Batch manager object used for batch operations. |
isExchangeMail | Boolean | Checks whether the current email import job is using Exchange Web Services APIs. |
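To illustrate how these properties can be combined, the following sketch logs a one-line summary when the import source has been processed. The describeJob helper is hypothetical, and getImportJob(), getJobName(), getSourceName(), and getLogger() are accessor names assumed from the importJob, jobName, sourceName, and logger properties; Logger.info(String) is a standard java.util.logging method.

```javascript
// Builds a one-line summary of the job being processed (hypothetical helper).
function describeJob(ctx) {
    var job = ctx.getImportJob(); // assumed getter for the importJob property
    return "job=" + String(job.getJobName()) + " source=" + String(ctx.getSourceName());
}

function postProcess(ctx) { // ctx: ImportProcessorContext
    ctx.getLogger().info("Import finished: " + describeJob(ctx));
}
```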
5.5.5 ListFileSourceContext
The ListFileSourceContext class contains the properties used in the processing of a list file source.
Property | Type | Description |
---|---|---|
folderName | String | Name of the folder currently being processed. |
listFilename | String | Name of the list file currently being processed. |
listFileLine | String | Contents of the line currently being processed in the list file. |
documentFilename | String | Name of the file currently being processed from the current line in the list file. |
renamedListFilename | String | If the post-processing step indicates the list file should have a prefix added to it or the extension changed, this property indicates the changed list file name. |
5.5.6 Sample Import Processor Scripts
This section describes the following sample Import Processor scripts:
5.5.6.1 Sample Import Processor Script 1
The following sample script sets each document's title to the name of the file being imported. When the documents are later committed, their document title can be mapped to an output field.

    importClass(java.io.File);

    function preCreateDocument(event) { // ImportProcessorContext
        var document;   // DocumentEntity
        var sourceFile; // File

        sourceFile = new File(event.getImportSourceFile());
        document = event.getDocumentEntity();

        // Set the document title to be the name of the source file
        document.setDocumentTitle(sourceFile.getName());
    }

5.5.6.2 Sample Import Processor Script 2
The following sample script demonstrates using the preCreateDocument event to obtain the base file name of the file being imported and assign that name to a metadata field. In addition, this script shows how to look up the definition of a metadata field by name, locate and create an IndexValue, and set the value of an IndexValue.

    function preCreateDocument(ctx) {
        // Get the base name of the file being imported.
        var sourceFile = new java.io.File(ctx.getImportSourceFile());
        var baseFileName = sourceFile.getName();

        // Strip off any file extension.
        var dotPos = baseFileName.lastIndexOf('.');
        if (dotPos > -1)
            baseFileName = baseFileName.substring(0, dotPos);

        // Update the "File Name" metadata field with the base name of the file.
        updateIndex(ctx, "File Name", baseFileName);
    }

    // Update a metadata field
    function updateIndex(ctx, indexName, commitValue) {
        var doc = ctx.getDocumentEntity();
        var workspace = ctx.getWorkspaceEntity();

        // Locate the index definition object by the index name.
        var indexDef = findIndexDefinitionByName(workspace, indexName);
        if (indexDef != null) {
            // Get the ID for the index field.
            var indexID = indexDef.getIndexFieldID();

            // Get the index value object for the given document and index ID.
            var indexValue = getIndexValue(doc, indexID);

            // Set the commit value for the index field.
            indexValue.setFieldValue(commitValue);
        }
    }

    // Search the workspace to find the index definition by name
    function findIndexDefinitionByName(workspace, indexName) {
        var indexDefs = workspace.getIndexDefinitions();
        var size = indexDefs.size();
        var foundIndexDef = null;
        for (var i = 0; i < size; i++) {
            var indexDef = indexDefs.get(i);
            if (indexName.equals(indexDef.getFieldName())) {
                // An index by this name was found.
                foundIndexDef = indexDef;
                break;
            }
        }
        return foundIndexDef;
    }

    // Search the index values of the document for an IndexValue object with the given ID.
    // If one is found, return it; otherwise, create one and return it.
    function getIndexValue(doc, indexDefID) {
        // Look through all existing document indexes to see if our index is present.
        var indexes = doc.getIndexes();
        var size = indexes.size();
        var foundIndexValue = null;
        for (var i = 0; i < size; i++) {
            var indexValue = indexes.get(i);
            if (indexDefID.equals(indexValue.getFieldID())) {
                // An index by this ID was found.
                foundIndexValue = indexValue;
                break;
            }
        }

        if (foundIndexValue == null) {
            // The index value wasn't found, so create one with blank values.
            foundIndexValue = new Packages.oracle.odc.data.IndexValue(indexDefID, "", "");

            // Add it to the document's index collection.
            indexes.add(foundIndexValue);
        }

        // Return the IndexValue object.
        return foundIndexValue;
    }