Using the Product Catalog Database Loader
WebLogic Commerce Server provides a DBLoader program that you can use to bulk load data into the product catalog database. While you can use a WebLogic Commerce Server administration screen to add new item or category data one record at a time, that approach is impractical when you need to load hundreds or thousands of records. The DBLoader program is also useful if you want to load legacy data from an existing database into the WebLogic Commerce Server database.
You can also use a database vendor's loader program, such as Oracle SQL*Loader, or a third-party data loader to populate the product catalog database.
This topic includes the following sections:

The Input File for DBLoader
The dbloader.properties File
Running the DBLoader Program
DBLoader Log Files
DBLoader Validations
Important Database Considerations
Using Database-Specific Data Loaders
Using Third-Party Data Loaders
The Input File for DBLoader
The WebLogic Commerce Server DBLoader program loads data that you provide in a text file into the product catalog database. Data is loaded one table at a time; create a separate input file for each table that you want to update.
The input data file is, by default, a comma-separated value (CSV) text file. The input file has the following structure:
First Row
The header of the file must identify the number of data records in the file and the name of the database table to be loaded.
For example, the header line might contain:
130,WLCS_PRODUCT
Second Row
The second row identifies the table field (column) names into which you are loading data. You must include the primary key field or fields in the input file, and preface each primary key field name with an asterisk (*). Apart from the primary key fields, all other fields default to NULL; thus, you may omit field names for which NULL is an acceptable value and list only those that receive non-NULL values.
For example, the second line of the input data file might contain:
*SKU,NAME,IN_STOCK,EST_SHIP_TIME,SPECIAL_NOTES,CREATION_DATE
Third Row
The third row specifies the data type of each field being loaded. See The Product Catalog Database Schema for information about the product catalog schema and the data types used.
For example, the third line of the input data file might contain:
VARCHAR,VARCHAR,CHAR,VARCHAR,VARCHAR,DATE
Note: On the data type line of the input file, it is not necessary to include the length of the data type, such as VARCHAR(20) or VARCHAR2(20). Simply use VARCHAR for strings, NUMBER instead of (for example) NUMBER(16,4), and DOUBLE instead of DOUBLE PRECISION.
Fourth Through N Rows
All subsequent lines in the input data file contain the data values. The following is an example of a simple input file:
3,WLCS_PRODUCT
*SKU,NAME,IN_STOCK,EST_SHIP_TIME,SPECIAL_NOTES,CREATION_DATE
VARCHAR,VARCHAR,CHAR,VARCHAR,VARCHAR,DATE
P123,CoolKid,N,Out of stock until further notice,Special order only,02-Oct-2000
P124,FastKid,Y,One week,No special order,02-Oct-2000
P125,RadSneakers,Y,,regular stock,02-Oct-2000
Note: DATE field values must be entered in the format DD-MMM-YYYY. A DATE field cannot be an empty string; its value must be either NULL or a valid date.
Empty input strings from the data file are inserted into the database as empty strings. You must account for each unspecified field in the input record by including the delimiter character (by default, a comma) in the correct position, matching the position of the fields you listed in line 2, the field names. For example:
P125,RadSneakers,Y,,regular stock,02-Oct-2000
In the previous example, a value for the fourth identified field (EST_SHIP_TIME) was not specified. This is acceptable because the field is not part of the record's primary key; the field's value is stored as an empty string.
Note: If your intention is to store a null value in the database for a non-primary-key field, enter NULL in the correct position for that field in the record. Do not enclose NULL in quotes; enclosing the word NULL in quotes causes the field to be stored as a literal string.
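For example, the following variation of an earlier sample record stores NULL, rather than an empty string, in the EST_SHIP_TIME field:

P125,RadSneakers,Y,NULL,regular stock,02-Oct-2000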
The dbloader.properties File
The WebLogic Commerce Server DBLoader program uses a properties file named dbloader.properties to determine which JDBC driver, database connection, and login to use.
This file resides in the WL_COMMERCE_HOME directory. WL_COMMERCE_HOME is the directory where you installed WebLogic Commerce Server.
Comment lines are prefixed with the # character. Both comment lines and blank lines are allowed.
The following describes the values you can set in this properties file.

jdbcdriver: The class name of the JDBC driver used to connect to the database.
connection: The JDBC connection URL of the target database.
dblogin: The database login name, or none if the database does not require one.
dbpassword: The password for the database login, or none.
delimiter: The character that separates field values in the input file. The default is a comma.
timestamptable: The name of the catalog table that contains the modification timestamp field.
timestampfield: The name of the timestamp field in that table.
encoding: The character encoding of the input data file, such as UTF8.
commitTxn: The number of records to process in each commit transaction.
Listing 3-1 shows a sample dbloader.properties file.
Listing 3-1 Sample dbloader.properties File
jdbcdriver=COM.cloudscape.core.JDBCDriver
connection=jdbc:cloudscape:Commerce
dblogin=none
dbpassword=none
delimiter=,
timestamptable=wlcs_category
timestampfield=modified_date
encoding=UTF8
commitTxn=50
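For example, if you were loading the catalog tables in an Oracle database rather than Cloudscape, the file might instead look like the following sketch; the host, port, SID, login, and password shown here are placeholder values, not settings from an actual installation:

jdbcdriver=oracle.jdbc.driver.OracleDriver
connection=jdbc:oracle:thin:@myhost:1521:mySID
dblogin=wlcs
dbpassword=wlcs
delimiter=,
timestamptable=wlcs_category
timestampfield=modified_date
encoding=UTF8
commitTxn=50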
Running the DBLoader Program
You use the loaddata script to run the DBLoader program.
The script resides in a platform-specific directory under the WL_COMMERCE_HOME directory; on UNIX systems, the script is loaddata.sh.
The loaddata script sets up the required environment, using the set-environment script, and then invokes the DBLoader program with the options you supply.
Before you can run the loaddata script, make sure that the set-environment script specifies the same database as the dbloader.properties file. The set-environment script resides in the same directory as the loaddata script.
For example, if the dbloader.properties file specifies the connection jdbc:cloudscape:Commerce, then the set-environment script should contain SET DATABASE=CLOUDSCAPE.
DBLoader runs independently of WebLogic Commerce Server, so you do not need to stop the server to run the loader. However, if you are running WebLogic Commerce Server with a Cloudscape database, note that Cloudscape does not allow more than one connection at a time; in that case, you must stop the server before running DBLoader.
If you are running WebLogic Commerce Server with Oracle, the only drawback might be slower performance while the data is being loaded into the database.
Note: You might want to back up the particular tables that you are about to update before running DBLoader. The DBLoader program does not keep history records in the database.
To Run the Program
The command to run the program has the following format:
prompt> loaddata { -insert | -update | -delete } input-file.csv
On UNIX systems, the loaddata.sh file must have execute permission. A typical way to set this is with the command:
$ chmod +x loaddata.sh
You must select one of the three possible operations: -insert, -update, or -delete.
For example:
prompt> loaddata -update category.csv
In the previous example, the DBLoader program will update rows in the product catalog database that match the primary keys specified in the category.csv input file.
To insert, update, or delete data in several tables, run the loaddata script separately for each table, providing the corresponding input file name as a parameter. The order in which you load tables must respect the same data-integrity rules as any other SQL operations: for example, insert rows into a parent table with the primary key constraint before inserting rows into a child table with the foreign key constraint.
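For example, assuming input files named after their target tables, and a child mapping table such as WLCS_PRODUCT_CATEGORY with foreign keys into the category and product tables, the inserts might be ordered as follows:

prompt> loaddata -insert wlcs_category.csv
prompt> loaddata -insert wlcs_product.csv
prompt> loaddata -insert wlcs_product_category.csv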
DBLoader Log Files
The WebLogic Commerce Server DBLoader creates two audit trail logs: dbloader.log and dbloader.err.
If these files do not already exist, they are created. Otherwise, the existing audit trails are overwritten by each DBLoader operation. Both files reside in the same directory where you run the loaddata script.
The dbloader.log file records summary information about the load operation.
If any errors occurred during the attempted database load operation, the dbloader.err file captures information about each error.
DBLoader Validations
The DBLoader program checks the number of fields affected by the load (as specified in the second line of the input data file) against the number of input fields in each record. Because the field delimiter is a comma (by default), this character is not allowed in a string input field. If extra commas are supplied inadvertently, such as punctuation in a LONG_DESC (Long Description) field, an error results and is noted in the dbloader.err file. To avoid this type of error, carefully check the number of commas you use to separate the input data field values, or select a different delimiter character and specify it in the dbloader.properties file. For more information, see the section The dbloader.properties File.
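For example, to use a vertical bar rather than a comma as the delimiter, you would set the following line in dbloader.properties (and separate the field values in your input data file with the same character):

delimiter=|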
All errors and exceptions are displayed in the console where the DBLoader program is running. Records that contain errors are skipped, and processing continues until the end of the file. (The program does not roll back the transaction if an error has occurred.)
Important Database Considerations
This section describes some important database considerations that you should keep in mind while using the DBLoader program.
Some databases support cascading deletes, in which deleting a record from a parent table automatically deletes the matching records from child tables. Because Cloudscape does not provide this functionality, run the loaddata script to delete records from child tables before deleting records from parent tables.
For related information, please see The Product Catalog Database Schema.
Another consideration is the character encoding of your input data file. You can identify the default file encoding by checking the System property named file.encoding, as follows:
System.out.println(System.getProperty("file.encoding"));
If the file.encoding property differs from the encoding of the text data you want to process, then you must perform the conversion yourself.
Currently, the Java 2 SDK 1.2.2 can convert several file encodings to Unicode 2.0. For details, please see http://java.sun.com/products/jdk/1.2/docs/guide/internat/encoding.doc.html.
What this means to your development group: normally, you enter or extract data using the default encoding of your operating system, so this kind of conversion is rarely necessary. Only when the encoding of your data file differs from the system's default encoding must you convert the data yourself.
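If you do need to perform such a conversion before running DBLoader, a small Java program can do it. The following is a minimal sketch, assuming the input file is encoded in ISO-8859-1 and using the placeholder file names legacy.csv and converted.csv; it rewrites the data in the platform's default encoding:

import java.io.*;

public class ConvertEncoding {
    public static void main(String[] args) throws IOException {
        // Read the input file using its actual encoding (ISO-8859-1 is
        // an assumed example; substitute the encoding of your data).
        BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream("legacy.csv"), "ISO-8859-1"));
        // Write the output file using the platform's default encoding.
        PrintWriter out = new PrintWriter(new OutputStreamWriter(
                new FileOutputStream("converted.csv")));
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line);
        }
        in.close();
        out.close();
    }
}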
Using Database-Specific Data Loaders
Most database management systems provide a data loader utility. In the case of Oracle, the data load utility is known as SQL*Loader. This section summarizes the capabilities of SQL*Loader. For details about SQL*Loader, see the Oracle 8i Utilities Guide. An online copy is available at http://technet.oracle.com/.
The examples used in this section are based on a simple ASCII file containing a few comma-separated values (SKU, IN_STOCK, VISIBLE, NAME) taken from a sample WLCS_PRODUCT table, which is described in The Product Catalog Database Schema.
Note: The example shown in this section does not use all of the columns from WLCS_PRODUCT. Your actual comma-separated values (CSV) file may contain more columns than we show here. We are merely attempting to show how to conduct the import operation once you have the CSV file ready.
The name we will use for our sample data file is sample.csv. Based on the columns mentioned earlier (SKU, IN_STOCK, VISIBLE, NAME), the contents of that file might look like the following:
"0,3,4",0,0,"Growing Herbs from Seed, Cutting, and Root"
"0,2,1",0,0,"The Perfect Storm: A True Story of Men Against the Sea"
"0,2,2",0,0,"The Worst-Case Scenario Survival Handbook"
"0,4,0",0,0,"Acute Asthma: Assessment and Management"
"0,4,1",0,0,"Communications Technology Explained"
"0,4,2",0,0,"Modern Plastics Handbook"
Once your data file is ready to populate the WLCS_PRODUCT table, you must create a control file for SQL*Loader. The control file identifies the data file to be read, how the pieces of information are delimited in the file, and the actual column locations of the destination table. We will name our control file sample.ctl.
LOAD DATA
INFILE 'sample.csv'
APPEND INTO TABLE WLCS_PRODUCT
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
(SKU, IN_STOCK, VISIBLE, NAME)
At a system prompt, invoke SQL*Loader with a command such as:
sqlldr userid=bea_systems/bea_systems control=sample.ctl log=sample.log
To review the results of your data load, see sample.log.
Using Third-Party Data Loaders
There are a variety of data loaders available on the market today to assist in the extraction and loading of information. Please be sure to research the use of these tools to ensure success within your environment.
The Data Movement module of DataStage provides a comprehensive data extraction, transformation, and loading toolset designed for building Operational Data Stores (ODS), data marts and enterprise data warehouses. For more information, please see http://www.informix.com.
PowerConnect is Informatica's family of packaged software products that helps customers easily extract data and metadata from hard-to-access ERP and other legacy applications. This data is then delivered to PowerCenter, Informatica's data integration software hub, which provides robust capabilities for transforming the data and delivering it to downstream data warehouses, data marts, and analytic applications. For more information, please see http://www.informatica.com.
Data Junction is a visual design tool for rapidly integrating and transforming data between hundreds of applications and structured data formats. For more information, please see http://www.datajunction.com.
ETI·EXTRACT moves and integrates data across the value chain and multiple business processes. The product automates the writing of programs that retrieve the data needed from any system, transform it and load it into any other system while capturing a complete history of that process. For more information, please see http://www.eti.com.
Copyright © 2001 BEA Systems, Inc. All rights reserved.