Stages of Essbase Data Loads
You can improve the performance of data loads to Essbase block storage (BSO) cubes when you understand the stages of the data load process and learn how to optimize and parallelize the tasks.
This section does not apply to aggregate storage cubes.
Loading a large source of data into an Essbase cube can take a long time. You can shorten the process by minimizing the time Essbase spends reading and parsing the source data, and the time it spends reading from and writing to the cube.
Essbase loads data block by block. For each unique combination of sparse dimension members, one data block contains the data for all the dense dimension combinations, assuming that at least one cell contains data. For faster access to block locations, Essbase uses an index. Each entry in the index corresponds to one data block. See Sparse and Dense Dimensions, Selection of Dense and Sparse Dimensions, and Dense and Sparse Selection Scenarios.
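The relationship between sparse combinations, blocks, and the index can be pictured with a small model. The following is only a conceptual sketch, not Essbase's actual storage format: the dimension names, member names, and the `store` and `dense_offset` helpers are all hypothetical, and a plain dictionary stands in for the index.

```python
# Conceptual model only: a dict plays the role of the index, and a list of
# fixed-size lists plays the role of the data blocks.
# Hypothetical outline: Product and Market are sparse; Measures and Year are dense.
DENSE_DIMS = {"Measures": ["Sales", "COGS"], "Year": ["Jan", "Feb", "Mar"]}
BLOCK_SIZE = len(DENSE_DIMS["Measures"]) * len(DENSE_DIMS["Year"])

index = {}   # (product, market) sparse combination -> block number
blocks = []  # block number -> cells for every dense combination


def dense_offset(measure, period):
    """Return the position of a dense combination inside a block."""
    m = DENSE_DIMS["Measures"].index(measure)
    p = DENSE_DIMS["Year"].index(period)
    return m * len(DENSE_DIMS["Year"]) + p


def store(product, market, measure, period, value):
    """Put one value into the block for its sparse combination."""
    key = (product, market)
    if key not in index:
        # A block is created only when the first cell for it arrives.
        index[key] = len(blocks)
        blocks.append([None] * BLOCK_SIZE)
    blocks[index[key]][dense_offset(measure, period)] = value


store("Cola", "East", "Sales", "Jan", 100.0)
store("Cola", "East", "COGS", "Feb", 40.0)
store("Cola", "West", "Sales", "Mar", 75.0)
print(len(blocks))  # two sparse combinations were loaded, so two blocks exist
```

In this model, the two Cola/East values land in the same block, while Cola/West gets a block of its own, each reached through one index entry.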
Essbase processes the data load in a pipeline of five stages.
For a free-form data load, the stages are:
- Input: Essbase collects input from a file or SQL connection
- Tokenize: Essbase separates input fields from records, creating tokens
- Convert: Essbase converts tokens into member items
- Preparation: Essbase arranges the data in preparation for putting it into blocks
- Write: Essbase puts the data into blocks in memory and then writes the blocks to disk, finding the correct block on the disk by using the index, which is composed of pointers based on sparse intersections
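The five free-form stages can be sketched as a chain of small functions. This is an illustration of the flow under simplifying assumptions, not how Essbase is implemented: the member names, the `MEMBERS` lookup table, the record layout (member names followed by one value), and every function name are hypothetical, and the "cube" is an in-memory dictionary rather than blocks on disk.

```python
import shlex
from collections import defaultdict

# Hypothetical outline: member name -> dimension it belongs to.
MEMBERS = {
    "Cola": "Product", "Root Beer": "Product",
    "East": "Market", "West": "Market",
    "Sales": "Measures", "COGS": "Measures",
    "Jan": "Year", "Feb": "Year",
}
SPARSE = ("Product", "Market")
DENSE = ("Measures", "Year")


def read_input(text):
    """Input: collect non-empty records from the source."""
    return (line for line in text.splitlines() if line.strip())


def tokenize(records):
    """Tokenize: split each record into fields, honoring quoted names."""
    return (shlex.split(record) for record in records)


def convert(token_rows):
    """Convert: map member-name tokens to dimensions; the last token is the value."""
    for tokens in token_rows:
        *names, value = tokens
        yield {MEMBERS[name]: name for name in names}, float(value)


def prepare(items):
    """Preparation: group cells by the sparse combination that owns them."""
    grouped = defaultdict(dict)
    for members, value in items:
        block_key = tuple(members[d] for d in SPARSE)
        cell_key = tuple(members[d] for d in DENSE)
        grouped[block_key][cell_key] = value
    return grouped


def write(grouped, cube):
    """Write: merge each prepared group into its block, creating it if needed."""
    for block_key, cells in grouped.items():
        cube.setdefault(block_key, {}).update(cells)
    return cube


SOURCE = '''
Cola East Sales Jan 100
Cola East COGS Jan 40
"Root Beer" West Sales Feb 75
'''
cube = write(prepare(convert(tokenize(read_input(SOURCE)))), {})
print(sorted(cube))  # the sparse combinations that received a block
```

Grouping in the preparation step means each block is touched once in the write step, which is the intuition behind arranging data before putting it into blocks.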
For a rules-file-based data load, the stages are:
- Input: Essbase collects input from a file or SQL connection
- Pre-Rule: Essbase reads data load records
- Rule: Essbase applies the rules embedded in the rules file to the data load records
- Preparation: Essbase arranges the data in preparation for putting it into blocks
- Write: Essbase puts the data into blocks in memory and then writes the blocks to disk, finding the correct block on the disk by using the index, which is composed of pointers based on sparse intersections
Note:
On aggregate storage databases, the fifth stage (Write) does not apply.
This process is repeated until all data is loaded. By using one or more processing threads in each stage, Essbase can perform some processes in parallel. See Parallel Data Load.
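The idea of running each stage on one or more threads can be sketched with queues connecting the stages. This is a generic producer-consumer pattern offered as an analogy, not Essbase's internal design: `run_stage`, `run_pipeline`, the `DONE` sentinel, and the thread counts are all hypothetical, and the two toy stages stand in for real parsing work.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the record stream


def run_stage(func, inbox, outbox, workers):
    """Start `workers` threads that apply `func` to every item from `inbox`."""
    def worker():
        while True:
            item = inbox.get()
            if item is DONE:
                inbox.put(DONE)  # leave the sentinel for sibling workers
                break
            outbox.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(records, stages):
    """Run `stages`, a list of (function, thread_count) pairs, as a pipeline."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    groups = [run_stage(func, queues[i], queues[i + 1], count)
              for i, (func, count) in enumerate(stages)]
    for record in records:
        queues[0].put(record)
    queues[0].put(DONE)
    for i, threads in enumerate(groups):
        for t in threads:
            t.join()
        queues[i + 1].put(DONE)  # this stage has finished; signal the next one
    results = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        results.append(item)
    return results


# Two toy stages, each served by two threads.
stages = [
    (str.split, 2),                           # split a record into fields
    (lambda t: (t[0], float(t[1])), 2),       # turn fields into (member, value)
]
results = run_pipeline(["Sales 100", "COGS 40", "Margin 60"], stages)
print(sorted(results))  # output order across threads is not guaranteed, so sort
```

Because later stages start consuming as soon as earlier stages produce output, work on different records overlaps instead of each record waiting for the whole source to be read.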
Examples in this chapter assume that you are familiar with the Sources of Data topic.