10.3.7.3 Simulations Use Case
You can use the ore.indexApply function in simulations, which can take advantage of high-performance computing hardware such as an Oracle Exadata Database Machine.
Example 10-14 Using the ore.indexApply Function in a Simulation
This example takes multiple samples from a random normal distribution to compare the distribution of the summary statistics. Each simulation occurs in a separate R engine in the database, in parallel, up to the degree of parallelism allowed by the database. The example defines variables for the sample size, the mean and standard deviation of the random numbers, and the number of simulations to perform. The example specifies num.simulations as the first argument to the ore.indexApply function. The ore.indexApply function invokes the user-defined function num.simulations times, passing each value from 1 through num.simulations in turn as the index argument. The input function sets the random seed based on the index so that each invocation generates a different set of random numbers.
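The index-based seeding can be tried locally before running anything in the database. The following is a minimal sketch in base R that uses lapply as a serial stand-in for the indexed invocation pattern; it is not how ore.indexApply is implemented, and the draw function and the small sample size are hypothetical names and values chosen for illustration.

```r
# Local, serial stand-in for the indexed invocation pattern.
# lapply replaces ore.indexApply here; no database connection is used.
num.simulations <- 3

# Hypothetical helper: seeds the generator from the index, then draws values.
draw <- function(index, sample.size = 5) {
  set.seed(index)   # seed depends on the index, so each invocation differs
  rnorm(sample.size)
}

samples <- lapply(seq_len(num.simulations), draw)

# Different indexes give different draws ...
identical(samples[[1]], samples[[2]])   # FALSE
# ... while repeating an index reproduces its draw.
identical(samples[[1]], draw(1))        # TRUE
```

Because the seed is a function of the index alone, each simulation is reproducible regardless of which R engine runs it or in what order the invocations finish.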
The input function next uses the rnorm function to produce sample.size random normal values. It calls the summary function on the vector of random numbers, and then prepares a one-row data.frame as the result it returns. The call to ore.indexApply specifies the FUN.VALUE argument so that the function returns an ore.frame that structures the combined results of the simulations. The res variable holds the ore.frame returned by the ore.indexApply function.
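The shape of the per-simulation result and of the FUN.VALUE template can be inspected in plain R. The sketch below runs locally and assumes nothing about the database; the variable names stats and template are illustrative. It also shows why the output columns appear as X1st.Qu. and X3rd.Qu.: data.frame converts non-syntactic names by default.

```r
# Build the one-row result for a single simulation, as the input function does.
set.seed(1)
x <- rnorm(1000, mean = 100, sd = 10)
ss <- summary(x)                                # six named summary statistics
stats <- data.frame(matrix(ss, 1, length(ss)))  # reshape into one row
names(stats) <- names(ss)                       # "Min." "1st Qu." ... "Max."
stats$Index <- 1
dim(stats)                                      # 1 row, 7 columns

# The same template passed as FUN.VALUE in the example. data.frame() applies
# make.names() by default, so "1st Qu." becomes "X1st.Qu.".
template <- data.frame(Min. = numeric(0),
                       "1st Qu." = numeric(0),
                       Median = numeric(0),
                       Mean = numeric(0),
                       "3rd Qu." = numeric(0),
                       Max. = numeric(0),
                       Index = numeric(0))
names(template)
```

The template has zero rows; it describes only the column names and types of the combined result.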
To show the distribution of the summary statistics across the simulations, the example calls the boxplot function on the data.frame that the ore.pull function returns when it brings the first six columns of res to the client.
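The plotting step operates on an ordinary data.frame once the data is on the client, so it can be rehearsed without a database. The following sketch builds a hypothetical local.res data.frame with the same six statistic columns and computes the box statistics; local.res stands in for the result of ore.pull and is an assumption made for illustration.

```r
# Hypothetical local stand-in for ore.pull(res[, 1:6]): ten simulations,
# one row of summary statistics each.
local.res <- do.call(rbind, lapply(1:10, function(index) {
  set.seed(index)
  ss <- summary(rnorm(1000, mean = 100, sd = 10))
  stats <- data.frame(matrix(ss, 1, length(ss)))
  names(stats) <- names(ss)
  stats
}))

# boxplot treats each column of a data.frame as one group.
# plot = FALSE returns the box statistics without drawing.
bp <- boxplot(local.res, plot = FALSE)
bp$names        # one group per summary statistic
dim(bp$stats)   # five box statistics for each of the six groups
```

Dropping plot = FALSE draws one box per statistic, which is the form of the figure the example produces.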
%r
options("ore.warn.order" = FALSE)

# Simulation parameters
sample.size = 1000
mean.val = 100
std.dev.val = 10
num.simulations = 10

res <- ore.indexApply(num.simulations,
         function(index, sample.size = 1000, mean = 0, std.dev = 1) {
           set.seed(index)                        # a different seed for each invocation
           x <- rnorm(sample.size, mean, std.dev)
           ss <- summary(x)
           attr.names <- attr(ss, "names")
           # Reshape the six summary statistics into a one-row data.frame
           stats <- data.frame(matrix(ss, 1, length(ss)))
           names(stats) <- attr.names
           stats$index <- index
           stats
         },
         # Template that describes the columns of the combined result
         FUN.VALUE = data.frame(Min. = numeric(0),
                                "1st Qu." = numeric(0),
                                Median = numeric(0),
                                Mean = numeric(0),
                                "3rd Qu." = numeric(0),
                                Max. = numeric(0),
                                Index = numeric(0)),
         parallel = TRUE,
         # Extra arguments passed through to the input function
         sample.size = sample.size,
         mean = mean.val, std.dev = std.dev.val)

head(res, 3)
tail(res, 3)

# Pull the six statistic columns to the client and plot them
boxplot(ore.pull(res[, 1:6]),
        main = sprintf("Boxplot of %d rnorm samples size %d, mean=%d, sd=%d",
                       num.simulations, sample.size, mean.val, std.dev.val))
The output is similar to the following:
Table 10-10 A data.frame: 3 x 7
Min. | X1st.Qu. | Median | Mean | X3rd.Qu. | Max. | Index |
---|---|---|---|---|---|---|
<dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
69.91951 | 93.02627 | 99.64676 | 99.88352 | 106.8843 | 138.1028 | 1 |
72.78184 | 93.68699 | 100.50135 | 100.61999 | 107.7106 | 130.0882 | 2 |
69.43672 | 93.15461 | 100.32338 | 100.06397 | 106.7667 | 135.1930 | 3 |
Table 10-11 A data.frame: 3 x 7
| | Min. | X1st.Qu. | Median | Mean | X3rd.Qu. | Max. | Index |
|---|---|---|---|---|---|---|---|
| | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 8 | 67.18068 | 92.73174 | 99.71516 | 99.58738 | 106.6340 | 129.7804 | 8 |
| 9 | 69.58926 | 93.51445 | 100.31074 | 100.05885 | 106.6231 | 127.6253 | 9 |
| 10 | 69.87836 | 93.22607 | 99.96999 | 100.11375 | 107.2746 | 135.4114 | 10 |
Listing for This Example
R> res <- ore.indexApply(num.simulations,
+           function(index, sample.size = 1000, mean = 0, std.dev = 1) {
+             set.seed(index)
+             x <- rnorm(sample.size, mean, std.dev)
+             ss <- summary(x)
+             attr.names <- attr(ss, "names")
+             stats <- data.frame(matrix(ss, 1, length(ss)))
+             names(stats) <- attr.names
+             stats$index <- index
+             stats
+           },
+           FUN.VALUE = data.frame(Min. = numeric(0),
+                                  "1st Qu." = numeric(0),
+                                  Median = numeric(0),
+                                  Mean = numeric(0),
+                                  "3rd Qu." = numeric(0),
+                                  Max. = numeric(0),
+                                  Index = numeric(0)),
+           parallel = TRUE,
+           sample.size = sample.size,
+           mean = mean.val, std.dev = std.dev.val)
R> options("ore.warn.order" = FALSE)
R> head(res, 3)
   Min. X1st.Qu. Median   Mean X3rd.Qu.  Max. Index
1 67.56    93.11  99.42  99.30    105.8 128.0   847
2 67.73    94.19  99.86 100.10    106.3 130.7   258
3 65.58    93.15  99.78  99.82    106.2 134.3   264
R> tail(res, 3)
   Min. X1st.Qu. Median   Mean X3rd.Qu.  Max. Index
1 65.02    93.44  100.2 100.20    106.9 134.0     5
2 71.60    93.34   99.6  99.66    106.4 131.7     4
3 69.44    93.15  100.3 100.10    106.8 135.2     3
R> boxplot(ore.pull(res[, 1:6]),
+          main = sprintf("Boxplot of %d rnorm samples size %d, mean=%d, sd=%d",
+                         num.simulations, sample.size, mean.val, std.dev.val))
Figure 10-2 Display of the boxplot Function in Example 10-14
