9.3.5.1 Partition on a Single Column
This example uses the ore.groupApply function and partitions the data on a single column.
Example 9-9 Using the ore.groupApply Function
Create a user-defined function that builds and returns a model using R’s lm() function.
%r
buildLM.group <- function(dat) {
  mod <- lm(Petal.Length ~ Petal.Width, dat)
  return(mod)
}
# Run the user-defined function on the local iris data.frame
res1 <- buildLM.group(iris)
res1
# Push the local iris data.frame to the database as a temporary table and create the ore.frame proxy object IRIS for it.
IRIS <- ore.push(iris)
# Use ore.groupApply to build one model for each of the three categories in the Species variable. The function is passed to the FUN argument.
# The parallel argument specifies the desired number of parallel R engines. Three models are built and returned.
res2 <- ore.groupApply(IRIS[, c("Petal.Length", "Petal.Width", "Species")],
                       INDEX = IRIS$Species,
                       buildLM.group,
                       parallel = 3)
res2
# Save the user-defined function to the R script repository with the same name. Overwrite any script with the same name if it exists.
# Run the function stored in the script repository using ore.groupApply. The script name is passed to the FUN.NAME argument.
ore.scriptCreate(name = 'buildLM.group',
                 FUN = buildLM.group,
                 overwrite = TRUE)
res3 <- ore.groupApply(IRIS[, c("Petal.Length", "Petal.Width", "Species")],
                       INDEX = IRIS$Species,
                       FUN.NAME = 'buildLM.group',
                       parallel = 3)
res3
The output is similar to the following:
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
1.084 2.230
Warning message:
“Parallelism exceeds the DOP limit 2 (reverting to parallel=2)”
$setosa
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
1.3276 0.5465
$versicolor
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
1.781 1.869
$virginica
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
4.2407 0.6473
Warning message:
“Parallelism exceeds the DOP limit 2 (reverting to parallel=2)”
$setosa
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
1.3276 0.5465
$versicolor
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
1.781 1.869
$virginica
Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)
Coefficients:
(Intercept) Petal.Width
4.2407 0.6473
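As a local cross-check that needs no database connection, the following sketch reproduces the single-column partitioning with base R's split() and lapply(). It is not how ore.groupApply runs inside the database; it only illustrates the same group-wise logic on the local iris data.frame. The names local_models and coef_table are introduced here for illustration and are not part of the example above.

```r
# Same user-defined function as in the example.
buildLM.group <- function(dat) {
  mod <- lm(Petal.Length ~ Petal.Width, dat)
  return(mod)
}

# Partition the local iris data.frame on the Species column,
# then fit one model per partition.
local_models <- lapply(split(iris, iris$Species), buildLM.group)

# Collect the coefficients into a matrix with one row per species.
coef_table <- t(sapply(local_models, coef))
print(round(coef_table, 4))
```

The per-species coefficients should agree with those shown for res2 and res3, which is a quick way to confirm that the in-database run partitioned the data as intended.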