10.3.5.1 Partition on a Single Column

This example uses the ore.groupApply function to partition the data on a single column and build one model for each partition.

Example 10-9 Using the ore.groupApply Function

Create a user-defined function that builds and returns a model using R’s lm() function.

%r

buildLM.group <- function(dat){
  mod <- lm(Petal.Length~Petal.Width, dat)
  return(mod)
}

# Run the user-defined function on the local iris data.frame

res1 <- buildLM.group(iris)
res1

# Create a temporary R data.frame proxy object IRIS for the local iris data.frame.

IRIS <- ore.push(iris)

# Use ore.groupApply to build one model for each of the three categories in the Species variable.
# The INDEX argument names the partitioning column, and the parallel argument sets the desired number of parallel R engines.
# Three models are built and returned.

res2 <- ore.groupApply(IRIS[,c("Petal.Length","Petal.Width","Species")], 
                      INDEX = IRIS$Species, 
                      buildLM.group,
                      parallel = 3)
res2

# Save the user-defined function to the R script repository under the same name, overwriting any script with that name if it exists.
# Then run the function stored in the script repository using ore.groupApply. The script name is passed to the FUN.NAME argument.

ore.scriptCreate(name = 'buildLM.group', 
                 FUN  =  buildLM.group,     
                 overwrite = TRUE)


res3 <- ore.groupApply(IRIS[,c("Petal.Length","Petal.Width","Species")], 
                      INDEX = IRIS$Species, 
                      FUN.NAME = 'buildLM.group',
                      parallel = 3)
res3

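The output is similar to the following. The first model is res1, built on the full local data set. The two lists that follow are res2 and res3, each holding one model per species. In this environment, each ore.groupApply call also raises a warning because the requested parallel value of 3 exceeds the degree of parallelism (DOP) limit of 2.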

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
      1.084        2.230  
Warning message:
“Parallelism exceeds the DOP limit 2 (reverting to parallel=2)”
$setosa

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
     1.3276       0.5465  


$versicolor

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
      1.781        1.869  


$virginica

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
     4.2407       0.6473  

Warning message:
“Parallelism exceeds the DOP limit 2 (reverting to parallel=2)”
$setosa

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
     1.3276       0.5465  


$versicolor

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
      1.781        1.869  


$virginica

Call:
lm(formula = Petal.Length ~ Petal.Width, data = dat)

Coefficients:
(Intercept)  Petal.Width  
     4.2407       0.6473
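
To see what partitioning on a single column does without a database connection, the grouped build can be approximated in open-source R. The following is a minimal sketch, assuming that split() and lapply() are an acceptable stand-in for the partitioning step; it is not how ore.groupApply is implemented, it runs in the local R session rather than in database-spawned R engines, and the names parts, local.models, and coefs are hypothetical.

```r
# Local, database-free approximation of the grouped model build above.
# Assumption: split() + lapply() stand in for the partitioning that
# ore.groupApply performs; no parallel R engines are involved here.
buildLM.group <- function(dat) {
  lm(Petal.Length ~ Petal.Width, dat)
}

# Partition the built-in iris data.frame on the single column Species.
parts <- split(iris[, c("Petal.Length", "Petal.Width", "Species")],
               iris$Species)

# Fit one model per partition; the result is a list named by species.
local.models <- lapply(parts, buildLM.group)

# Collect the coefficients into a matrix with one column per species.
coefs <- sapply(local.models, coef)
print(round(coefs, 4))
```

The coefficients in each column should agree with the per-species values shown in the output above, which is a quick way to check a user-defined function locally before running it with ore.groupApply.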