10.3.6 Use the ore.rowApply Function

The ore.rowApply function calls an R script with an ore.frame as the input data.

The ore.rowApply function passes the data from the ore.frame, one chunk of rows at a time, to the user-defined function as the first argument to that function. The rows argument to the ore.rowApply function specifies the number of rows to pass to each invocation of the user-defined R function. The last chunk of rows may have fewer rows than the number specified. The ore.rowApply function can use data-parallel execution, in which one or more R engines perform the same R function, or task, on different partitions of data.
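The chunking behavior described above can be pictured with a small base-R sketch that runs entirely in a local session. It is only an emulation under stated assumptions: chunk_apply is a hypothetical helper name, it does not call OML4R, and it says nothing about how the database actually partitions or schedules the chunks.

```r
# Local, base-R emulation of row chunking (not the OML4R implementation).
# chunk_apply is a hypothetical helper used only for illustration.
chunk_apply <- function(X, FUN, ..., rows = 1) {
  # Assign each row to a chunk: rows 1..rows go to chunk 1, and so on.
  chunk_id <- ceiling(seq_len(nrow(X)) / rows)
  # Call FUN once per chunk, passing the chunk as the first argument.
  lapply(split(X, chunk_id), FUN, ...)
}

# Count the rows that each invocation receives when rows = 40.
# iris has 150 rows, so the last chunk is smaller than the others.
sizes <- chunk_apply(iris, function(dat) nrow(dat), rows = 40)
print(unlist(sizes))
```

With 150 input rows and rows = 40, the function is invoked four times, and the final invocation receives the remaining 30 rows.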

The syntax of the ore.rowApply function is the following:

ore.rowApply(X, FUN, ..., FUN.VALUE = NULL, FUN.NAME = NULL, rows = 1, 
             FUN.OWNER = NULL, parallel = getOption("ore.parallel", NULL))

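The ore.rowApply function returns an ore.list object or an ore.frame object. If FUN.VALUE is NULL, the result is an ore.list with one element for each invocation of the user-defined function. If FUN.VALUE specifies a data.frame or ore.frame template, the result is an ore.frame with that structure.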
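The difference between a list-shaped result and a single frame-shaped result can be illustrated locally with base R. This is an analogy only: it does not call OML4R, and the template object below is a hypothetical stand-in for the role that a FUN.VALUE argument plays, namely a zero-row data.frame that fixes the column names and types of the combined result.

```r
# Local analogy only; this does not call OML4R.
chunk_id <- ceiling(seq_len(nrow(iris)) / 50)   # three chunks of 50 rows

# Summarize each chunk; the result is a plain list with one element per chunk.
per_chunk <- lapply(split(iris, chunk_id), function(dat) {
  data.frame(n = nrow(dat), mean_petal = mean(dat$Petal.Length))
})

# Hypothetical template in the role of FUN.VALUE: zero rows, but the
# column names and types that every chunk result is expected to have.
template <- data.frame(n = integer(), mean_petal = numeric())

# Stack the per-chunk results into one data.frame and check it against the template.
combined <- do.call(rbind, per_chunk)
rownames(combined) <- NULL
stopifnot(identical(sapply(combined, class), sapply(template, class)))

print(class(per_chunk))  # list
print(class(combined))   # data.frame
```

Declaring the structure up front is what lets the per-chunk results be treated as one table rather than as separate list elements.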

Example 10-11 Using the ore.rowApply Function

This example builds a linear regression model in the client R session, saves it to a datastore, and uses the ore.rowApply function to score the data in the database in chunks of rows. The example does the following:

  • Pushes the iris data set to the database as a temporary table and creates the ore.frame proxy object IRIS.

  • Builds the linear model mod on the local iris data.frame, predicting Petal.Length from Petal.Width, Sepal.Width, and Sepal.Length.

  • Saves the model to the datastore ds-1.

  • Defines the user-defined function scoreLM.1, which does the following:

    • Loads the contents of the datastore named by the dsname argument so that the model is available to the R engine or engines that run in the database.

    • Calls the predict method and adds the predictions to the chunk as the Petal.Length_prediction column.

    • Returns the Petal.Length_prediction, Petal.Length, and Species columns.

  • Saves the user-defined function in the R script repository under the name scoreLM.1.

  • Calls the ore.rowApply function, passing the IRIS ore.frame as the data source for the user-defined R function, the user-defined function scoreLM.1, and the dsname argument for that function.

  • Specifies the number of rows to pass to each invocation of the user-defined function, and the preferred number of parallel R engines.

  • Calls the ore.rowApply function a second time, passing a data.frame as the argument FUN.VALUE, which specifies the structure of the object that the ore.rowApply function returns.

  • Displays the first rows of the first result and the class of the second result.

%r

# Push the iris data.frame to the database as a temporary table; IRIS is the ore.frame proxy object for that table.
IRIS <- ore.push(iris)

# Build a model using a data.frame
mod <- lm(Petal.Length ~ Petal.Width + Sepal.Width + Sepal.Length, data=iris)

# Save the model to the datastore named "ds-1", replacing any existing datastore of that name
ore.save(mod, name = "ds-1", overwrite = TRUE)

# Create a user-defined function that loads the model from the datastore and uses it to score new data.
scoreLM.1 <- function(dat, dsname){
  ore.load(dsname)
  dat$Petal.Length_prediction <- predict(mod, newdata = dat)
  dat[,c("Petal.Length_prediction","Petal.Length","Species")]
}

# Save the user-defined scoring function in the R script repository.
ore.scriptCreate(name = 'scoreLM.1',
                 FUN  = scoreLM.1,
                 overwrite = TRUE)

# Run the scoring function stored in the script repository, and request two parallel R engines with the parallel argument.
# View the first 6 records of the result.

res1 <- ore.rowApply(IRIS,
                     FUN.NAME = "scoreLM.1",
                     dsname = "ds-1",
                     rows = 10,
                     parallel = 2)

head(res1)

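# Run the function again, this time specifying FUN.VALUE so that the result is returned as an ore.frame
# with the given column names and types.

res2 <- ore.rowApply(IRIS,
                     FUN.NAME = "scoreLM.1",
                     dsname = "ds-1",
                     rows = 10,
                     parallel = 2,
                     FUN.VALUE = data.frame(Petal.Length_prediction = numeric(),
                                            Petal.Length = numeric(),
                                            Species = character()))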

class(res2)

The output is similar to the following:

Table 10-8 A data.frame: 6 x 3

  Petal.Length_prediction Petal.Length Species
                    <dbl>        <dbl>   <chr>
1                1.484210          1.4  setosa
2                1.661389          1.4  setosa
3                1.386358          1.3  setosa
4                1.378046          1.5  setosa
5                1.346695          1.4  setosa
6                1.733905          1.7  setosa

'ore.frame'
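
The scoring logic in this example can be sanity-checked in a plain R session without a database connection. The sketch below is a local stand-in under stated assumptions: score_local is a hypothetical name, the model is passed in directly instead of being loaded from a datastore, and the chunking is emulated with split rather than performed by ore.rowApply. It checks that scoring in chunks of 10 rows gives the same predictions as scoring all rows at once.

```r
# Fit the same model as in the example, locally.
mod <- lm(Petal.Length ~ Petal.Width + Sepal.Width + Sepal.Length, data = iris)

# Local stand-in for the scoring function (hypothetical name; no datastore).
score_local <- function(dat, model) {
  dat$Petal.Length_prediction <- as.numeric(predict(model, newdata = dat))
  dat[, c("Petal.Length_prediction", "Petal.Length", "Species")]
}

# Score in chunks of 10 rows, then stack the per-chunk results in order.
chunk_id <- ceiling(seq_len(nrow(iris)) / 10)
chunked <- do.call(rbind, lapply(split(iris, chunk_id), score_local, model = mod))
rownames(chunked) <- NULL

# Score all rows in one call and compare the predictions.
full <- score_local(iris, mod)
stopifnot(isTRUE(all.equal(chunked$Petal.Length_prediction,
                           full$Petal.Length_prediction)))

print(head(chunked))
```

Because the model is fixed before scoring and each row is predicted independently, the chunk size affects only how the work is divided, not the predicted values.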