4.3 Shared Settings
These settings are common to all of the OML4R machine learning classes.
The following table lists the settings that are shared by all Oracle Machine Learning for R models.
Table 4-1 Shared Model Settings
Setting Name | Setting Value | Description |
---|---|---|
|
|
Helps to control model size in the database. Model
details can consume significant disk space, especially for
partitioned models. The default value is
The reduction in space depends on the algorithm. Model size reduction can be on the order of 10x. |
|
1 < value <=
1000000 |
Controls the maximum number of partitions allowed for a
partitioned model. The default is |
|
|
Indicates how to treat missing values in the training
data. This setting does not affect the scoring data. The default
value is
When The value |
|
|
Controls the parallel building of partitioned models.
|
|
Comma-separated list of machine learning attributes |
Requests the building of a partitioned model. The setting
value is a comma-separated list of the machine learning attributes to be used to determine the
in-list partition key values. These attributes are taken from the
input columns, unless an |
|
tablespace_name |
Specifies the tablespace in which to store the model. If you explicitly set this to the name of a tablespace (for which you have sufficient quota), then the resulting model content is created in the specified tablespace. If you do not provide this setting, then the resulting model content is created in your default tablespace. |
|
0 < value |
Determines how many rows to sample (approximately). You
can use this setting only if |
|
|
Allows the user to request sampling of the build data.
The default is |
|
|
The maximum number of distinct features, across all text
attributes, to use from a document set passed to the model. The
default is |
|
Non-negative value |
This text processing setting controls how many documents a token needs to appear in to be used as a feature. The default is |
|
The name of an Oracle Text POLICY created using
|
Affects how individual tokens are extracted from unstructured text. For details about
|
PREP_AUTO |
|
This data preparation setting enables fully automated data preparation. The default is |
PREP_SCALE_2DNUM |
|
This data preparation setting enables scaling data
preparation for two-dimensional numeric columns.
|
PREP_SCALE_NNUM |
|
This data preparation setting enables scaling data
preparation for nested numeric columns. |
PREP_SHIFT_2DNUM |
|
This data preparation setting enables centering data
preparation for two-dimensional numeric columns.
|
ODMS_BOXCOX |
ODMS_BOXCOX_ENABLE ODMS_BOXCOX_DISABLE |
This setting enables the Box-Cox variance-stabilization transformation. It is useful when the variance increases as the target value increases. It reduces variance and transforms a multiplicative relationship with the target into a simpler additive relationship. This setting is applicable only to the Exponential Smoothing algorithm. When a value for the EXSM_MODEL setting is not specified, the default value is ODMS_BOXCOX_ENABLE; when a value for the EXSM_MODEL setting is provided, the default value is ODMS_BOXCOX_DISABLE.
|
ODMS_EXPLOSION_MIN_SUPP |
X >= 0 |
Specifies the minimum support required for a categorical value to be included in the explosion mapping. It removes categorical values that have too few row instances to have a statistically significant effect on the model, because they could degrade performance or exhaust memory. The default is system-determined and depends on the number of rows in the dataset. A value of 1 results in mapping all categorical values.
|
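To illustrate the ODMS_EXPLOSION_MIN_SUPP description in the table above, the following Python sketch filters categorical values by how many rows they appear in. It is not OML4R code and does not reflect how the database implements the mapping. The function name `values_to_map` is hypothetical, and the sketch assumes that support is measured as a row count, which is consistent with the statement that a value of 1 maps all categorical values.

```python
from collections import Counter


def values_to_map(column, min_support):
    """Return the categorical values whose row count is at least min_support.

    Assumption: support is an absolute row count, not a fraction of rows.
    """
    counts = Counter(column)
    return sorted(value for value, n in counts.items() if n >= min_support)


# A small categorical column: "green" appears in only one row.
colors = ["red", "red", "red", "blue", "blue", "green"]

# With a minimum support of 1, every value that occurs is kept.
kept_all = values_to_map(colors, 1)

# With a minimum support of 2, the rare value "green" is dropped.
kept_common = values_to_map(colors, 2)
```

Dropping rare values before the explosion step keeps the number of generated columns bounded, which is the performance and memory concern the setting description mentions.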
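As an illustration of the ODMS_BOXCOX description in the table above, the following Python sketch shows the standard Box-Cox transform together with the default-selection rule stated for the setting. It is not OML4R code. The function names `box_cox` and `default_boxcox_setting` are hypothetical, and because this section does not describe how the database chooses the Box-Cox parameter, the sketch simply takes it as an input.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def default_boxcox_setting(settings):
    """Resolve ODMS_BOXCOX using the default rule stated in the table.

    An explicit ODMS_BOXCOX value wins. Otherwise the default is
    ODMS_BOXCOX_DISABLE when EXSM_MODEL is provided, else ODMS_BOXCOX_ENABLE.
    """
    if "ODMS_BOXCOX" in settings:
        return settings["ODMS_BOXCOX"]
    if "EXSM_MODEL" in settings:
        return "ODMS_BOXCOX_DISABLE"
    return "ODMS_BOXCOX_ENABLE"


# With lam = 0 the transform is a logarithm, so a multiplicative
# relationship (level * factor) becomes an additive one.
level, factor = 200.0, 1.5
transformed_product = box_cox(level * factor, 0)
sum_of_transforms = box_cox(level, 0) + box_cox(factor, 0)
```

The two values computed at the end agree up to floating-point rounding, which is the sense in which the transformation replaces a multiplicative relationship with an additive one.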