6.19 Singular Value Decomposition
The ore.odmSVD function creates a model that uses the in-database Singular Value Decomposition (SVD) algorithm.
Singular Value Decomposition (SVD) is a feature extraction algorithm. SVD performs an orthogonal linear transformation that captures the underlying variance of the data by decomposing a rectangular matrix into three matrices: 'U', 'D', and 'V'. Matrix 'D' is a diagonal matrix whose diagonal entries, the singular values, reflect the amount of data variance captured by the bases.
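As a rough illustration of the decomposition described above, the following sketch uses the base R svd function on the numeric columns of the local iris data set. It does not call the in-database algorithm, the variable names are illustrative only, and the values it produces should not be assumed to match those of an ore.odmSVD model, which depend on the model settings.

```r
# Base R sketch (not OML4R): decompose a rectangular matrix X into U, D, V.
X <- as.matrix(iris[, 1:4])      # 150 x 4 numeric matrix

s <- svd(X)                      # s$u (150 x 4), s$d (length 4), s$v (4 x 4)
U <- s$u
D <- diag(s$d)                   # diagonal matrix of singular values
V <- s$v

# The three factors reproduce X: X = U D V^T
X.rebuilt <- U %*% D %*% t(V)
max.error <- max(abs(X - X.rebuilt))

# The columns of U and V are orthonormal
u.check <- crossprod(U)          # approximately the 4 x 4 identity
v.check <- crossprod(V)          # approximately the 4 x 4 identity

# Share of the total sum of squares captured by each basis
share <- s$d^2 / sum(s$d^2)
```

The singular values in s$d are returned in decreasing order, so the first basis captures the largest share and later bases capture progressively less.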
Settings for a Singular Value Decomposition Model
The following table lists settings that apply to Singular Value Decomposition models.
Table 6-21 Singular Value Decomposition Model Settings
Setting Name | Setting Value | Description |
---|---|---|
 | 2500 | The maximum number of features supported by SVD. |
Example 6-22 Using the ore.odmSVD Function
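# Push iris to the database with an Id column as a row identifier.
IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))

# Build an SVD model on all columns except Id.
svd.mod <- ore.odmSVD(~. -Id, IRIS)
summary(svd.mod)
d(svd.mod)   # D matrix (singular values)
v(svd.mod)   # V matrix
head(predict(svd.mod, IRIS, supplemental.cols = "Id"))

# Build a partitioned model, one SVD per value of Species.
svd.pmod <- ore.odmSVD(~. -Id, IRIS,
                       odm.settings = list(odms_partition_columns = "Species"))
summary(svd.pmod)
d(svd.pmod)
v(svd.pmod)
head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))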
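The predict calls in this example return one projection column per feature alongside the Id column. The following base R sketch shows one way such projections can be computed from a decomposition. It assumes that each row is projected onto the 'V' matrix and rescaled by the singular values. Whether the in-database scoring applies exactly this scaling depends on the model's scoring mode, so treat the correspondence as an assumption rather than a description of ore.odmSVD internals.

```r
# Base R sketch (not OML4R): project rows onto the bases found by svd().
X <- as.matrix(iris[, 1:4])
s <- svd(X)

# Project every row onto the right singular vectors, then rescale each
# column by its singular value. For the data used in the decomposition
# this recovers the U matrix, since X = U D V^T implies X V D^-1 = U.
scores <- X %*% s$v %*% diag(1 / s$d)
colnames(scores) <- seq_len(ncol(scores))

# Keep an identifier next to the projections, as supplemental.cols does
projected <- data.frame(Id = seq_len(nrow(X)), scores, check.names = FALSE)
head(projected)
```

Dropping the diag(1 / s$d) factor would give the unscaled projections X V instead, which equal the product of 'U' and 'D'.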
Listing for This Example
R> IRIS <- ore.push(cbind(Id = seq_along(iris[[1L]]), iris))
R>
R> svd.mod <- ore.odmSVD(~. -Id, IRIS)
R> summary(svd.mod)
Call:
ore.odmSVD(formula = ~. - Id, data = IRIS)
Settings:
value
odms.missing.value.treatment odms.missing.value.auto
odms.sampling odms.sampling.disable
prep.auto ON
scoring.mode scoring.svd
u.matrix.output u.matrix.disable
d:
FEATURE_ID VALUE
1 1 96.2182677
2 2 19.0780817
3 3 7.2270380
4 4 3.1502152
5 5 1.8849634
6 6 1.1474731
7 7 0.5814097
v:
ATTRIBUTE_NAME ATTRIBUTE_VALUE '1' '2' '3' '4' '5' '6' '7'
1 Petal.Length <NA> 0.51162932 0.65943465 -0.004420703 0.05479795 -0.51969015 0.17392232 -0.005674672
2 Petal.Width <NA> 0.16745698 0.32071102 0.146484369 0.46553390 0.72685033 0.31962337 -0.021274748
3 Sepal.Length <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847 0.31969417 -0.09379235 -0.067308615
4 Sepal.Width <NA> 0.37906736 -0.50824062 0.142810811 0.69139828 -0.25849391 -0.17606099 -0.041908520
5 Species setosa 0.03170407 -0.32247642 0.184499940 -0.12245506 -0.14348647 0.76017824 0.497502783
6 Species versicolor 0.04288799 0.04054823 -0.780684855 0.19827972 0.07363250 -0.12354271 0.571881302
7 Species virginica 0.05018593 0.16796988 0.551546107 -0.07177990 0.08109974 -0.48442099 0.647048040
Warning message:
In u.ore.odmSVD(object) : U matrix is not calculated.
R> d(svd.mod)
FEATURE_ID VALUE
1 1 96.2182677
2 2 19.0780817
3 3 7.2270380
4 4 3.1502152
5 5 1.8849634
6 6 1.1474731
7 7 0.5814097
Warning message:
ORE object has no unique key - using random order
R> v(svd.mod)
ATTRIBUTE_NAME ATTRIBUTE_VALUE '1' '2' '3' '4' '5' '6' '7'
1 Petal.Length <NA> 0.51162932 0.65943465 -0.004420703 0.05479795 -0.51969015 0.17392232 -0.005674672
2 Petal.Width <NA> 0.16745698 0.32071102 0.146484369 0.46553390 0.72685033 0.31962337 -0.021274748
3 Sepal.Length <NA> 0.74909171 -0.26482593 -0.102057243 -0.49272847 0.31969417 -0.09379235 -0.067308615
4 Sepal.Width <NA> 0.37906736 -0.50824062 0.142810811 0.69139828 -0.25849391 -0.17606099 -0.041908520
5 Species setosa 0.03170407 -0.32247642 0.184499940 -0.12245506 -0.14348647 0.76017824 0.497502783
6 Species versicolor 0.04288799 0.04054823 -0.780684855 0.19827972 0.07363250 -0.12354271 0.571881302
7 Species virginica 0.05018593 0.16796988 0.551546107 -0.07177990 0.08109974 -0.48442099 0.647048040
Warning message:
ORE object has no unique key - using random order
R> head(predict(svd.mod, IRIS, supplemental.cols = "Id"))
Id '1' '2' '3' '4' '5' '6' '7' FEATURE_ID
1 1 0.06161595 -0.1291839 0.02586865 -0.01449182 1.536727e-05 -0.023495349 -0.007998605 2
2 2 0.05808905 -0.1130876 0.01881265 -0.09294788 3.466226e-02 0.069569113 0.051195429 2
3 3 0.05678818 -0.1190959 0.02565027 -0.01950986 8.851560e-04 0.040073030 0.060908867 2
4 4 0.05667915 -0.1081308 0.02496402 -0.02233741 -5.750222e-02 0.093904181 0.077741713 2
5 5 0.06123138 -0.1304597 0.02925687 0.02309694 -3.065834e-02 -0.030664898 -0.003629897 2
6 6 0.06747071 -0.1302726 0.03340671 0.06114966 -9.547838e-03 -0.008210224 -0.081807741 2
R>
R> svd.pmod <- ore.odmSVD(~. -Id, IRIS,
+ odm.settings = list(odms_partition_columns = "Species"))
R> summary(svd.pmod)
$setosa
Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))
Settings:
value
odms.max.partitions 1000
odms.missing.value.treatment odms.missing.value.auto
odms.partition.columns "Species"
odms.sampling odms.sampling.disable
prep.auto ON
scoring.mode scoring.svd
u.matrix.output u.matrix.disable
d:
FEATURE_ID VALUE
1 1 44.2872290
2 2 1.5719162
3 3 1.1458732
4 4 0.6836692
v:
ATTRIBUTE_NAME ATTRIBUTE_VALUE '1' '2' '3' '4'
1 Petal.Length <NA> 0.2334487 0.46456598 0.8317440 -0.19463332
2 Petal.Width <NA> 0.0395488 0.04182015 0.1946750 0.97917752
3 Sepal.Length <NA> 0.8010073 0.40303704 -0.4410167 0.03811461
4 Sepal.Width <NA> 0.5498408 -0.78739486 0.2753323 -0.04331888
$versicolor
Call:
ore.odmSVD(formula = ~. - Id, data = IRIS, odm.settings = list(odms_partition_columns = "Species"))
Settings:
value
odms.max.partitions 1000
odms.missing.value.treatment odms.missing.value.auto
R> # xyz
R> d(svd.pmod)
PARTITION_NAME FEATURE_ID VALUE
1 setosa 1 44.2872290
2 setosa 2 1.5719162
3 setosa 3 1.1458732
4 setosa 4 0.6836692
5 versicolor 1 56.2523412
6 versicolor 2 1.9106625
7 versicolor 3 1.7015929
8 versicolor 4 0.6986103
9 virginica 1 66.2734064
10 virginica 2 2.4318639
11 virginica 3 1.6007740
12 virginica 4 1.2958261
Warning message:
ORE object has no unique key - using random order
R> v(svd.pmod)
PARTITION_NAME ATTRIBUTE_NAME ATTRIBUTE_VALUE '1' '2' '3' '4'
1 setosa Petal.Length <NA> 0.2334487 0.46456598 0.83174398 -0.19463332
2 setosa Petal.Width <NA> 0.0395488 0.04182015 0.19467497 0.97917752
3 setosa Sepal.Length <NA> 0.8010073 0.40303704 -0.44101672 0.03811461
4 setosa Sepal.Width <NA> 0.5498408 -0.78739486 0.27533228 -0.04331888
5 versicolor Petal.Length <NA> 0.5380908 0.49576111 -0.60174021 -0.32029352
6 versicolor Petal.Width <NA> 0.1676394 0.36693207 -0.03448373 0.91436795
7 versicolor Sepal.Length <NA> 0.7486029 -0.64738491 0.06943054 0.12516311
8 versicolor Sepal.Width <NA> 0.3492119 0.44774385 0.79492074 -0.21372297
9 virginica Petal.Length <NA> 0.5948985 -0.26368708 0.65157671 -0.38988802
10 virginica Petal.Width <NA> 0.2164036 0.59106806 0.42921836 0.64774968
11 virginica Sepal.Length <NA> 0.7058813 -0.27846153 -0.53436210 0.37235450
12 virginica Sepal.Width <NA> 0.3177999 0.70962445 -0.32507927 -0.53829342
Warning message:
ORE object has no unique key - using random order
R> head(predict(svd.pmod, IRIS, supplemental.cols = "Id"))
Id '1' '2' '3' '4' FEATURE_ID
1 1 0.1432539 -0.026487881 -0.071688339 -0.04956008 1
2 2 0.1334289 0.172689424 -0.114854368 -0.02902893 2
3 3 0.1317675 -0.008327214 -0.062409295 -0.02438248 1
4 4 0.1297716 0.075232572 0.097222019 -0.08055912 1
5 5 0.1426868 -0.102219140 -0.009172782 -0.06147133 1
6 6 0.1554060 -0.055950655 0.160698708 0.14286095 3
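The partitioned model in the listing above holds a separate decomposition for each value of Species. The following base R sketch mimics that idea locally by splitting iris on Species and running svd on each piece. It is an illustration under stated assumptions, not the in-database implementation: the names parts, models, and d.table are hypothetical, and the values may differ from the in-database results, which depend on the model settings.

```r
# Base R sketch (not OML4R): one decomposition per value of a partition column.
parts <- split(iris[, 1:4], iris$Species)   # list of three data frames

# Run svd() on each partition's numeric matrix
models <- lapply(parts, function(df) svd(as.matrix(df)))

# Stack the singular values into one table, similar in layout to d(svd.pmod)
d.table <- do.call(rbind, lapply(names(models), function(p) {
  data.frame(PARTITION_NAME = p,
             FEATURE_ID = seq_along(models[[p]]$d),
             VALUE = models[[p]]$d)
}))
print(d.table)
```

Because Species is used to split the data, it does not appear as an attribute inside any partition, which is consistent with the four-feature 'D' and 'V' matrices shown for svd.pmod.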