11 Colt Aggregate Functions
A reference to Colt aggregate functions provided in Oracle Continuous Query Language (Oracle CQL) is provided. Colt aggregate functions are based on the Colt open source libraries for high performance scientific and technical computing.
For more information, see Functions.
11.1 Introduction to Oracle CQL Built-In Aggregate Colt Functions
Table 11-1 lists the built-in aggregate Colt functions that Oracle CQL provides.
Table 11-1 Oracle CQL Built-in Aggregate Colt-Based Functions
Colt Package | Function |
---|---|
A set of basic descriptive statistics functions. |
|
Note:
Built-in function names are case sensitive and you must use them in the case shown (in lower case).
Note:
In stream input examples, lines beginning with h
(such as h 3800
) are heartbeat input tuples. These inform GGSA that no further input will have a timestamp lesser than the heartbeat value.
In relation output examples, the first tuple output is:
-9223372036854775808:+
This value is -Long.MIN_VALUE()
and represents the largest negative timestamp possible.
For more information, see:
11.1.1 Oracle CQL Colt Aggregate Function Signatures and Tuple Arguments
Note that the signatures of the Oracle CQL Colt aggregate functions do not match the signatures of the corresponding Colt aggregate functions.
Consider the following Colt aggregate function:
double autoCorrelation(DoubleArrayList data, int lag, double mean, double variance)
In this signature, data
is the Collection
over which aggregates will be calculated and mean
and variance
are the other two parameter aggregates which are required to calculate autoCorrelation
(where mean
and variance
aggregates are calculated on data
).
In GGSA, data
will never come in the form of a Collection
. The Oracle CQL function receives input data in a stream of tuples.
So suppose our stream is defined as S:(double val, integer lag)
. On each input tuple, the Oracle CQL autoCorrelation
function will compute two intermediate aggregates, mean
and variance
, and one final aggregate, autoCorrelation
.
Since the function expects a stream of tuples having a double
data
value and an integer
lag
value only, the signature of the Oracle CQL autoCorrelation
function is:
double autoCorrelation (double data, int lag)
11.1.2 Colt Aggregate Functions and the Where, Group By, and Having Clauses
In Oracle CQL, the where
clause is applied before the group by
and having
clauses. This means the Oracle CQL statement is invalid:
<query id="q1"><![CDATA[ select * from InputChannel[rows 4 slide 4] as ic where geometricMean(c3) > 4 ]]></query>
Instead, you must use the Oracle CQL statement shown in the following example:
<query id="q1"><![CDATA[ select * from InputChannel[rows 4 slide 4] as ic, myGeoMean = geometricMean(c3) where myGeoMean > 4 ]]></query>
For more information, see:
11.2 autoCorrelation
Syntax

Purpose
autoCorrelation
is based on cern.jet.stat.Descriptive.autoCorrelation(DoubleArrayList data, int lag, double mean, double variance)
. It returns the auto-correlation of a data sequence of the input arguments as a double
.
Note:
This function has semantics different from lag1.
This function takes the following tuple arguments:
-
double1
: data value. -
int1
: lag.
For more information, see
Examples
Consider the query qColtAggr1
. Given the data stream SColtAggrFunc
with schema (c3 double)
, the query returns the relation.
<query id="qColtAggr1"><![CDATA[ select autoCorrelation(c3, 0) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 5.441341838866902 1000 6.1593756700951054 1200 3.7269733222923676 1400 4.625160266213489 1600 3.490061774090248 1800 3.6354484064421917 2000 5.635401664977703 2200 5.006087562207967 2400 3.632574304861612 2600 7.618087248962962 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + 1.0 1200: - 1.0 1200: + 1.0 1400: - 1.0 1400: + 1.0 1600: - 1.0 1600: + 1.000000000000002 1800: - 1.000000000000002 1800: + 1.0 2000: - 1.0 2000: + 0.9999999999999989 2200: - 0.9999999999999989 2200: + 0.999999999999999 2400: - 0.999999999999999 2400: + 0.9999999999999991 2600: - 0.9999999999999991 2600: + 1.0000000000000013
11.3 correlation
Syntax

Purpose
correlation
is based on cern.jet.stat.Descriptive.correlation(DoubleArrayList data1, double standardDev1, DoubleArrayList data2, double standardDev2)
. It returns the correlation of two data sequences of the input arguments as a double
.
This function takes the following tuple arguments:
-
double1
: data value 1. -
double2
: data value 2.
For more information, see
Examples
Consider the query qColtAggr2
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr2"><![CDATA[ select correlation(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + 2.0 1200: - 2.0 1200: + 1.5 2000: - 1.5 2000: + 1.333333333333333
11.4 covariance
Syntax

Purpose
covariance
is based on cern.jet.stat.Descriptive.covariance(DoubleArrayList data1, DoubleArrayList data2)
. It returns the correlation of two data sequences (see Figure 11-1) of the input arguments as a double
.
Figure 11-1 cern.jet.stat.Descriptive.covariance

This function takes the following tuple arguments:
-
double1
: data value 1. -
double2
: data value 2.
For more information, see:
Examples
Consider the query qColtAggr3
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr3"><![CDATA[ select covariance(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + 50.0 1200: - 50.0 1200: + 100.0 2000: - 100.0 2000: + 166.66666666666666
11.5 geometricMean
Syntax

Purpose
geometricMean
is based on cern.jet.stat.Descriptive.geometricMean(DoubleArrayList data)
. It returns the geometric mean of a data sequence (see Figure 11-2) of the input argument as a double
.
Figure 11-2 cern.jet.stat.Descriptive.geometricMean(DoubleArrayList data)

This function takes the following tuple arguments:
-
double1
: data value.
Note that for a geometric mean to be meaningful, the minimum of the data values must not be less than or equal to zero.
For more information, see:
Examples
Consider the query qColtAggr6
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr6"><![CDATA[ select geometricMean(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 34.64101615137755 1200: - 34.64101615137755 1200: + 28.844991406148168 2000: - 28.844991406148168 2000: + 22.133638394006436
11.6 geometricMean1
Syntax

Purpose
geometricMean1
is based on cern.jet.stat.Descriptive.geometricMean(double sumOfLogarithms)
. It returns the geometric mean of a data sequence (see Figure 11-3) of the input arguments as a double
.
Figure 11-3 cern.jet.stat.Descriptive.geometricMean1(int size, double sumOfLogarithms)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr7
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr7"><![CDATA[ select geometricMean1(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + Infinity 1000: - Infinity 1000: + Infinity 1200: - Infinity 1200: + Infinity 2000: - Infinity 2000: + Infinity
11.7 harmonicMean
Syntax

Purpose
harmonicMean
is based on cern.jet.stat.Descriptive.harmonicMean(int size, double sumOfInversions)
. It returns the harmonic mean of a data sequence as a double
.
This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr8
. Given the data stream SColtAggrFunc
with schema (c3 double)
, the query returns the relation.
<query id="qColtAggr8"><![CDATA[ select harmonicMean(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 5.441341838866902 1000 6.1593756700951054 1200 3.7269733222923676 1400 4.625160266213489 1600 3.490061774090248 1800 3.6354484064421917 2000 5.635401664977703 2200 5.006087562207967 2400 3.632574304861612 2600 7.618087248962962 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 5.441341876983643 1000: - 5.441341876983643 1000: + 5.778137193205395 1200: - 5.778137193205395 1200: + 4.882442561720335 1400: - 4.882442561720335 1400: + 4.815475325819701 1600: - 4.815475325819701 1600: + 4.475541862878903 1800: - 4.475541862878903 1800: + 4.309563447664887 2000: - 4.309563447664887 2000: + 4.45944509362759 2200: - 4.45944509362759 2200: + 4.5211563834502515 2400: - 4.5211563834502515 2400: + 4.401525382790638 2600: - 4.401525382790638 2600: + 4.595562422157167
11.8 kurtosis
Syntax

Purpose
kurtosis
is based on cern.jet.stat.Descriptive.kurtosis(DoubleArrayList data, double mean, double standardDeviation)
. It returns the kurtosis or excess (see Figure 11-4) of a data sequence as a double
.
Figure 11-4 cern.jet.stat.Descriptive.kurtosis(DoubleArrayList data, double mean, double standardDeviation)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr12
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr12"><![CDATA[ select kurtosis(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + -2.0 1200: - -2.0 1200: + -1.5000000000000002 2000: - -1.5000000000000002 2000: + -1.3600000000000003
11.9 lag1
Syntax

Purpose
lag1
is based on cern.jet.stat.Descriptive.lag1(DoubleArrayList data, double mean)
. It returns the lag - 1
auto-correlation of a dataset as a double
.
Note:
This function has semantics different from autoCorrelation.
This function takes the following tuple arguments:
-
double1
: data value.
For more information, see
Examples
Consider the query qColtAggr14
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr14"><![CDATA[ select lag1(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + -0.5 1200: - -0.5 1200: + 0.0 2000: - 0.0 2000: + 0.25
11.10 mean
Syntax

Purpose
mean
is based on cern.jet.stat.Descriptive.mean(DoubleArrayList data)
. It returns the arithmetic mean of a data sequence (see Figure 11-5) as a double
.
Figure 11-5 cern.jet.stat.Descriptive.mean(DoubleArrayList data)

The following table lists the input types and the corresponding output types:
Input Types |
Output Types |
INT | DOUBLE |
BIGINT | DOUBLE |
FLOAT | DOUBLE |
DOUBLE | DOUBLE |
For more information, see:
Examples
Consider the query qColtAggr16
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr16"><![CDATA[ select mean(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.0 1200: - 35.0 1200: + 30.0 2000: - 30.0 2000: + 25.0
11.11 meanDeviation
Syntax

Purpose
meanDeviation
is based on cern.jet.stat.Descriptive.meanDeviation(DoubleArrayList data, double mean)
. It returns the mean deviation of a dataset (see Figure 11-6) as a double
.
Figure 11-6 cern.jet.stat.Descriptive.meanDeviation(DoubleArrayList data, double mean)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see
Examples
Consider the query qColtAggr17
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr17"><![CDATA[ select meanDeviation(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 5.0 1200: - 5.0 1200: + 6.666666666666667 2000: - 6.666666666666667 2000: + 10.0
11.12 median
Syntax

Purpose
median
is based on cern.jet.stat.Descriptive.median(DoubleArrayList sortedData)
. It returns the median of a sorted data sequence as a double
.
The following table lists the input types and the corresponding output types:
Table 11-2 Input and Output Types
Input Types | Output Types |
---|---|
INT |
DOUBLE |
BIGINT |
DOUBLE |
FLOAT |
DOUBLE |
DOUBLE |
DOUBLE |
Note:
If the input type is INT
, then return type will also be INT
and it will be floor of the divided value.
For more information, see:
Examples
Consider the query qColtAggr18
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr18"><![CDATA[ select median(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.0 1200: - 35.0 1200: + 30.0 2000: - 30.0 2000: + 25.0
11.13 moment
Syntax

Purpose
moment
is based on cern.jet.stat.Descriptive.moment(DoubleArrayList data, int k, double c)
. It returns the moment of the k
-th order with constant c
of a data sequence (see Figure 11-7) as a double
.
Figure 11-7 cern.jet.stat.Descriptive.moment(DoubleArrayList data, int k, double c)

This function takes the following tuple arguments:
-
double1
: data value. -
int1
:k
. -
double2
:c
.
For more information, see:
Examples
Consider the query qColtAggr21
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr21"><![CDATA[ select moment(c3, c1, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 5000.0 1200: - 5000.0 1200: + 3000.0 2000: - 3000.0 2000: + 1.7045E11
11.14 pooledMean
Syntax

Purpose
pooledMean
is based on cern.jet.stat.Descriptive.pooledMean(int size1, double mean1, int size2, double mean2)
. It returns the pooled mean of two data sequences (see Figure 11-8) as a double
.
Figure 11-8 cern.jet.stat.Descriptive.pooledMean(int size1, double mean1, int size2, double mean2)

This function takes the following tuple arguments:
-
double1
: mean 1. -
double2
: mean 2.
For more information, see
Examples
Consider the query qColtAggr22
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr22"><![CDATA[ select pooledMean(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.0 1200: - 35.0 1200: + 30.0 2000: - 30.0 2000: + 25.0
11.15 pooledVariance
Syntax

Purpose
pooledVariance
is based on cern.jet.stat.Descriptive.pooledVariance(int size1, double variance1, int size2, double variance2)
. It returns the pooled variance of two data sequences (see Figure 11-9) as a double
.
Figure 11-9 cern.jet.stat.Descriptive.pooledVariance(int size1, double variance1, int size2, double variance2)

This function takes the following tuple arguments:
-
double1
: variance 1. -
double2
: variance 2.
For more information, see
Examples
Consider the query qColtAggr23
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr23"><![CDATA[ select pooledVariance(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 25.0 1200: - 25.0 1200: + 66.66666666666667 2000: - 66.66666666666667 2000: + 125.0
11.16 product
Syntax

Purpose
product
is based on cern.jet.stat.Descriptive.product(DoubleArrayList data)
. It returns the product of a data sequence (see Figure 11-10) as a double
.
Figure 11-10 cern.jet.stat.Descriptive.product(DoubleArrayList data)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr24
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr24"><![CDATA[ select product(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 1200.0 1200: - 1200.0 1200: + 24000.0 2000: - 24000.0 2000: + 240000.0
11.17 quantile
Syntax

Purpose
quantile
is based on cern.jet.stat.Descriptive.quantile(DoubleArrayList sortedData, double phi)
. It returns the phi-quantile as a double
; that is, an element elem
for which holds that phi percent of data elements are less than elem
.
This function takes the following tuple arguments:
-
double1
: data value. -
double2
: phi; the percentage; must satisfy0 <= phi <= 1
.
For more information, see:
Examples
Consider the query qColtAggr26
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr26"><![CDATA[ select quantile(c3, c2) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 36.99999988079071 1200: - 36.99999988079071 1200: + 37.799999713897705 2000: - 37.799999713897705 2000: + 22.000000178813934
11.18 quantileInverse
Syntax

Purpose
quantileInverse
is based on cern.jet.stat.Descriptive.quantileInverse(DoubleArrayList sortedList, double element)
. It returns the percentage phi of elements <= element
(0.0 <= phi <= 1.0
) as a double
. This function does linear interpolation if the element
is not contained but lies in between two contained elements.
This function takes the following tuple arguments:
-
double1
: data. -
double2
:element
.
For more information, see:
Examples
Consider the query qColtAggr27
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr27"><![CDATA[ select quantileInverse(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 1.0 1000: - 1.0 1000: + 0.5 1200: - 0.5 1200: + 0.3333333333333333 2000: - 0.3333333333333333 2000: + 0.25
11.19 rankInterpolated
Syntax

Purpose
rankInterpolated
is based on cern.jet.stat.Descriptive.rankInterpolated(DoubleArrayList sortedList, double element)
. It returns the linearly interpolated number of elements in a list less or equal to a given element
as a double.
The rank is the number of elements <= element
. Ranks are of the form{0, 1, 2,..., sortedList.size()}
. If no element is <= element
, then the rank is zero. If the element lies in between two contained elements, then linear interpolation is used and a non-integer value is returned.
This function takes the following tuple arguments:
-
double1
: data value. -
double2
:element
.
For more information, see:
Examples
Consider the query qColtAggr29
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr29"><![CDATA[ select rankInterpolated(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 1.0 1000: - 1.0 1000: + 1.0 1200: - 1.0 1200: + 1.0 2000: - 1.0 2000: + 1.0
11.20 rms
Syntax

Purpose
rms
is based on cern.jet.stat.Descriptive.rms(int size, double sumOfSquares)
. It returns the Root-Mean-Square (RMS) of a data sequence (see Figure 11-11) as a double
.
Figure 11-11 cern.jet.stat.Descriptive.rms(int size, double sumOfSquares)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see
Examples
Consider the query qColtAggr30
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr30"><![CDATA[ select rms(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.35533905932738 1200: - 35.35533905932738 1200: + 31.09126351029605 2000: - 31.09126351029605 2000: + 27.386127875258307
11.21 sampleKurtosis
Syntax

Purpose
sampleKurtosis
is based on cern.jet.stat.Descriptive.sampleKurtosis(DoubleArrayList data, double mean, double sampleVariance)
. It returns the sample kurtosis (excess) of a data sequence as a double
.
This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr31
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr31"><![CDATA[ select sampleKurtosis(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + NaN 1200: - NaN 1200: + NaN 2000: - NaN 2000: + -1.1999999999999993
11.22 sampleKurtosisStandardError
Syntax

Purpose
sampleKurtosisStandardError
is based on cern.jet.stat.Descriptive.sampleKurtosisStandardError(int size)
. It returns the standard error of the sample Kurtosis as a double
.
This function takes the following tuple arguments:
-
int1
: data value.
For more information, see:
Examples
Consider the query qColtAggr33
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr33"><![CDATA[ select sampleKurtosisStandardError(c1) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + Infinity 1200: - Infinity 1200: + Infinity 2000: - Infinity 2000: + 2.6186146828319083
11.23 sampleSkew
Syntax

Purpose
sampleSkew
is based on cern.jet.stat.Descriptive.sampleSkew(DoubleArrayList data, double mean, double sampleVariance)
. It returns the sample skew of a data sequence as a double
.
This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr34
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr34"><![CDATA[ select sampleSkew(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + NaN 1200: - NaN 1200: + 0.0 2000: - 0.0 2000: + 0.0
11.24 sampleSkewStandardError
Syntax

Purpose
sampleSkewStandardError
is based on cern.jet.stat.Descriptive.sampleSkewStandardError(int size)
. It returns the standard error of the sample skew as a double
.
This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr36
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr36"><![CDATA[ select sampleSkewStandardError(c1) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + -0.0 1000: - -0.0 1000: + Infinity 1200: - Infinity 1200: + 1.224744871391589 2000: - 1.224744871391589 2000: + 1.01418510567422
11.25 sampleVariance
Syntax

Purpose
sampleVariance
is based on cern.jet.stat.Descriptive.sampleVariance(DoubleArrayList data, double mean)
. It returns the sample variance of a data sequence (see Figure 11-12) as a double
.
Figure 11-12 cern.jet.stat.Descriptive.sampleVariance(DoubleArrayList data, double mean)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr38
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr38"><![CDATA[ select sampleVariance(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + 50.0 1200: - 50.0 1200: + 100.0 2000: - 100.0 2000: + 166.66666666666666
11.26 skew
Syntax

Purpose
skew
is based on cern.jet.stat.Descriptive.skew(DoubleArrayList data, double mean, double standardDeviation)
. It returns the skew of a data sequence of a data sequence (see Figure 11-13) as a double
.
Figure 11-13 cern.jet.stat.Descriptive.skew(DoubleArrayList data, double mean, double standardDeviation)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr41
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr41"><![CDATA[ select skew(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + NaN 1000: - NaN 1000: + 0.0 1200: - 0.0 1200: + 0.0 2000: - 0.0 2000: + 0.0
11.27 standardDeviation
Syntax

Purpose
standardDeviation
is based on cern.jet.stat.Descriptive.standardDeviation(double variance)
. It returns the standard deviation from a variance as a double
.
The following table lists the input types and the corresponding output types:
Input Types |
Output Types |
INT | DOUBLE |
BIGINT | DOUBLE |
FLOAT | DOUBLE |
DOUBLE | DOUBLE |
For more information, see
Examples
Consider the query qColtAggr44
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr44"><![CDATA[ select standardDeviation(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 5.0 1200: - 5.0 1200: + 8.16496580927726 2000: - 8.16496580927726 2000: + 11.180339887498949
11.28 standardError
Syntax

Purpose
standardError
is based on cern.jet.stat.Descriptive.standardError(int size, double variance)
. It returns the standard error of a data sequence (see Figure 11-14) as a double
.
Figure 11-14 cern.jet.stat.Descriptive.cern.jet.stat.Descriptive.standardError(int size, double variance)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see
Examples
Consider the query qColtAggr45
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr45"><![CDATA[ select standardError(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 3.5355339059327378 1200: - 3.5355339059327378 1200: + 4.714045207910317 2000: - 4.714045207910317 2000: + 5.5901699437494745
11.29 sumOfInversions
Syntax

Purpose
sumOfInversions
is based on cern.jet.stat.Descriptive.sumOfInversions(DoubleArrayList data, int from, int to)
. It returns the sum of inversions of a data sequence (see Figure 11-15) as a double
.
Figure 11-15 cern.jet.stat.Descriptive.sumOfInversions(DoubleArrayList data, int from, int to)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr48
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr48"><![CDATA[ select sumOfInversions(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.025 1000: - 0.025 1000: + 0.058333333333333334 1200: - 0.058333333333333334 1200: + 0.10833333333333334 2000: - 0.10833333333333334 2000: + 0.20833333333333334
11.30 sumOfLogarithms
Syntax

Purpose
sumOfLogarithms
is based on cern.jet.stat.Descriptive.sumOfLogarithms(DoubleArrayList data, int from, int to)
. It returns the sum of logarithms of a data sequence (see Figure 11-16) as a double
.
Figure 11-16 cern.jet.stat.Descriptive.sumOfLogarithms(DoubleArrayList data, int from, int to)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr49
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr49"><![CDATA[ select sumOfLogarithms(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 3.6888794541139363 1000: - 3.6888794541139363 1000: + 7.090076835776092 1200: - 7.090076835776092 1200: + 10.085809109330082 2000: - 10.085809109330082 2000: + 12.388394202324129
11.31 sumOfPowerDeviations
Syntax

Purpose
sumOfPowerDeviations
is based on cern.jet.stat.Descriptive.sumOfPowerDeviations(DoubleArrayList data, int k, double c)
. It returns sum of power deviations of a data sequence (see Figure 11-17) as a double
.
Figure 11-17 cern.jet.stat.Descriptive.sumOfPowerDeviations(DoubleArrayList data, int k, double c)

This function is optimized for common parameters like c == 0.0
, k == -2 .. 4
, or both.
This function takes the following tuple arguments:
-
double1
: data value. -
int1
:k
. -
double2
:c
.
For more information, see:
Examples
Consider the query qColtAggr50
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr50"><![CDATA[ select sumOfPowerDeviations(c3, c1, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 10000.0 1200: - 10000.0 1200: + 9000.0 2000: - 9000.0 2000: + 6.818E11
11.32 sumOfPowers
Syntax

Purpose
sumOfPowers
is based on cern.jet.stat.Descriptive.sumOfPowers(DoubleArrayList data, int k)
. It returns the sum of powers of a data sequence (see Figure 11-18) as a double
.
Figure 11-18 cern.jet.stat.Descriptive.sumOfPowers(DoubleArrayList data, int k)

This function takes the following tuple arguments:
-
double1
: data value. -
int1
:k
.
For more information, see:
Examples
Consider the query qColtAggr52
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr52"><![CDATA[ select sumOfPowers(c3, c1) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 3370000.0 1200: - 3370000.0 1200: + 99000.0 2000: - 99000.0 2000: + 7.2354E12
11.33 sumOfSquaredDeviations
Syntax

Purpose
sumOfSquaredDeviations
is based on cern.jet.stat.Descriptive.sumOfSquaredDeviations(int size, double variance)
. It returns the sum of squared mean deviation of a data sequence (see Figure 11-19) as a double
.
Figure 11-19 cern.jet.stat.Descriptive.sumOfSquaredDeviations(int size, double variance)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see
Examples
Consider the query qColtAggr53
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr53"><![CDATA[ select sumOfSquaredDeviations(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 25.0 1200: - 25.0 1200: + 133.33333333333334 2000: - 133.33333333333334 2000: + 375.0
11.34 sumOfSquares
Syntax

Purpose
sumOfSquares
is based on cern.jet.stat.Descriptive.sumOfSquares(DoubleArrayList data)
. It returns the sum of squares of a data sequence (see Figure 11-20) as a double
.
Figure 11-20 cern.jet.stat.Descriptive.sumOfSquares(DoubleArrayList data)

This function takes the following tuple arguments:
-
double1
: data value.
For more information, see:
Examples
Consider the query qColtAggr54
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr54"><![CDATA[ select sumOfSquares(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 1600.0 1000: - 1600.0 1000: + 2500.0 1200: - 2500.0 1200: + 2900.0 2000: - 2900.0 2000: + 3000.0
11.35 trimmedMean
Syntax

Purpose
trimmedMean
is based on cern.jet.stat.Descriptive.trimmedMean(DoubleArrayList sortedData, double mean, int left, int right)
. It returns the trimmed mean of an ascending sorted data sequence as a double
.
This function takes the following tuple arguments:
-
double1
: data value. -
int1
:left
. -
int2
:right
.
For more information, see:
Examples
Consider the query qColtAggr55
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr55"><![CDATA[ select trimmedMean(c3, c1, c1) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 0, 0.5, 40.0, 8 1000 0, 0.7, 30.0, 6 1200 0, 0.89, 20.0, 12 2000 1, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.0 1200: - 35.0 1200: + 30.0 2000: - 30.0 2000: + 25.0
11.36 variance
Syntax

Purpose
variance
is based on cern.jet.stat.Descriptive.variance(int size, double sum, double sumOfSquares)
. It returns the variance of a data sequence (see Figure 11-21) as a double
.
Figure 11-21 cern.jet.stat.Descriptive.variance(int size, double sum, double sumOfSquares)

The following table lists the input types and the corresponding output types:
Input Types |
Output Types |
INT | DOUBLE |
BIGINT | DOUBLE |
FLOAT | DOUBLE |
DOUBLE | DOUBLE |
For more information, see:
Examples
Consider the query qColtAggr57
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr57"><![CDATA[ select variance(c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 0.0 1000: - 0.0 1000: + 25.0 1200: - 25.0 1200: + 66.66666666666667 2000: - 66.66666666666667 2000: + 125.0
11.37 weightedMean
Syntax

Purpose
weightedMean
is based on cern.jet.stat.Descriptive.weightedMean(DoubleArrayList data, DoubleArrayList weights)
. It returns the weighted mean of a data sequence (see Figure 11-22) as a double
.
Figure 11-22 cern.jet.stat.Descriptive.weightedMean(DoubleArrayList data, DoubleArrayList weights)

This function takes the following tuple arguments:
-
double1
: data value. -
double2
: weight value.
For more information, see:
Examples
Consider the query qColtAggr58
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr58"><![CDATA[ select weightedMean(c3, c3) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 4, 0.7, 30.0, 6 1200 3, 0.89, 20.0, 12 2000 8, 0.4, 10.0, 4 h 8000 h 200000000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.714285714285715 1200: - 35.714285714285715 1200: + 32.22222222222222 2000: - 32.22222222222222 2000: + 30.0
11.38 winsorizedMean
Syntax

Purpose
winsorizedMean
is based on cern.jet.stat.Descriptive.winsorizedMean(DoubleArrayList sortedData, double mean, int left, int right)
. It returns the winsorized mean of a sorted data sequence as a double
.
This function takes the following tuple arguments:
-
double1
: data value. -
int1
:left
. -
int2
:right
.
For more information, see:
Examples
Consider the query qColtAggr60
. Given the data stream SColtAggrFunc
with schema (c1 integer, c2 float, c3 double, c4 bigint)
, the query returns the relation.
<query id="qColtAggr60"><![CDATA[ select winsorizedMean(c3, c1, c1) from SColtAggrFunc ]]></query>
Timestamp Tuple 10 1, 0.5, 40.0, 8 1000 0, 0.7, 30.0, 6 1200 1, 0.89, 20.0, 12 2000 1, 0.4, 10.0, 4 h 8000
Timestamp Tuple Kind Tuple -9223372036854775808:+ 10: - 10: + 40.0 1000: - 40.0 1000: + 35.0 1200: - 35.0 1200: + 30.000000000000004 2000: - 30.000000000000004 2000: + 25