ZetaSQL supports the following statistical aggregate functions.
CORR(X1, X2 [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the Pearson coefficient
of correlation of a set of number pairs. For each number pair, the first number
is the dependent variable and the second number is the independent variable.
The return result is between -1
and 1
. A result of 0
indicates no
correlation.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
COVAR_POP(X1, X2 [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the population covariance of
a set of number pairs. The first number is the dependent variable; the second
number is the independent variable. The return result is between -Inf
and
+Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any input pairs that contain one or more NULL values. If there is no input pair without NULL values, this function returns NULL. If there is exactly one input pair without NULL values, this function returns 0.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
COVAR_SAMP(X1, X2 [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the sample covariance of a
set of number pairs. The first number is the dependent variable; the second
number is the independent variable. The return result is between -Inf
and
+Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any input pairs that contain one or more NULL values. If there are fewer than two input pairs without NULL values, this function returns NULL.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
STDDEV_POP([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the population (biased) standard deviation of the values. The return
result is between 0
and +Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.
If this function receives a single non-NULL input, it returns 0
.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.DISTINCT
: Each distinct value ofexpression
is aggregated only once into the result.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
STDDEV_SAMP([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the sample (unbiased) standard deviation of the values. The return
result is between 0
and +Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.DISTINCT
: Each distinct value ofexpression
is aggregated only once into the result.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
STDDEV([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
An alias of STDDEV_SAMP.
VAR_POP([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the population (biased) variance of the values. The return result is
between 0
and +Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.
If this function receives a single non-NULL input, it returns 0
.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.DISTINCT
: Each distinct value ofexpression
is aggregated only once into the result.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
VAR_SAMP([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
Returns the sample (unbiased) variance of the values. The return result is
between 0
and +Inf
.
All numeric types are supported. If the
input is NUMERIC
or BIGNUMERIC
then the internal aggregation is
stable with the final output converted to a DOUBLE
.
Otherwise the input is converted to a DOUBLE
before aggregation, resulting in a potentially unstable result.
This function ignores any NULL inputs. If there are fewer than two non-NULL inputs, this function returns NULL.
Optional Clauses
The clauses are applied in the following order:
OVER
: Specifies a window. See Analytic Functions.DISTINCT
: Each distinct value ofexpression
is aggregated only once into the result.HAVING MAX
orHAVING MIN
: Restricts the set of rows that the function aggregates by a maximum or minimum value. See HAVING MAX and HAVING MIN clause for details.
Return Data Type
DOUBLE
VARIANCE([DISTINCT] expression [HAVING {MAX | MIN} expression2]) [OVER (...)]
Description
An alias of VAR_SAMP.