Aggregations

Aggregations

Aggregations summarize all values in a column into a single result. They are used in nodes such as Calculated Column, Reduce table, Aggregate Rows, Filter Groups, and several measurement nodes.

You can either select aggregation statistics directly within the node, or some of the nodes provide a simplified JavaScript calculator. These allow you to work with column aggregations and combine them into more complex expressions. The expression language uses the form aggregation_function(colColumnTitle), for example mean(colObjectArea) calculates mean of values of the column titled Object Area. Note that column names must be prefixed with the col prefix. Calculator nodes typically include buttons that help you define aggregation functions correctly.

If the records in the table are grouped, the statistics are typically calculated for each group separately.

Suffix .P means the population version of the statistic. Suffix .S means the sample version.

Count

Total

Total number of values, counts all rows in the column, including null values. Calculator function is total(colColumnTitle).

Count

Non-null count, counts only rows with a non-null value. Calculator function is count(colColumnTitle).

Distinct

Distinct count, counts the number of unique values in the column and ignores null values. Calculator function is distinct(colColumnTitle).

Basic

Mean

Arithmetic mean, returns the average value. Ignores null values. Calculator function is mean(colColumnTitle).

Median

Median, returns the middle value after sorting. Ignores null values. Calculator function is median(colColumnTitle).

Mode

Mode, returns the most frequent value. Ignores null values.

Min

Minimum, returns the smallest value. Ignores null values. Calculator function is min(colColumnTitle).

Max

Maximum, returns the largest value. Ignores null values. Calculator function is max(colColumnTitle).

Sum

Sum, adds all values together. Ignores null values. Calculator function is sum(colColumnTitle).

First & last

First Valid

First non-null value, returns the first value that is not null and skips null values until it finds a valid one. Calculator function is coalesce(colColumnTitle).

First

First value, returns the first value in the column. Calculator function is first(colColumnTitle).

Last

Last value, returns the last value in the column. Calculator function is last(colColumnTitle).

Last Minus First

Difference between the first and the last value, subtracts the first value from the last value. Calculator expression is (last(colColumnTitle) - first(colColumnTitle)).

Variation and Error

StDev.P

Population standard deviation, measures spread using the population formula. Ignores null values. Calculator function is stdP(colColumnTitle).

StDev.S

Sample standard deviation, measures spread using the sample formula. Ignores null values. Calculator function is stdS(colColumnTitle).

Var.P

Population variance, returns the population variance. Ignores null values. Calculator function is varP(colColumnTitle).

Var.S

Sample variance, returns the sample variance. Ignores null values. Calculator function is varS(colColumnTitle).

VarCoef.P

Population coefficient of variation, returns standard deviation divided by mean using the population formula. Ignores null values. Calculator function is varcoefP(colColumnTitle).

VarCoef.S

Sample coefficient of variation, returns standard deviation divided by mean using the sample formula. Ignores null values. Calculator function is varcoefS(colColumnTitle).

StErr.P

Population standard error, returns the standard error using the population formula. Ignores null values. Calculator function is sterrP(colColumnTitle).

StErr.S

Sample standard error, returns the standard error using the sample formula. Ignores null values. Calculator function is sterrS(colColumnTitle).

RootMeanSquare.P

Population root mean square, returns the root mean square using the population formula. Ignores null values. Calculator function is rmsP(colColumnTitle).

RootMeanSquare.S

Sample root mean square, returns the root mean square using the sample formula. Ignores null values. Calculator function is rmsS(colColumnTitle).

Distribution shape and special

Skewness.P

Population skewness, measures asymmetry of the value distribution using the population formula. Ignores null values. Calculator function is skewP(colColumnTitle).

Skewness.S

Sample skewness, measures asymmetry of the value distribution using the sample formula. Ignores null values. Calculator function is skewS(colColumnTitle).

Kurtosis.P

Population kurtosis, measures tail heaviness or peakedness using the population formula. Ignores null values. Calculator function is kurtP(colColumnTitle).

Kurtosis.S

Sample kurtosis, measures tail heaviness or peakedness using the sample formula. Ignores null values. Calculator function is kurtS(colColumnTitle).

Entropy

Entropy, measures how spread or unpredictable the values are. Ignores null values. Calculator function is entropy(colColumnTitle).

Here, is the normalized frequency of bin .

Uniformity of Histogram

Histogram uniformity, measures how evenly the values are distributed. Ignores null values. Calculator function is histouniformity(colColumnTitle).

Here, is the normalized frequency of bin .

Array

Array of values, returns all values as an array-like result.

Quantiles

Quartile Q1

First quartile, returns the 25th percentile, the value below which 25% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.25).

Quartile Q3

Third quartile, returns the 75th percentile, the value below which 75% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.75).

Percentile P01

1st percentile, returns the value below which 1% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.01).

Percentile P05

5th percentile, returns the value below which 5% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.05).

Percentile P10

10th percentile, returns the value below which 10% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.10).

Percentile P20

20th percentile, returns the value below which 20% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.20).

Percentile P30

30th percentile, returns the value below which 30% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.30).

Percentile P40

40th percentile, returns the value below which 40% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.40).

Percentile P60

60th percentile, returns the value below which 60% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.60).

Percentile P70

70th percentile, returns the value below which 70% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.70).

Percentile P80

80th percentile, returns the value below which 80% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.80).

Percentile P90

90th percentile, returns the value below which 90% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.90).

Percentile P95

95th percentile, returns the value below which 95% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.95).

Percentile P99

99th percentile, returns the value below which 99% of values fall. Ignores null values. Calculator function is quantile(colColumnTitle, 0.99).