# Aggregate Functions

# min

Aggregate function that calculates the minimum across a group of values.

Example

SELECT min(salary) FROM employees;
SELECT department, min(salary) FROM employees GROUP BY department;

# max

Aggregate function that calculates the maximum across a group of values.

Example

SELECT max(salary) FROM employees;
SELECT department, max(salary) FROM employees GROUP BY department;

# sum

Calculates the sum. Only works for numbers.

Example:

SELECT sum(salary) FROM employees;

# avg

Calculates the arithmetic mean.

avg(x)

Arguments

x — input values, must be Integer, Float, or Decimal.

Returned value

  • The arithmetic mean, always as Float64.
  • NaN if the input parameter x is empty.

Example

SELECT avg(x) FROM values('x Int8', 0, 1, 2, 3, 4, 5);
┌─avg(x)─┐
│ 2.5    │
└────────┘
CREATE table test (t UInt8) ENGINE = Memory;
SELECT avg(t) FROM test;
┌─avg(x)─┐
│ nan    │
└────────┘

# quantile

Computes an approximate quantile of a numeric data sequence. This function applies reservoir sampling with a reservoir size up to 8192 and a random number generator for sampling. The result is non-deterministic. To get an exact quantile, use the quantileExact function.

When using multiple quantile* functions with different levels in a query, the internal states are not combined (that is, the query works less efficiently than it could). In this case, use the quantiles function.

Note that for an empty numeric sequence, quantile will return NaN, but its quantile* variants will return either NaN or a default value for the sequence type, depending on the variant.

quantile(level)(expr)

Alias:

median.

Arguments

  • level — Level of quantile. Optional parameter. Constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99]. Default value: 0.5. At level=0.5 the function calculates median.
  • expr — Expression over the column values resulting in numeric data types, Date or DateTime.

Returned value

  • Approximate quantile of the specified level.
  • Type:
    • Float64 for numeric data type input.
    • Date if input values have the Date type.
    • DateTime if input values have the DateTime type.

Example

CREATE table test (t UInt8) ENGINE = Memory;
INSERT INTO test values(1)(1)(2)(3)
SELECT quantile(t) FROM test
┌─quantile(t)───┐
│ 1.5           │
└───────────────┘

# entropy

  • Calculates Shannon entropy of a column of values.
  • Syntax
    • entropy(val)
  • Arguments
    • val — Column of values of any type.
  • Returned value
    • Shannon entropy.
    • Type: Float64.
  • Example
    • Query:
    CREATE TABLE entropy (`vals` UInt32,`strings` String) ENGINE = Memory;
    INSERT INTO entropy VALUES (1, 'A'), (1, 'A'), (1,'A'), (1,'A'), (2,'B'), (2,'B'), (2,'C'), (2,'D');
    SELECT entropy(vals), entropy(strings) FROM entropy;
    
    • Result:
    ┌─entropy(vals)─┬─entropy(strings)─┐│ 11.75 │└───────────────┴──────────────────┘