From 4d3cb21f007694ab7441a9f8a559424432b05b49 Mon Sep 17 00:00:00 2001
From: Ashwini Ahire <124853365+ashwini-ahire7@users.noreply.github.com>
Date: Sat, 21 Sep 2024 17:17:40 +0800
Subject: [PATCH] Update state-and-merge-combinators.md

fixed description
---
 .../state-and-merge-combinators.md | 27 +++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/content/en/altinity-kb-queries-and-syntax/state-and-merge-combinators.md b/content/en/altinity-kb-queries-and-syntax/state-and-merge-combinators.md
index 7334ffbed3..9741685144 100644
--- a/content/en/altinity-kb-queries-and-syntax/state-and-merge-combinators.md
+++ b/content/en/altinity-kb-queries-and-syntax/state-and-merge-combinators.md
@@ -4,7 +4,11 @@ linkTitle: "-State & -Merge combinators"
 description: >
   -State & -Merge combinators
 ---
-The ClickHouse® -State combinator doesn't actually store information about -If combinator, so aggregate functions with -If and without have the same serialized data.
+
+The -State combinator in ClickHouse® does not store additional information about the -If combinator, which means that aggregate functions with and without -If have the same serialized data structure. This can be verified through various examples, as demonstrated below.
+
+**Example 1**: maxIfState and maxState
+In this example, we use the maxIfState and maxState functions on a dataset of numbers, serialize the result, and merge it using the maxMerge function.
 
 ```sql
 $ clickhouse-local --query "SELECT maxIfState(number,number % 2) as x, maxState(number) as y FROM numbers(10) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(max,UInt64), y AggregateFunction(max,UInt64)" --query "SELECT maxMerge(x), maxMerge(y) FROM table"
@@ -13,7 +17,11 @@ $ clickhouse-local --query "SELECT maxIfState(number,number % 2) as x, maxState(
 9	10
 ```
 
--State combinator have the same serialized data footprint regardless of parameters used in definition of aggregate function. That's true for quantile\* and sequenceMatch/sequenceCount functions.
+In both cases, the -State combinator results in identical serialized data footprints, regardless of the conditions in the -If variant. The maxMerge function merges the state without concern for the original -If condition.
+
+**Example 2**: quantilesTDigestIfState
+Here, we use the quantilesTDigestIfState function to demonstrate that functions like quantile-based and sequence matching functions follow the same principle regarding serialized data consistency.
+
 
 ```sql
 $ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,number % 2) FROM numbers(1000000) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(quantileTDigestWeighted(0.5),UInt64,UInt8)" --query "SELECT quantileTDigestWeightedMerge(0.4)(x) FROM table"
@@ -22,6 +30,12 @@ $ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,numbe
 $ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,number % 2) FROM numbers(1000000) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(quantilesTDigestWeighted(0.5),UInt64,UInt8)" --query "SELECT quantilesTDigestWeightedMerge(0.4,0.8)(x) FROM table"
 [400000,800000]
 
+```
+
+**Example 3**: Quantile Functions with -Merge
+This example shows how the quantileState and quantileMerge functions work together to calculate a specific quantile.
+
+```sql
 SELECT quantileMerge(0.9)(x)
 FROM
 (
@@ -34,6 +48,9 @@ FROM
 └───────────────────────┘
 ```
 
+**Example 4**: sequenceMatch and sequenceCount Functions with -Merge
+Finally, we demonstrate the behavior of sequenceMatchState and sequenceMatchMerge, as well as sequenceCountState and sequenceCountMerge, in ClickHouse.
+
 ```sql
 SELECT
     sequenceMatchMerge('(?2)(?3)')(x) AS `2_3`,
@@ -48,6 +65,11 @@ FROM
 ┌─2_3─┬─1_4─┬─1_2_3─┐
 │   1 │   1 │     0 │
 └─────┴─────┴───────┘
+```
+
+Similarly, sequenceCountState and sequenceCountMerge functions behave consistently when merging states:
+
+```sql
 
 SELECT
     sequenceCountMerge('(?1)(?2)')(x) AS `2_3`,
@@ -64,3 +86,4 @@ FROM
 │   3 │   0 │     2 │
 └─────┴─────┴───────┘
 ```
+ClickHouse's -State combinator stores serialized data in a consistent manner, irrespective of conditions used with -If. The same applies to a wide range of functions, including quantile and sequence-based functions. This behavior ensures that functions like maxMerge, quantileMerge, sequenceMatchMerge, and sequenceCountMerge work seamlessly, even across varied inputs.