From 84c73b86cb1fc5ad46922019b0be347d0ea48a23 Mon Sep 17 00:00:00 2001
From: Phil Rzewski
Date: Mon, 12 Aug 2024 14:17:00 -0700
Subject: [PATCH 1/2] Add summarize docs for "by" without an aggregate function

---
 docs/language/operators/summarize.md | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/docs/language/operators/summarize.md b/docs/language/operators/summarize.md
index 43e463f80e..664fd28bc5 100644
--- a/docs/language/operators/summarize.md
+++ b/docs/language/operators/summarize.md
@@ -5,15 +5,21 @@
 ### Synopsis
 
 ```
-[summarize] [:=] [where ][, [:=] [where ] ...] [by [][:=] ...]
+[summarize] [:=] [where ][, [:=] [where ] ...] [by [][:=][, [][:=]] ...]
+[summarize] by [][:=][, [][:=] ...]
 ```
 
 ### Description
 
-The `summarize` operator consumes all of its input, applies an [aggregate function](../aggregates/README.md)
+In the first form, the `summarize` operator consumes all of its input,
+applies an [aggregate function](../aggregates/README.md)
 to each input value optionally organized with the group-by keys specified after
 the `by` keyword, and at the end of input produces one or more aggregations
 for each unique set of group-by key values.
 
+In the second form, `summarize` consumes all of its input, then outputs each
+unique combination of values of the group-by keys specified after the `by`
+keyword.
+
 The `summarize` keyword is optional since it is an
 [implied operator](../dataflow-model.md#implied-operators).
@@ -102,3 +108,14 @@ echo '{k:"foo",v:1}{k:"bar",v:2}{k:"foo",v:3}{k:"baz",v:4}' | zq -z 'set:=union(
 {key:"baz",set:|[4]|}
 {key:"foo",set:|[3]|}
 ```
+
+Output just the unique key values:
+```mdtest-command
+echo '{k:"foo",v:1}{k:"bar",v:2}{k:"foo",v:3}{k:"baz",v:4}' | zq -z 'by k' - | sort
+```
+=>
+```mdtest-output
+{k:"bar"}
+{k:"baz"}
+{k:"foo"}
+```

From a20e91ba187b61fbb5da0e2de946e096bf6d2d50 Mon Sep 17 00:00:00 2001
From: Phil Rzewski
Date: Tue, 13 Aug 2024 11:18:41 -0700
Subject: [PATCH 2/2] Break out synopsis to more lines

---
 docs/language/operators/summarize.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/docs/language/operators/summarize.md b/docs/language/operators/summarize.md
index 664fd28bc5..1ea278a518 100644
--- a/docs/language/operators/summarize.md
+++ b/docs/language/operators/summarize.md
@@ -5,18 +5,21 @@
 ### Synopsis
 
 ```
+[summarize] [:=]
+[summarize] [:=] [where ][, [:=] [where ] ...]
+[summarize] [:=] [by [][:=][, [][:=]] ...]
 [summarize] [:=] [where ][, [:=] [where ] ...] [by [][:=][, [][:=]] ...]
 [summarize] by [][:=][, [][:=] ...]
 ```
 
 ### Description
 
-In the first form, the `summarize` operator consumes all of its input,
-applies an [aggregate function](../aggregates/README.md)
-to each input value optionally organized with the group-by keys specified after
-the `by` keyword, and at the end of input produces one or more aggregations
-for each unique set of group-by key values.
+In the first four forms, the `summarize` operator consumes all of its input,
+applies an [aggregate function](../aggregates/README.md) to each input value
+optionally filtered by a `where` clause and/or organized with the group-by
+keys specified after the `by` keyword, and at the end of input produces one
+or more aggregations for each unique set of group-by key values.
 
-In the second form, `summarize` consumes all of its input, then outputs each
+In the final form, `summarize` consumes all of its input, then outputs each
 unique combination of values of the group-by keys specified after the `by`
 keyword.
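
The difference between the aggregate forms and the final `by`-only form described in the patch can be illustrated with a rough Python model. This is only a sketch of the documented behavior, not how `zq` implements `summarize`; the helper names `summarize_by` and `summarize_union_by` are hypothetical, and the second one assumes `union()` collects the distinct values of a field into a set per group.

```python
# Input records mirroring the echo'd values in the patch's mdtest example.
records = [
    {"k": "foo", "v": 1},
    {"k": "bar", "v": 2},
    {"k": "foo", "v": 3},
    {"k": "baz", "v": 4},
]


def summarize_by(records, keys):
    """Model of the final form: consume all input, then emit each unique
    combination of the group-by key values (hypothetical helper)."""
    seen = {}
    for rec in records:
        combo = tuple(rec[key] for key in keys)
        # Keep only the first record shape seen for each key combination.
        seen.setdefault(combo, {key: rec[key] for key in keys})
    return list(seen.values())


def summarize_union_by(records, field, keys):
    """Model of an aggregate form: one aggregation per unique key
    combination, here a set of distinct `field` values (assumed to
    approximate union(); hypothetical helper)."""
    groups = {}
    for rec in records:
        combo = tuple(rec[key] for key in keys)
        groups.setdefault(combo, set()).add(rec[field])
    results = []
    for combo, values in groups.items():
        row = dict(zip(keys, combo))
        row["set"] = values
        results.append(row)
    return results


if __name__ == "__main__":
    # Sorted to mirror the `| sort` in the documented command.
    for row in sorted(summarize_by(records, ["k"]), key=lambda r: r["k"]):
        print(row)
    for row in summarize_union_by(records, "v", ["k"]):
        print(row)
```

In this model the `by`-only form produces one record per distinct key and carries no aggregate field, which matches the `{k:"bar"}`, `{k:"baz"}`, `{k:"foo"}` output shown in the added mdtest example.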