[docs] Batch19 SQL aggregation functions #17658

Merged
merged 13 commits into from
Feb 5, 2025
Batch19 SQL aggregation functions
writer-jill committed Jan 23, 2025
commit a3f3e4407350f8d0de6de30a3e92233fc0eb7a11
67 changes: 61 additions & 6 deletions docs/querying/sql-functions.md
@@ -371,19 +371,74 @@ Returns the bitwise exclusive OR between the two expressions, that is, `expr1 ^

## BLOOM_FILTER

`BLOOM_FILTER(expr, <NUMERIC>)`
Computes a [Bloom filter](../development/extensions-core/bloom-filter.md) from values provided in an expression.

**Function type:** [Aggregation](sql-aggregations.md)
`numEntries` specifies the maximum number of distinct values the filter can hold before its false positive rate increases.

* **Syntax:** `BLOOM_FILTER(expr, numEntries)`
* **Function type:** SQL aggregation

<details><summary>Example</summary>

The following example returns a Base64-encoded Bloom filter string for each entry in `agent_category`,
with a maximum of 10 distinct values per filter:

```sql
SELECT
  agent_category,
  BLOOM_FILTER(agent_category, 10) AS bloom
FROM "kttm"
GROUP BY agent_category
```

Returns the following:

| `agent_category` | `bloom` |
| -- | -- |
| _`empty`_ | `"BAAAAAgAAAAAABAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEABAAAAAA"` |
| `Game console` | `"BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA"` |
| `Personal computer` | `"BAAAAAgAAAAAAEAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAA"` |
| `Smart TV` | `"BAAAAAgAAAAAAAAAAAAAgAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAA"` |
| `Smartphone` | `"BAAAAAgAAACAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAA"` |
| `Tablet` | `"BAAAAAgAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAIA"` |

</details>

Computes a Bloom filter from values produced by the specified expression.
[Learn more](sql-aggregations.md)

## BLOOM_FILTER_TEST

`BLOOM_FILTER_TEST(expr, <STRING>)`
Returns true if the value of an expression is contained in a Base64-encoded [Bloom filter](../development/extensions-core/bloom-filter.md) string.

**Function type:** [Scalar, other](sql-scalar.md#other-scalar-functions)
* **Syntax:** `BLOOM_FILTER_TEST(expr, <STRING>)`
* **Function type:** Scalar, other

<details><summary>Example</summary>

The following example tests each `agent_category` entry against the Bloom filter string generated for `Game console`, and returns `true` for the matching entry:

```sql
SELECT
  agent_category,
  BLOOM_FILTER_TEST(agent_category, 'BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA') AS bloom
FROM "kttm"
GROUP BY agent_category
```

Returns the following:

| `agent_category` | `bloom` |
| -- | -- |
| _`empty`_ | `false` |
| `Game console` | `true` |
| `Personal computer` | `false` |
| `Smart TV` | `false` |
| `Smartphone` | `false` |
| `Tablet` | `false` |

</details>

Returns true if the expression is contained in a Base64-serialized Bloom filter.
[Learn more](sql-scalar.md#other-scalar-functions)
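
Because `BLOOM_FILTER_TEST` returns a boolean for each row, it can also serve as a row filter in a `WHERE` clause. The following query is a minimal sketch of that pattern, assuming the same `kttm` datasource and reusing the `Game console` filter string from the example above. The `matching_rows` alias is illustrative only. Bloom filters can report false positives, so treat the count as an upper bound on the true number of matching rows:

```sql
-- Count rows whose agent_category may be contained in the Bloom filter.
-- The filter string is the Base64 value shown for `Game console` above.
SELECT
  COUNT(*) AS matching_rows
FROM "kttm"
WHERE BLOOM_FILTER_TEST(agent_category, 'BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA')
```

Filtering on the serialized filter avoids listing every accepted value in the query text, at the cost of occasional false positives.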

## BTRIM