Rewrite SQL builders, make substantial project updates. #586

Merged
merged 12 commits into from
Nov 15, 2024
48 changes: 17 additions & 31 deletions docs/api/sql/aggregate-functions.md
Original file line number Diff line number Diff line change
@@ -2,70 +2,56 @@

SQL aggregate function expressions.

## AggregateFunction {#aggregate-function}
## AggregateNode {#aggregate-node}

The `AggregateFunction` class represents an aggregate function.
It includes a non-null `aggregate` property indicating an aggregate expression.
Users should not need to instantiate `AggregateFunction` instances directly, but instead should use aggregate function methods such as [`count()`](#count), [`sum()`](#sum), _etc_.

### basis

`AggregateFunction.basis`

The `basis` property indicates an underlying table column that can serve as a selection target for this aggregate operation.


### label

`AggregateFunction.label`

The `label` property provides a descriptive text label.
The `AggregateNode` class represents a SQL AST node for an aggregate function call.
Users should not need to instantiate `AggregateNode` instances directly, but instead should use aggregate function methods such as [`count()`](#count), [`sum()`](#sum), _etc_.

### distinct

`AggregateFunction.distinct()`
`AggregateNode.distinct()`

Returns a new AggregateFunction instance that applies the aggregation over distinct values only.
Returns a new AggregateNode instance that applies the aggregation over distinct values only.

### where

`AggregateFunction.where(filter)`
`AggregateNode.where(filter)`

Returns a new AggregateFunction instance filtered according to a Boolean-valued _filter_ expression.
Returns a new AggregateNode instance filtered according to a Boolean-valued _filter_ expression.
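
The "returns a new instance" behavior of `distinct()` and `where()` can be illustrated with a small self-contained sketch. The `ToyAggregate` class below is hypothetical and is not the library's `AggregateNode`; it only assumes that distinct aggregation and filtering render as DuckDB-style `DISTINCT` and `FILTER (WHERE ...)` syntax.

```js
// Hypothetical stand-in for an aggregate AST node (not the library's class).
class ToyAggregate {
  constructor(name, arg, isDistinct = false, filter = null) {
    this.name = name;
    this.arg = arg;
    this.isDistinct = isDistinct;
    this.filter = filter;
  }

  // Return a new instance instead of mutating this one.
  distinct() {
    return new ToyAggregate(this.name, this.arg, true, this.filter);
  }

  // Return a new instance carrying the Boolean-valued filter.
  where(filter) {
    return new ToyAggregate(this.name, this.arg, this.isDistinct, filter);
  }

  toString() {
    const d = this.isDistinct ? "DISTINCT " : "";
    const f = this.filter ? ` FILTER (WHERE ${this.filter})` : "";
    return `${this.name}(${d}${this.arg})${f}`;
  }
}

const base = new ToyAggregate("count", '"species"');
const filtered = base.distinct().where('"year" > 2000');

console.log(String(base));
console.log(String(filtered));
```

Because each method returns a fresh object, `base` is unchanged after the chained calls and can be reused to derive other variants.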

### window

`AggregateFunction.window()`
`AggregateNode.window()`

Returns a windowed version of this aggregate function as a new [WindowFunction](./window-functions#window-function) instance.
Returns a windowed version of this aggregate function as a new [WindowNode](./window-functions#window-node) instance.

### partitionby

`AggregateFunction.partitionby(...expressions)`
`AggregateNode.partitionby(...expressions)`

Provide one or more _expressions_ by which to partition a windowed version of this aggregate function and returns a new [WindowFunction](./window-functions#window-function) instance.
Provides one or more _expressions_ by which to partition a windowed version of this aggregate function and returns a new [WindowNode](./window-functions#window-node) instance.

### orderby

`AggregateFunction.orderby(...expressions)`
`AggregateNode.orderby(...expressions)`

Provide one or more _expressions_ by which to sort a windowed version of this aggregate function and returns a new [WindowFunction](./window-functions#window-function) instance.
Provides one or more _expressions_ by which to sort a windowed version of this aggregate function and returns a new [WindowNode](./window-functions#window-node) instance.

### rows

`WindowFunction.rows(expression)`
`AggregateNode.rows(expression)`

Provide a window "rows" frame specification as an array or array-valued _expression_ and returns a windowed version of this aggregate function as a new [WindowFunction](./window-functions#window-function) instance.
Provides a window "rows" frame specification as an array or array-valued _expression_ and returns a windowed version of this aggregate function as a new [WindowNode](./window-functions#window-node) instance.
A "rows" window frame is insensitive to peer rows (those that are tied according to the [orderby](#orderby) criteria).
The frame expression should evaluate to a two-element array indicating the number of preceding or following rows.
A zero value (`0`) indicates the current row.
A non-finite value (including `null` and `undefined`) indicates either unbounded preceding rows (for the first array entry) or unbounded following rows (for the second array entry).
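
To make the frame rules above concrete, here is a minimal sketch of how a two-element frame array could map to a SQL `ROWS BETWEEN` clause. The helpers `frameBound` and `rowsFrame` are hypothetical names introduced for illustration only, and the sketch assumes non-negative row counts; the library's own code generation may differ.

```js
// Hypothetical helper: turn one frame entry into a SQL frame bound.
function frameBound(value, direction) {
  if (value === 0) return "CURRENT ROW";
  // null, undefined, NaN, and +/-Infinity are all non-finite here.
  if (!Number.isFinite(value)) return `UNBOUNDED ${direction}`;
  return `${value} ${direction}`;
}

// Hypothetical helper: first entry is preceding, second is following.
function rowsFrame([preceding, following]) {
  return `ROWS BETWEEN ${frameBound(preceding, "PRECEDING")}`
    + ` AND ${frameBound(following, "FOLLOWING")}`;
}

console.log(rowsFrame([2, 0]));        // two rows back through the current row
console.log(rowsFrame([null, 0]));     // everything up to the current row
console.log(rowsFrame([1, Infinity])); // one row back through the end
```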

### range

`WindowFunction.range(expression)`
`AggregateNode.range(expression)`

Provide a window "range" frame specification as an array or array-valued _expression_ and returns a windowed version of this aggregate function as a new [WindowFunction](./window-functions#window-function) instance.
Provides a window "range" frame specification as an array or array-valued _expression_ and returns a windowed version of this aggregate function as a new [WindowNode](./window-functions#window-node) instance.
A "range" window grows to include peer rows (those that are tied according to the [orderby](#orderby) criteria).
The frame expression should evaluate to a two-element array indicating the number of preceding or following rows.
A zero value (`0`) indicates the current row.
72 changes: 11 additions & 61 deletions docs/api/sql/expressions.md
@@ -1,88 +1,38 @@
# SQL Expressions

SQL expression builders.
SQL expression builders. All SQL expressions are represented in the form of an abstract syntax tree (AST). Helper methods and functions build out this tree.

## column

`column(name)`

Create an expression that references a column by _name_.
Create an expression AST node that references a column by _name_.
Upon string coercion, the column name will be properly quoted.
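
As an illustration of quoting on string coercion, the sketch below uses a hypothetical `ToyColumnRef` class (not the library's column node). It assumes standard SQL identifier quoting: wrap the name in double quotes and double any embedded double quote. Whether the library escapes embedded quotes in the same way is not stated here.

```js
// Hypothetical column reference whose toString() yields a quoted identifier.
class ToyColumnRef {
  constructor(name) {
    this.name = name;
  }

  // Invoked implicitly by template literals and String().
  toString() {
    return `"${this.name.replace(/"/g, '""')}"`;
  }
}

const ref = new ToyColumnRef("body mass");
console.log(`SELECT ${ref} FROM penguins`);
```

Quoting on coercion means a column name containing spaces or reserved words can be interpolated into a query string without manual escaping.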

## literal

`literal(value)`

Create an expression that references a literal _value_.
Create an expression AST node that references a literal _value_.
Upon string coercion, an appropriate SQL value will be produced.
For example, string literals will be properly quoted and JavaScript `Date` objects that match an exact UTC date will be converted to the SQL statement `MAKE_DATE(year, month, day)`.
For example, string literals will be properly quoted and JavaScript `Date` objects that match an exact UTC date will be converted to the SQL Date definitions.
The supported primitive types are: boolean, null, number, string, regexp, and Date (maps to SQL Date or Timestamp depending on the value).
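
The mapping from JavaScript values to SQL text can be sketched roughly as follows. The `toSQLLiteral` function is a hypothetical illustration, not the library's implementation: it emits standard SQL typed literals (`DATE '...'`, `TIMESTAMP '...'`) as an assumption, omits regexp handling, and does not claim to match the exact SQL the library generates.

```js
// Hypothetical conversion of a JavaScript value to SQL literal text.
function toSQLLiteral(value) {
  if (value === null || value === undefined) return "NULL";
  switch (typeof value) {
    case "boolean":
      return value ? "TRUE" : "FALSE";
    case "number":
      return String(value);
    case "string":
      // Quote the string and double any embedded single quotes.
      return `'${value.replace(/'/g, "''")}'`;
  }
  if (value instanceof Date) {
    const iso = value.toISOString(); // e.g. 2024-01-15T00:00:00.000Z
    // A Date with no UTC time-of-day component is treated as a plain date.
    return iso.endsWith("T00:00:00.000Z")
      ? `DATE '${iso.slice(0, 10)}'`
      : `TIMESTAMP '${iso.slice(0, 23).replace("T", " ")}'`;
  }
  throw new TypeError(`Unsupported literal type: ${typeof value}`);
}

console.log(toSQLLiteral("it's"));
console.log(toSQLLiteral(new Date(Date.UTC(2024, 0, 15))));
console.log(toSQLLiteral(new Date(Date.UTC(2024, 0, 15, 12, 30))));
```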

## sql

``sql`...` ``

A template tag for arbitrary SQL expressions.
A template tag for arbitrary SQL expressions that do not require deep analysis.
Creates an expression AST node with only a partially structured form consisting of unstructured text and interpolated values.
Interpolated values may be strings, other SQL expressions (such as [`column` references](#column) or [operators](./operators)), or [`Param`](../core/param) values.


The snippet below creates a dynamic expression that adds a Param value to a column. The resulting expression will track the column dependency and expose an [`addEventListener`](#addeventlistener) method for tracking param changes.
The snippet below creates a dynamic expression that adds a Param value to a column.
Contained column references can be extracted using the `collectColumns` method.
Contained Param values can be extracted using the `collectParams` method.

``` js
const param = Param.value(5);
sql`${column("foo")} + ${param}`
```

SQL expressions may be nested, in which case all nested column dependencies and parameter updates will propagate to the top-level expression.

## agg

``agg`...` ``

A template tag for aggregate SQL expressions.
This method is similar to [`sql`](#sql), but additionally annotates the resulting expression with an `aggregate` property to indicate that it is an aggregate expression.
This is valuable for helping downstream tools provide a cursory query analysis.

## SQLExpression

`new SQLExpression(spans, column, props)`

The `SQLExpression` class provides a structured object format for SQL expressions.
Typically you will not want to create an expression using the class constructor, but instead use more convenient, high-level methods such as those above.

The constructor takes three arguments:

- _parts_: an ordered array of expression components, which may include strings, sub-expressions, or Param values. When "concatenated" together, these parts should form the full expression.
- _columns_: an array of column name strings, indicating columns the expression depends on. Note that if a column is provided only via raw strings, that dependency will not be tracked.
- _props_: an optional object of key-value pairs with which to annotate the resulting expression object. For example, a non-null `aggregate` property will indicate an aggregate expression. Different expression generators may include different annotations to track state and simplify downstream analysis.

### columns

`SQLExpression.columns`

The `columns` property returns an array of tracked column dependencies.
The column list is de-duplicated, and includes dynamic dependencies that may be due to Param-valued expression components.

### column

`SQLExpression.column`

A convenience property for accessing the first column in the [`columns`](#columns) array, or undefined if there is no such column.

### annotate

`SQLExpression.annotate(props)`

Annotate this expression instance with additional properties and return this expression instance.

### toString

`SQLExpression.toString()`

Returns a SQL expression string based on the current state of this expression instance.

### addEventListener

`SQLExpression.addEventListener(type, callback)`

If an expression includes any Param values, it will expose this method.
Expression updates are broadcast using the `"value"` event type.
SQL expressions may be nested, in which case all nested column dependencies and parameter updates are still extractable via the collection visitors.
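
The idea of a partially structured node whose nested columns and params remain extractable can be sketched without the library. Everything below (`ToyColumn`, `ToyParam`, `ToyFragment`, `toySql`, `collect`) is hypothetical and only mirrors the concept; the real `sql` tag and collection utilities may have different shapes and signatures.

```js
// Hypothetical node types; the library's classes will differ.
class ToyColumn { constructor(name) { this.name = name; } }
class ToyParam { constructor(value) { this.value = value; } }
class ToyFragment { constructor(spans) { this.spans = spans; } }

// Template tag: interleave literal text with interpolated values.
function toySql(strings, ...values) {
  const spans = [];
  strings.forEach((text, i) => {
    if (text) spans.push(text);
    if (i < values.length) spans.push(values[i]);
  });
  return new ToyFragment(spans);
}

// Recursively visit spans, collecting instances of the given class.
function collect(node, type, found = []) {
  for (const span of node.spans) {
    if (span instanceof type) found.push(span);
    else if (span instanceof ToyFragment) collect(span, type, found);
  }
  return found;
}

const param = new ToyParam(5);
const inner = toySql`${new ToyColumn("foo")} + ${param}`;
const outer = toySql`(${inner}) * ${new ToyColumn("bar")}`;

console.log(collect(outer, ToyColumn).map((c) => c.name));
console.log(collect(outer, ToyParam).length);
```

Keeping interpolated values as objects rather than flattening them to text is what lets a visitor find the column `foo` and the param inside the nested fragment.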
6 changes: 6 additions & 0 deletions docs/api/sql/operators.md
@@ -99,3 +99,9 @@ Equivalent to `lo <= expression AND expression <= hi`.

Returns an expression testing if the input _expression_ does not lie between the values _lo_ and _hi_, provided as a two-element array.
Equivalent to `NOT(lo <= expression AND expression <= hi)`.

## isIn

`isIn(expression, values)`

Returns an expression testing if the input _expression_ matches any of the entries in the _values_ array. Maps to `expression IN (...values)`.
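
A minimal sketch of the `IN` mapping, using a hypothetical `toyIsIn` helper that works on SQL text that is already quoted. It is not the library's `isIn`, which builds an AST node, and it does not handle an empty _values_ array.

```js
// Hypothetical string-level version of the IN mapping.
function toyIsIn(expression, values) {
  return `${expression} IN (${values.join(", ")})`;
}

console.log(toyIsIn('"species"', ["'Adelie'", "'Gentoo'"]));
```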
56 changes: 28 additions & 28 deletions docs/api/sql/queries.md
@@ -40,10 +40,10 @@ To learn more about the anatomy of a query, take a look at the [DuckDB Select st

## Query

The `Query` and related `SetOperation` classes provide structured representations of SQL queries.
The top-level `Query` class, along with its concrete `SelectQuery` and `SetOperation` subclasses, provides structured representations of SQL queries.
Upon string coercion, these objects produce a complete SQL query string.

The following static methods create a new `Query` and invoke the corresponding method:
The following static methods create a new `SelectQuery` and invoke the corresponding method:

- `Query.select()`: See the [`select`](#select) method below.
- `Query.from()`: See the [`from`](#from) method below.
@@ -62,20 +62,32 @@ To instead create a query for metadata (column names and types), pass a query to

## clone

`query.clone()`
`Query.clone()`

Return a new query that is a shallow copy of the current instance.

## subqueries

`Query.subqueries`

The `subqueries` getter property returns an array of subquery instances, or an empty array if there are no subqueries. For selection queries, the subqueries may include common table expressions within `WITH` or nested queries within `FROM`. For set operations, the subqueries are the set operation arguments.
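
One use of a `subqueries` array is walking a query tree. The sketch below uses plain objects as hypothetical stand-ins for query instances (only a `name` label and a `subqueries` array are assumed) and a hypothetical `walk` helper.

```js
// Hypothetical query-like objects; only the `subqueries` shape is assumed.
const left = { name: "left", subqueries: [] };
const right = { name: "right", subqueries: [] };
const union = { name: "union", subqueries: [left, right] }; // set operation arguments
const outer = { name: "outer", subqueries: [union] };       // nested query in FROM

// Depth-first traversal over a query and all of its subqueries.
function walk(query, visit) {
  visit(query);
  for (const sub of query.subqueries) walk(sub, visit);
}

const names = [];
walk(outer, (q) => names.push(q.name));
console.log(names.join(" > "));
```

Because leaf queries return an empty array rather than `undefined`, the traversal needs no special case for queries without subqueries.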

## toString

`Query.toString()`

Coerce this query object to a SQL query string.

## select

`query.select(...expressions)`
`SelectQuery.select(...expressions)`

Select columns and return this query instance.
The _expressions_ argument may include column name strings, [`column` references](./expressions#column), and maps from column names to expressions (either as JavaScript `object` values or nested key-value arrays as produced by `Object.entries`).

## from

`query.from(...tables)`
`SelectQuery.from(...tables)`

Indicate the tables to draw records from and return this query instance.
The _tables_ may be table name strings, queries or subquery expressions, and maps from table names to expressions (either as JavaScript `object` values or nested key-value arrays as produced by `Object.entries`).
@@ -89,13 +101,13 @@ The input _expressions_ should consist of one or more maps (as JavaScript `objec

## distinct

`query.distinct()`
`SelectQuery.distinct(value = true)`

Update the query to require `DISTINCT` values only and return this query instance.
Update the query to require `DISTINCT` values and return this query instance.

## sample

`query.sample(size, method)`
`SelectQuery.sample(size, method)`

Update the query to sample a subset of _rows_ and return this query instance.
If _size_ is a number between 0 and 1, it is interpreted as a percentage of the full dataset to sample.
@@ -105,69 +117,57 @@ See the [DuckDB Sample documentation](https://duckdb.org/docs/sql/samples) for m

## where

`query.where(...expressions)`
`SelectQuery.where(...expressions)`

Update the query to additionally filter by the provided predicate _expressions_ and return this query instance.
This method is additive: any previously defined filter criteria will still remain.

## groupby

`query.groupby(...expressions)`
`SelectQuery.groupby(...expressions)`

Update the query to additionally group by the provided _expressions_ and return this query instance.
This method is additive: any previously defined group by criteria will still remain.

## having

`query.having(...expressions)`
`SelectQuery.having(...expressions)`

Update the query to additionally filter aggregate results by the provided predicate _expressions_ and return this query instance.
Unlike `where` criteria, which are applied before an aggregation, the `having` criteria are applied to aggregated results.
This method is additive: any previously defined filter criteria will still remain.
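
The additive, chainable behavior of `where`, `groupby`, and `having` can be sketched with a toy builder. The `ToySelect` class is hypothetical and works on raw SQL strings; it assumes that accumulated predicates are combined with `AND`, and it is not the library's `SelectQuery`.

```js
// Hypothetical builder illustrating additive clauses and method chaining.
class ToySelect {
  constructor(columns, table) {
    this.columns = columns;
    this.table = table;
    this.filters = [];
    this.groups = [];
    this.havings = [];
  }

  // Each call appends to earlier criteria and returns this instance.
  where(...predicates) { this.filters.push(...predicates); return this; }
  groupby(...exprs) { this.groups.push(...exprs); return this; }
  having(...predicates) { this.havings.push(...predicates); return this; }

  toString() {
    const and = (list) => list.map((p) => `(${p})`).join(" AND ");
    const parts = [`SELECT ${this.columns.join(", ")}`, `FROM ${this.table}`];
    if (this.filters.length) parts.push(`WHERE ${and(this.filters)}`);
    if (this.groups.length) parts.push(`GROUP BY ${this.groups.join(", ")}`);
    if (this.havings.length) parts.push(`HAVING ${and(this.havings)}`);
    return parts.join(" ");
  }
}

const query = new ToySelect(["species", "count(*) AS n"], "penguins")
  .where("year >= 2008")
  .where("sex IS NOT NULL") // added to the earlier filter, not a replacement
  .groupby("species")
  .having("count(*) > 10"); // applied after aggregation

console.log(String(query));
```

The two `where` calls both survive in the output, while the `having` predicate lands after `GROUP BY`, reflecting the difference between filtering input rows and filtering aggregated results.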

## window

`query.window(...expressions)`
`SelectQuery.window(...expressions)`

Update the query with named window frame definitions and return this query instance.
The _expressions_ arguments should be JavaScript `object` values that map from window names to window frame definitions.
This method is additive: any previously defined windows will still remain.

## qualify

`query.qualify(...expressions)`
`SelectQuery.qualify(...expressions)`

Update the query to additionally filter windowed results by the provided predicate _expressions_ and return this query instance.
Use this method instead of `where` to filter the results of window operations.
This method is additive: any previously defined filter criteria will still remain.

## orderby

`query.orderby(...expressions)`
`SelectQuery.orderby(...expressions)`

Update the query to additionally order results by the provided _expressions_ and return this query instance.
This method is additive: any previously defined sort criteria will still remain.

## limit

`query.limit(rows)`
`SelectQuery.limit(rows)`

Update the query to limit results to the specified number of _rows_ and return this query instance.

## offset

`query.offset(rows)`
`SelectQuery.offset(rows)`

Update the query to offset the results by the specified number of _rows_ and return this query instance.
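
A common use of `limit` together with `offset` is pagination. The hypothetical `pageClause` helper below shows the arithmetic on plain SQL text, assuming a zero-based page index; it is an illustration rather than part of the library.

```js
// Hypothetical helper: compute LIMIT/OFFSET for a zero-based page index.
function pageClause(pageIndex, pageSize) {
  return `LIMIT ${pageSize} OFFSET ${pageIndex * pageSize}`;
}

console.log(pageClause(0, 20)); // first page
console.log(pageClause(2, 20)); // third page skips the first 40 rows
```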

## subqueries

`query.subqueries`

The `subqueries` getter property returns an array of subquery instances, or an empty array if there are no subqueries.

## toString

`query.toString()`

Coerce this query object to a SQL query string.