ZetaSQL supports pipe query syntax, which is a simpler and more concise alternative to standard query syntax. Pipe syntax supports many of the same operators as standard syntax, and improves some areas of SQL query functionality.
Pipe syntax has the following key characteristics:
- Pipe syntax consists of a pipe and an angle bracket
|>
, an operator name, and any arguments:
|> operator_name argument_list
- Pipe operators can be added to the end of any valid query.
- Pipe operators can be applied in any order, any number of times.
- Pipe syntax works anywhere standard syntax is supported: in queries, views, table-valued functions (TVFs), and other contexts.
- Pipe syntax can be mixed with standard syntax in the same query. For example, subqueries can use different syntax from the parent query.
- A query can start with a
FROM
clause, and pipe operators can optionally be added after theFROM
clause.
Compare the following equivalent queries that count open tickets assigned to a user:
Standard syntax
SELECT component_id, COUNT(*)
FROM ticketing_system_table
WHERE
assignee_user.email = 'username@email.com'
AND status IN ('NEW', 'ASSIGNED', 'ACCEPTED')
GROUP BY component_id
ORDER BY component_id DESC;
Pipe syntax
FROM ticketing_system_table
|> WHERE
assignee_user.email = 'username@email.com'
AND status IN ('NEW', 'ASSIGNED', 'ACCEPTED')
|> AGGREGATE COUNT(*)
GROUP AND ORDER BY component_id DESC;
Pipe operators have the following semantic behavior:
- Each pipe operator performs a self-contained operation.
- A pipe operator consumes the input table passed to it through the pipe character and produces a new table as output.
- A pipe operator can reference only columns from its immediate input table. Columns from earlier in the same query aren't visible. Inside subqueries, correlated references to outer columns are still allowed.
In pipe syntax, a query can start with a standard FROM
clause
and use any standard FROM
syntax, including tables, joins, subqueries,
UNNEST
operations, and table-valued functions (TVFs). Table aliases can be
assigned to each input item using the AS alias
clause.
A query with only a FROM
clause, like FROM table_name
, is allowed in pipe
syntax and returns all rows from the table. For tables with columns,
FROM table_name
in pipe syntax is similar to
SELECT * FROM table_name
in standard syntax.
For value tables, FROM table_name
in
pipe syntax returns the row values without expanding fields, similar to
SELECT value FROM table_name AS value
in standard
syntax.
Examples
-- Return a table row that matches a condition. FROM table_name |> WHERE value_column IS NULL |> LIMIT 1;
-- Join tables in the FROM clause and then apply pipe operators. FROM Table1 AS t1 JOIN Table2 AS t2 USING (key) |> AGGREGATE SUM(t2.value) GROUP BY t1.key;
ZetaSQL supports the following pipe operators. For operators that correspond or relate to similar operations in standard syntax, the operator descriptions highlight similarities and differences and link to more detailed documentation on the corresponding syntax.
|> SELECT expression [[AS] alias] [, ...]
Description
Produces a new table with the listed columns, similar to the outermost
SELECT
clause in a table subquery in standard syntax.
Supports standard output modifiers like SELECT AS STRUCT
, and supports
window functions. Doesn't support aggregations or
anonymization.
In pipe syntax, the SELECT
operator in a query is optional. The SELECT
operator can be used near the end of a query to specify the list of output
columns. The final query result contains the columns returned from the last pipe
operator. If the SELECT
operator isn't used to select specific columns, the
output includes the full row, similar to what the
SELECT *
statement in standard syntax produces.
For value tables, the result is the
row value, without field expansion.
In pipe syntax, the SELECT
clause doesn't perform aggregation. Use the
AGGREGATE
operator instead.
For cases where SELECT
would be used in standard syntax to rearrange columns,
pipe syntax supports other operators:
- The
EXTEND
operator adds columns. - The
SET
operator updates the value of an existing column. - The
DROP
operator removes columns. - The
RENAME
operator renames columns.
Example
|> SELECT account_id AS Account
|> EXTEND expression [[AS] alias] [, ...]
Description
Propagates the existing table and adds a computed column, similar to
SELECT *, new_column
in standard syntax. Supports
window functions.
Examples
|> EXTEND status IN ('NEW', 'ASSIGNED', 'ACCEPTED') AS is_open
-- Window function, with OVER |> EXTEND SUM(val) OVER (ORDER BY k) AS val_over_k
|> SET column_name = expression [, ...]
Description
Replaces the value of a column in the current table, similar to
SELECT * REPLACE (expression AS column)
in standard syntax.
Each referenced column must exist exactly once in the input table.
After a SET
operation, the referenced top-level columns (like x
) are
updated, but table aliases (like t
) still refer to the original row values.
Therefore, t.x
will still refer to the original value.
Example
|> SET x = 5, y = CAST(y AS INT32)
|> DROP column_name [, ...]
Description
Removes listed columns from the current table, similar to
SELECT * EXCEPT (column)
in standard syntax. Each
referenced column must exist at least once in the input table.
After a DROP
operation, the referenced top-level columns (like x
) are
removed, but table aliases (like t
) still refer to the original row values.
Therefore, t.x
will still refer to the original value.
The DROP
operator doesn't correspond to the
DROP
statement in data definition language (DDL), which
deletes persistent schema objects.
Example
|> DROP account_id, user_id
|> RENAME old_column_name [AS] new_column_name [, ...]
Description
Renames specified columns. Each column to be renamed must exist exactly once in
the input table. The RENAME
operator can't rename value table fields,
pseudo-columns, range variables, or objects that aren't columns in the input
table.
After a RENAME
operation, the referenced top-level columns (like x
) are
renamed, but table aliases (like t
) still refer to the original row
values. Therefore, t.x
will still refer to the original value.
Example
|> RENAME last_name AS surname
|> AS alias
Description
Introduces a table alias for the input table, similar to applying the
AS alias
clause on a table subquery in standard syntax. Any
existing table aliases are removed and the new alias becomes the table alias for
all columns in the row.
The AS
operator can be useful after operators like
SELECT
, EXTEND
, or
AGGREGATE
that add columns but can't give table
aliases to them.
Example
|> SELECT x, y, z |> AS table_alias |> WHERE table_alias.y = 10
|> WHERE boolean_expression
Description
Filters the results of the input table. The WHERE
operator behaves the same
as the WHERE
clause in standard syntax.
In pipe syntax, the WHERE
operator also replaces the
HAVING
clause and QUALIFY
clause in
standard syntax. For example, after performing aggregation with the
AGGREGATE
operator, use the WHERE
operator
instead of the HAVING
clause. For window functions inside
a QUALIFY
clause, use window functions inside a WHERE
clause instead.
Example
|> WHERE assignee_user.email = 'username@email.com'
|> LIMIT count [OFFSET skip_rows]
Description
Limits the number of rows to return in a query, with an optional OFFSET
clause
to skip over rows. The LIMIT
operator behaves the same as the
LIMIT
and OFFSET
clause in standard syntax.
Examples
|> LIMIT 10
|> LIMIT 10 OFFSET 2
-- Full-table aggregation |> AGGREGATE aggregate_expression [[AS] alias] [, ...]
-- Aggregation with grouping |> AGGREGATE [aggregate_expression [[AS] alias] [, ...]] GROUP BY groupable_items [[AS] alias] [, ...]
-- Aggregation with grouping and shorthand ordering syntax |> AGGREGATE [aggregate_expression [order_suffix] [[AS] alias] [, ...]] GROUP [AND ORDER] BY groupable_item [order_suffix] [[AS] alias] [, ...] order_suffix: {ASC | DESC} [{NULLS FIRST | NULLS LAST}]
Description
Performs aggregation on data across grouped rows or an entire table. The
AGGREGATE
operator is similar to a query in standard syntax that contains a
GROUP BY
clause or a SELECT
list with
aggregate functions or both. In pipe syntax, the
GROUP BY
clause is part of the AGGREGATE
operator. Pipe syntax
doesn't support a standalone GROUP BY
operator.
Without the GROUP BY
clause, the AGGREGATE
operator performs full-table
aggregation and produces one output row.
With the GROUP BY
clause, the AGGREGATE
operator performs aggregation with
grouping, producing one row for each set of distinct values for the grouping
expressions.
The AGGREGATE
expression list corresponds to the aggregated expressions in a
SELECT
list in standard syntax. Each expression in the AGGREGATE
list must
include an aggregate function. Aggregate expressions can also include scalar
expressions (for example, sqrt(SUM(x*x))
). Column aliases can be assigned
using the AS
operator. Window functions aren't allowed,
but the EXTEND
operator can be used before the
AGGREGATE
operator to compute window functions.
The GROUP BY
clause in the AGGREGATE
operator corresponds to the GROUP BY
clause in standard syntax. Unlike in standard syntax, aliases can be assigned to
GROUP BY
items. Standard grouping operators like GROUPING SETS
, ROLLUP
,
and CUBE
are supported.
The output columns from the AGGREGATE
operator include all grouping columns
first, followed by all aggregate columns, using their assigned aliases as the
column names.
Unlike in standard syntax, grouping expressions aren't repeated across SELECT
and GROUP BY
clauses. In pipe syntax, the grouping expressions are listed
once, in the GROUP BY
clause, and are automatically included as output columns
for the AGGREGATE
operator.
Because output columns are fully specified by the AGGREGATE
operator, the
SELECT
operator isn't needed after the AGGREGATE
operator unless
you want to produce a list of columns different from the default.
Examples
-- Full-table aggregation |> AGGREGATE COUNT(*) AS row_count, SUM(num_users) AS total_users
-- Aggregation with grouping |> AGGREGATE COUNT(*) AS row_count, SUM(num_users) AS total_users, GROUP BY org_site, date
The following examples compare aggregation in standard syntax and in pipe syntax:
-- Aggregation in standard syntax SELECT id, EXTRACT(MONTH FROM date) AS month, SUM(value) AS total FROM table GROUP BY id, month
-- The same aggregation in pipe syntax FROM table |> AGGREGATE SUM(value) AS total GROUP BY id, EXTRACT(MONTH FROM date) AS month
The AGGREGATE
operator supports a shorthand ordering syntax, which is
equivalent to applying the ORDER BY
operator as part
of the AGGREGATE
operator without repeating the column list:
-- Aggregation with grouping and shorthand ordering syntax |> AGGREGATE [aggregate_expression [order_suffix] [[AS] alias] [, ...]] GROUP [AND ORDER] BY groupable_item [order_suffix] [[AS] alias] [, ...] order_suffix: {ASC | DESC} [{NULLS FIRST | NULLS LAST}]
The GROUP AND ORDER BY
clause is equivalent to an ORDER BY
clause on all
groupable_items
. By default, each groupable_item
is sorted in ascending
order with NULL
values first. Other ordering suffixes like DESC
or NULLS LAST
can be used for other orders.
Without the GROUP AND ORDER BY
clause, the ASC
or DESC
suffixes can be
added on individual columns in the GROUP BY
list or AGGREGATE
list or both.
The NULLS FIRST
and NULLS LAST
suffixes can be used to further modify NULL
sorting.
Adding these suffixes is equivalent to adding an ORDER BY
clause that includes
all of the suffixed columns with the suffixed grouping columns first, matching
the left-to-right output column order.
Examples
-- Order by all grouping columns. |> AGGREGATE COUNT(*) GROUP AND ORDER BY first_name, last_name DESC
The ordering in the previous example is equivalent to using
|> ORDER BY first_name, last_name DESC
.
-- Order by specified grouping and aggregate columns. |> AGGREGATE COUNT(*) DESC GROUP BY first_name, last_name ASC
The ordering in the previous example is equivalent to using
|> ORDER BY last_name ASC, COUNT(*) DESC
.
|> ORDER BY expression [sort_options] [, ...]
Description
Sorts results by a list of expressions. The ORDER BY
operator behaves the same
as the ORDER BY
clause in standard syntax. Suffixes like
ASC
, DESC
, and NULLS LAST
are supported for customizing the ordering for
each expression.
In pipe syntax, the AGGREGATE
operator also
supports shorthand ordering suffixes to
apply ORDER BY
behavior more concisely as part of aggregation.
Example
|> ORDER BY last_name DESC
|> [join_type] JOIN from_item [[AS] alias] [{on_clause | using_clause}]
Description
Joins rows from the input table with rows from a second table provided as an
argument. The JOIN
operator behaves the same as the
JOIN
operation in standard syntax. The input table is the
left side of the join and the JOIN
argument is the right side of the join.
Standard join inputs are supported, including tables, subqueries, UNNEST
operations, and table-valued function (TVF) calls. Standard join modifiers like
LEFT
, INNER
, and CROSS
are allowed before the JOIN
keyword.
An alias can be assigned to the input table on the right side of the join, but
not to the input table on the left side of the join. If an alias on the
input table is needed, perhaps to disambiguate columns in an
ON
expression, then an alias can be added using the
AS
operator before the JOIN
arguments.
Example
|> JOIN ticketing_system_table AS components ON bug_table.component_id = CAST(components.component_id AS int64)
|> CALL table_function (argument [, ...]) [[AS] alias]
Description
Calls a table-valued function (TVF), similar to table function calls in standard syntax.
TVFs in standard syntax can be called in the FROM
clause or in a JOIN
operation. These are both allowed in pipe syntax as well.
In pipe syntax, TVFs that take a table argument can also be called with the
CALL
operator. The first table argument comes from the input table and
must be omitted in the arguments. An optional table alias can be added for the
output table.
Multiple TVFs can be called sequentially without using nested subqueries.
Examples
|> CALL AddSuffix('*') |> CALL AddSuffix2(arg1, arg2, arg3)
The following examples compare a TVF call in standard syntax and in pipe syntax:
-- Call a TVF in standard syntax. FROM tvf( (SELECT * FROM table), arg1, arg2 )
-- Call the same TVF in pipe syntax. SELECT * FROM table |> CALL tvf(arg1, arg2)
|> WINDOW window_expression [[AS] alias] [, ...]
Description
Adds a column with the result of computing the function over some window of
existing rows, similar to calling window functions in a
SELECT
list in standard syntax. Existing rows and columns are unchanged. The
window expression must include a window function with an
OVER
clause.
The EXTEND
operator is recommended for window
functions instead of the WINDOW
operator because it also supports window
expressions and covers the same use cases.
Example
|> WINDOW SUM(val) OVER (ORDER BY k)
|> TABLESAMPLE sample_method (sample_size {PERCENT | ROWS}) [, ...]
Description
Selects a random sample of rows from the input table. The TABLESAMPLE
pipe
operator behaves the same as TABLESAMPLE
operator in
standard syntax.
Example
|> TABLESAMPLE BERNOULLI (0.1 PERCENT)
|> PIVOT (aggregate_expression FOR input_column IN (pivot_column [, ...])) [[AS] alias]
Description
Rotates rows into columns. The PIVOT
pipe operator behaves the same as the
PIVOT
operator in standard syntax.
Example
|> SELECT year, username, num_users |> PIVOT (SUM(num_users) FOR username IN ('Jeff', 'Jeffrey', 'Jeffery'))
|> UNPIVOT (values_column FOR name_column IN (column_to_unpivot [, ...])) [[AS] alias]
Description
Rotates columns into rows. The UNPIVOT
pipe operator behaves the same as the
UNPIVOT
operator in standard syntax.
Example
|> UNPIVOT (count FOR user_location IN (London, Bangalore, Madrid)) |> ORDER BY year, cnt
|> ASSERT expression [, payload_expression [, ...]]
Description
Evaluates an expression over all rows of an input table to verify that the expression is true or raise an assertion error if it's false.
The expression must evaluate to a boolean value. When the expression evaluates
to TRUE
, the input table passes through the ASSERT
operator unchanged. When
the expression evaluates to FALSE
or NULL
, the query fails with an
Assertion failed
error.
One or more optional payload expressions can be provided. If the assertion fails, the payload expression values are computed, converted to strings, and included in the error message, separated by spaces.
If no payload is provided, the error message includes the SQL text of the assertion expression.
The ASSERT
operator has no equivalent operation in standard syntax.
The ASSERT
statement is
a related feature that verifies that a single expression is true.
Example
FROM table |> ASSERT count != 0, "Count is zero for user", userId |> SELECT total / count AS average