Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(bigquery): Syntax error when applying an empty window spec #9872

Open
1 task done
chelsea-lin opened this issue Aug 19, 2024 · 4 comments
Open
1 task done

bug(bigquery): Syntax error when applying an empty window spec #9872

chelsea-lin opened this issue Aug 19, 2024 · 4 comments
Assignees
Labels
bigquery The BigQuery backend bug Incorrect behavior inside of ibis window functions Issues or PRs related to window functions

Comments

@chelsea-lin
Copy link
Contributor

What happened?

In the BigQuery backend, when an empty window spec is applied, the generated SQL contains a syntax error. This leads to query failures and prevents the expected computation.

Step to reproduce

  1. Code:
bq = ibis.bigquery.connect(project_id="bigframes-dev")
t = bq.table("ibis_gbq_testing.functional_alltypes")
ibis.to_sql(t.int_col.quantile(0.5).over(ibis.window()), dialect="bigquery")
  1. Generated SQL (with error)
SELECT
  PERCENTILE_CONT(`t0`.`int_col`, 0.5) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Quantile_int_col_0_5`
FROM `bigframes-dev`.`ibis_gbq_testing`.`functional_alltypes` AS `t0`
  1. Error Message:
    Window framing clause is not allowed for analytic function percentile_cont at [2:46]

Expected Behavior
The generated SQL could omit the window framing clause when an empty window spec is provided.

SELECT
  PERCENTILE_CONT(`t0`.`int_col`, 0.5) OVER () AS `Quantile_int_col_0_5`
FROM `bigframes-dev`.`ibis_gbq_testing`.`functional_alltypes` AS `t0`

What version of ibis are you using?

9.2.0

What backend(s) are you using, if any?

BigQuery

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@chelsea-lin chelsea-lin added the bug Incorrect behavior inside of ibis label Aug 19, 2024
@jcrist
Copy link
Member

jcrist commented Aug 21, 2024

Thanks for opening this. I'm unable to reproduce - ibis doesn't define quantile for bigquery, running your code above results in an OperationNotDefinedError. Are you using a modified version of the bigquery compiler?

I'd missed that bigquery has support for percentile_cont/percentile_disc though - I think we could add support for quantile in certain contexts to make this work if that's the ask. Also note that we're adding support for approx_quantile (asked for for bigframes in #9541) in #9881.

@jcrist jcrist added the bigquery The BigQuery backend label Aug 21, 2024
@gforsyth gforsyth added the window functions Issues or PRs related to window functions label Aug 21, 2024
@chelsea-lin
Copy link
Contributor Author

@jcrist Thanks for the additional context! You're right, I should have mentioned I'm using a modified BigQuery compiler.
Initially, we attempted to use ibis.backends.sql.compilers.base.SQLGlotCompiler.visit_Quantile, but the generated SQL included WITHIN GROUP, which isn't valid in this context: PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY t0.level_0 NULLS LAST) OVER (). The expected SQL would be PERCENTILE_CONT(t0.level_0, 0.5) OVER (). As a workaround, we defined a simpler version:

    def visit_Quantile(self, op, *, arg, quantile, where):
        return sge.PercentileCont(this=arg, expression=quantile)

However, when using an empty window spec with this quantile function, we encounter the same issues described previously. I'm uncertain if this is the intended way to use this function in such a scenario.

@jcrist
Copy link
Member

jcrist commented Aug 21, 2024

Thanks for the context. I think we could add quantile support in bigquery with this, but it'd only be valid in certain contexts. Would that resolve your problem? Adding that natively in ibis would be a saner path forward than explaining what other bits of sqlglot/ibis rewrite rules/visit methods you need to hack to make this work. It's a rickety tower of compilation abstractions, but it works :).

@jcrist jcrist self-assigned this Aug 21, 2024
@chelsea-lin
Copy link
Contributor Author

chelsea-lin commented Aug 30, 2024

Thanks jcrist@! Thanks for your response. I agree, including quantile support is a definite improvement. I hadn't realized how complex it would be to manipulate the sqlglot/rewrite rules. On my end, I'm also considering modifying visit_WindowFunction by setting the spec to None while both start and end as None. However, I'm uncertain if this is the right solution.

    def visit_WindowFunction(self, op, *, how, func, start, end, group_by, order_by):
        if start is None and end is None:
            spec = None
        else:
            if start is None:
                start = {}
            if end is None:
                end = {}

            start_value = start.get("value", "UNBOUNDED")
            start_side = start.get("side", "PRECEDING")
            end_value = end.get("value", "UNBOUNDED")
            end_side = end.get("side", "FOLLOWING")

            if getattr(start_value, "this", None) == "0":
                start_value = "CURRENT ROW"
                start_side = None

            if getattr(end_value, "this", None) == "0":
                end_value = "CURRENT ROW"
                end_side = None

            spec = sge.WindowSpec(
                kind=how.upper(),
                start=start_value,
                start_side=start_side,
                end=end_value,
                end_side=end_side,
                over="OVER",
            )
            spec = self._minimize_spec(op.start, op.end, spec)

        order = sge.Order(expressions=order_by) if order_by else None

        return sge.Window(this=func, partition_by=group_by, order=order, spec=spec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bigquery The BigQuery backend bug Incorrect behavior inside of ibis window functions Issues or PRs related to window functions
Projects
Status: backlog
Development

No branches or pull requests

3 participants