feat: ibis.udf.analytic.builtin and ibis.udf.table.builtin to allow for more flexibility with engine-specific functions #7767

Open
1 task done
tswast opened this issue Dec 14, 2023 · 2 comments
Labels
feature (Features or general enhancements), udf (Issues related to user-defined functions)

Comments

@tswast
Collaborator

tswast commented Dec 14, 2023

Is your feature request related to a problem?

In the BigQuery DataFrames library we define several custom operations. These are mostly to expose BigQuery-specific SQL that didn't necessarily make sense to upstream to the BigQuery backend, though I'm sure some of them could find a home. See: https://github.com/googleapis/python-bigquery-dataframes/blob/main/third_party/bigframes_vendored/ibis/backends/bigquery/registry.py and https://github.com/googleapis/python-bigquery-dataframes/tree/main/third_party/bigframes_vendored/ibis/expr/operations

Breaking changes with the operations / expressions API make it difficult for us to keep up with the latest version of Ibis. See the struggle here: googleapis/python-bigquery-dataframes#53

Describe the solution you'd like

A stable way to call custom SQL as part of an ibis expression so that we can have an escape hatch where necessary.

What version of ibis are you running?

6.x

What backend(s) are you using, if any?

BigQuery

Code of Conduct

  • I agree to follow this project's Code of Conduct
@tswast
Collaborator Author

tswast commented Dec 15, 2023

Perhaps https://ibis-project.org/reference/scalar-udfs#ibis.expr.operations.udf.scalar.builtin already gets us part of the way there? Besides scalar functions, BigQuery DataFrames also defines:

  • analytic ops
  • aggregation ops (edit: I see we provide ibis.udf.agg.builtin for these already)
  • ops that return tables, coming soon

Edit: For some of these, it'd be nice to treat certain string arguments as column and/or table references. I'm not sure whether the existing Ibis UDF API supports defining such an argument type.

@tswast changed the title from "feat: stable API for defining custom operations, alternatively more hooks to call SQL methods directly" to "feat: ibis.udf.analytic.builtin and ibis.udf.table.builtin to allow for more flexibility with engine-specific functions" on Dec 15, 2023
@cpcloud
Member

cpcloud commented Dec 19, 2023

Thanks for the issue!

I think table UDFs need their own issue and some discussion about the API, especially the bits around getting the schema of the return value of the table UDF into ibis.

In most systems that support table UDFs there's really awful UX around providing this information. It tends to be extremely cumbersome to use, because any changes to the implementation require editing the schema. It's a hard problem to be sure, but I think we need to be thoughtful about how we expose table UDFs to end users.
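
To make the schema problem concrete, here is a purely hypothetical sketch in plain Python. It does not use any Ibis API, and `table_udf`, `TableUDF`, and `explode_tags` are invented names for illustration only. It shows the pattern described above: the output schema is declared by hand, separately from the implementation, so the two can drift apart and the mismatch only surfaces when the rows are checked.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class TableUDF:
    """Hypothetical wrapper pairing a row-producing function with a declared schema."""

    func: Callable[..., Iterable[dict]]
    schema: dict  # column name -> type name, restated by hand

    def __call__(self, *args):
        rows = list(self.func(*args))
        for row in rows:
            # The declared schema is the only thing a query planner could see,
            # so any row that disagrees with it is an error.
            if set(row) != set(self.schema):
                raise TypeError(
                    f"row columns {sorted(row)} do not match "
                    f"declared schema {sorted(self.schema)}"
                )
        return rows


def table_udf(schema):
    """Hypothetical decorator: attach a hand-written output schema to a function."""

    def wrap(func):
        return TableUDF(func, schema)

    return wrap


@table_udf(schema={"tag": "string", "position": "int64"})
def explode_tags(tags: str):
    # Adding or renaming a column here also requires editing the schema above.
    for i, tag in enumerate(tags.split(",")):
        yield {"tag": tag, "position": i}
```

Calling `explode_tags("a,b")` returns two rows that match the declared schema. If the implementation later yields an extra column without a matching schema edit, the call raises `TypeError`, which is the maintenance burden a table UDF API would need to reduce.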

@jcrist added the udf (Issues related to user-defined functions) label on Jan 26, 2024
Projects
Status: backlog
Development

No branches or pull requests

3 participants