Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support scalar user-defined functions #58356

Closed
8 of 13 tasks
jordanlewis opened this issue Dec 29, 2020 · 9 comments
Closed
8 of 13 tasks

sql: support scalar user-defined functions #58356

jordanlewis opened this issue Dec 29, 2020 · 9 comments
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-tools-datadog A-tools-hasura A-tools-postgrest C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Dec 29, 2020

This issue tracks adding basic user-defined scalar SQL function support. A scalar user-defined function is a user-defined function that only uses scalar projections in a SELECT.

For example,

create function f (a int, b int) returns int as 'select a+b' language sql;

There are quite a few modifiers that can be specified on CREATE FUNCTION according to the PostgreSQL documentation: https://www.postgresql.org/docs/13/sql-createfunction.html

Checklist for things that we'll want to make sure of by the time this is finished:

  • CREATE FUNCTION
  • CREATE OR REPLACE FUNCTION
  • DROP FUNCTION
  • UDFs should be present in pg_proc and other such metadata tables
  • UDFs in views should get their dependencies tracked properly
  • cached plans need to notice when UDFs are modified
  • UDFs in computed columns should get their dependencies tracked properly
  • Make sure that user-defined functions with custom types are going to be okay
  • Make sure that ::regproc casts work properly both to and from oid/name. See sql,tree: make ::REGPROC cast efficient #61211.

Additionally, it turns out that to properly do name resolution for user-defined functions with overloads, there is significant extra complexity. We can't simply store each overload in the namespace table with its type parameters as part of the name, because that would require an uncachable range scan on every resolution to a given user-defined function.

So, what we need to do is to implement a 2-level descriptor tree for user-defined functions: the parent contains the name of the UDF, and the children contain the overload types and definition (and privileges, dependencies, etc). This is fairly irritating because leasing a user-defined function now requires pulling in all of the children overloads, and leasing them as well.

We'd like to follow the pattern that was used for leasing tables that contain user-defined types, which have a similar problem. The problem is solved with a special cache of user-defined types that is implemented in the pkg/sql/catalog/hydratedtables package.

Vague todo list:

  • flesh out the func overload descriptor protobuf
  • add logic to create both the parent and child upon CREATE FUNCTION
  • generalize the hydratedtables package to also work for func overload descriptors
  • thread a func overload descriptor cache into the table collection to cache the func descriptors upon retrieval of the parent function descriptor

Jira issue: CRDB-11401

@blathers-crl
Copy link

blathers-crl bot commented Dec 29, 2020

Hi @jordanlewis, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate.

While you're here, please consider adding an A- label to help keep our repository tidy.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Dec 29, 2020
@jordanlewis jordanlewis added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Dec 29, 2020
@rafiss
Copy link
Collaborator

rafiss commented Feb 5, 2021

I was just thinking of how there are many functions in pg_builtins.go that are implemented using the internal executor. This can get expensive (as in #57924) because the internal executor does not have the same descriptor collection as the external context. It would be cool if we could implement builtins using this!

@exalate-issue-sync exalate-issue-sync bot removed A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Jun 7, 2021
@ajwerner ajwerner added A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Aug 13, 2021
@ajwerner
Copy link
Contributor

ajwerner commented Aug 13, 2021

because the internal executor does not have the same descriptor collection as the external context. It would be cool if we could implement builtins using this!

If this is the root cause of some slowness, we should address this. It does not feel very hard. In fact, for subtle reasons, it feels like a correctness violation to not be using the transaction's collection.

@rafiss
Copy link
Collaborator

rafiss commented Aug 27, 2021

@ajwerner I agree. It was discussed in Slack here too: https://cockroachlabs.slack.com/archives/C0168LW5THS/p1611947857055600 The recommendation at that time was the the slowness should be addressed by not using the InternalExecutor at all, so that's why I stopped asking about making the InternalExecutor inherit the Collection.

Filed #69495 . I'd be glad if there's now an appetite to fix that.

@ajwerner
Copy link
Contributor

Coming back to this, I don't see how the internal executor behavior is related to this issue? My assumption has been that during planning time we'd take the definition of the function and we'd splat it inline to the AST and have the optimizer go to town. The idea that the execution of the function could happen off of a separate connExecutor gives me shivers.

@rafiss
Copy link
Collaborator

rafiss commented Aug 28, 2021

I agree, I'd want the UDF to be inlined as well.

I wasn't suggesting to use the internal executor for this. I was saying that implementing some builtins using the internal executor can be slow, so perhaps it could make more sense to implement those builtins using UDFs.

@exalate-issue-sync exalate-issue-sync bot removed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-pgcompat Semantic compatibility with PostgreSQL labels Oct 29, 2021
@rafiss rafiss added A-sql-pgcompat Semantic compatibility with PostgreSQL A-tools-hasura A-tools-postgrest C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Nov 22, 2021
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Nov 22, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 29, 2022
@mgartner
Copy link
Collaborator

mgartner commented Sep 1, 2022

Scalar UDFs will be available in 22.2, so I'm going to close this issue. Note that not all types of UDFs will be supported in 22.2 - please see other UDF issues to track progress on missing features.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-tools-datadog A-tools-hasura A-tools-postgrest C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
Archived in project
Development

No branches or pull requests

7 participants