Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: unbind expression #4536

Closed
gforsyth opened this issue Sep 19, 2022 · 2 comments · Fixed by #4844
Closed

feat: unbind expression #4536

gforsyth opened this issue Sep 19, 2022 · 2 comments · Fixed by #4844

Comments

@gforsyth
Copy link
Member

I've run into the situation a few times where I have an existing Ibis expression that I pass to some function and it would work except that I am not using unbound tables and so that operation fails.

This just happened to me with @kszucs 's decompiler PR, but also happens when using Ibis-substrait.

I also think that, outside of those possibly uncommon use-cases, often we have a schema already defined in a backend somewhere and we want the equivalent unbound table -- this isn't terribly hard to do, but it's annoying if you don't notice until you are several steps deep.

So, how about an unbind() for Expressions that returns the same expression but with UnboundTable in place of DatabaseTable

Then con.execute(expr.unbind()) == expr.execute()

Some other thoughts / ideas / edge-cases:
What to do with memtable? Should it return the unbound expression AND the data underlying the memtable?

Should all of the backends return either data or sufficient metadata to reload the table in question?
(That should probably be behind a feature flag, I think, or the execute(expr.unbind()) thing falls down.

cc @saulpw @cpcloud

@gforsyth
Copy link
Member Author

For those backends that have parse_type defined (a bunch of them...?) this should be walking the expression graph and subbing in unbound tables a la:

def unbound_from_bound(table):
    return ibis.table(
        list(zip(table.columns, map(parse_type, table.dtypes))), name=table.alias
    )

@kszucs
Copy link
Member

kszucs commented Sep 19, 2022

I like the idea, also it's kinda a prerequisite to properly serialize a bound expression.

Lowering a DatabaseTable should be straightforward since we know its name and its ibis schema already. Regarding memtables I don't think we can do anything else than leaving them as memtables - they are bound to no backends.

Something like the following could work (haven't tried it):

import ibis.expr.analysis as an

def unbind(op):
    if isinstance(op, ops.DatabaseTable):
        return ops.UnboundTable(name=op.name, schema=op.schema)
    else:
        return op

unbound_node = an.substitute(unbind, bound_node)

gforsyth added a commit to gforsyth/ibis that referenced this issue Nov 15, 2022
Calling `unbind()` on an expression returns an equivalent expression but
with all references to backend-specific tables, e.g. `AlchemyTable`,
`PandasTable`... translated into `UnboundTable`.

Should help with serialization of expressions and ease execution of
expressions across backends.

Resolves ibis-project#4536
gforsyth added a commit to gforsyth/ibis that referenced this issue Nov 15, 2022
Calling `unbind()` on an expression returns an equivalent expression but
with all references to backend-specific tables, e.g. `AlchemyTable`,
`PandasTable`... translated into `UnboundTable`.

Should help with serialization of expressions and ease execution of
expressions across backends.

Resolves ibis-project#4536
cpcloud pushed a commit that referenced this issue Nov 16, 2022
Calling `unbind()` on an expression returns an equivalent expression but
with all references to backend-specific tables, e.g. `AlchemyTable`,
`PandasTable`... translated into `UnboundTable`.

Should help with serialization of expressions and ease execution of
expressions across backends.

Resolves #4536
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants