Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(ir): avoid exponential growth on name attribute access #8445

Merged
merged 1 commit into from
Feb 26, 2024

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Feb 25, 2024

Fixes #8432. In short, the pattern we are using here for accessing name results in exponential growth of attribute accesses.

This commit changes the use of `hasattr` + `name` attribute access to
eagerly access the `name` attribute, store it and check for `None`.
Accessing the attribute twice may seem innocuous, but since names may be
generated recursively, this results in 2 ** n growth in the number of
accesses: once for the `hasattr` call, and once for the access if
`hasattr` succeeds. This can happen with large operation trees where
names are automatically generated.
@cpcloud cpcloud added this to the 9.0 milestone Feb 25, 2024
@cpcloud cpcloud added the performance Issues related to ibis's performance label Feb 25, 2024
@cpcloud cpcloud requested a review from kszucs February 25, 2024 14:10
@cpcloud cpcloud added the ci-run-cloud Run BigQuery, Snowflake, Databricks, and Athena backend tests label Feb 25, 2024
@ibis-docs-bot ibis-docs-bot bot removed the ci-run-cloud Run BigQuery, Snowflake, Databricks, and Athena backend tests label Feb 25, 2024
@kszucs kszucs changed the title perf: avoid exponential growth on name attribute access perf(ir): avoid exponential growth on name attribute access Feb 26, 2024
@@ -85,7 +85,11 @@ def __coerce__(
# TODO(kszucs): figure out how to represent not named arguments
@property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should turn this into @attribute in a follow-up since the .name property is accessed multiple times during both construction and compilation.

@kszucs kszucs merged commit 7667328 into ibis-project:main Feb 26, 2024
82 checks passed
for _ in range(n): # noqa: F402
x = x + 1

assert x.op().name.count("Add") == n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpcloud do we want to add some way of checking for recursion explosion? Maybe use n=200 and then test this doesn't take longer than 3 seconds to compute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't finish in any reasonable amount of time, so just having it as a standard unit test seems fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, I see, so you would just notice it during local dev, no automation needed. Sure that works.

@cpcloud cpcloud deleted the dot-name-perf branch February 27, 2024 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues related to ibis's performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: regression: slow to compute name of deeply nested expressions
3 participants