[SPARK-13838][SQL] Clear variable code to prevent it to be re-evaluated in BoundAttribute #11674

Closed
wants to merge 1 commit into from

viirya
Member

@viirya viirya commented Mar 12, 2016

JIRA: https://issues.apache.org/jira/browse/SPARK-13838

What changes were proposed in this pull request?

We should also clear the variable code in BoundReference.genCode to prevent it from being evaluated twice, as we already do in evaluateVariables.
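
The idea can be illustrated with a minimal sketch. It does not use Spark's real classes: `ExprCode` here is a simplified stand-in, and `takeCode` is a hypothetical helper name chosen for illustration. It only shows the pattern of handing a variable's pending code to the first consumer and clearing it so that later consumers paste nothing.

```scala
// Toy stand-in for a codegen variable; NOT Spark's real ExprCode.
final class ExprCode(var code: String, val isNull: String, val value: String)

object ClearCodeSketch {
  // Hypothetical helper: hands the pending evaluation code to the first
  // caller, then clears it, so later callers get an empty string and only
  // refer to the variable through `value`.
  def takeCode(v: ExprCode): String = {
    val code = v.code
    v.code = ""
    code
  }

  def main(args: Array[String]): Unit = {
    val id = new ExprCode("long value1 = input.getLong(0);", "false", "value1")
    val first = takeCode(id)  // the declaration, to be pasted once
    val second = takeCode(id) // empty: nothing is pasted a second time
    println(s"first:  '$first'")
    println(s"second: '$second'")
  }
}
```

Under this model, a second expression bound to the same input would reuse `value1` without emitting the declaration again.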

How was this patch tested?

Existing tests.

@viirya
Member Author

viirya commented Mar 12, 2016

cc @davies

@SparkQA

SparkQA commented Mar 12, 2016

Test build #52995 has finished for PR 11674 at commit 0068ff8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya viirya changed the title [SPARK-XXX][SQL] Clear variable code to prevent it to be re-evaluated in BoundAttribute [SPARK-13838][SQL] Clear variable code to prevent it to be re-evaluated in BoundAttribute Mar 13, 2016
@viirya
Member Author

viirya commented Mar 17, 2016

cc @davies This is tiny. Do you think this is useful?

@davies
Contributor

davies commented Mar 17, 2016

LGTM, merging into master.

@asfgit asfgit closed this in 5f3bda6 Mar 17, 2016
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
…ted in BoundAttribute

JIRA: https://issues.apache.org/jira/browse/SPARK-13838
## What changes were proposed in this pull request?

We should also clear the variable code in `BoundReference.genCode` to prevent it from being evaluated twice, as we already do in `evaluateVariables`.

## How was this patch tested?

Existing tests.

Author: Liang-Chi Hsieh <simonh@tw.ibm.com>

Closes apache#11674 from viirya/avoid-reevaluate.
@cloud-fan
Contributor

Who will re-evaluate ctx.currentVars?

@viirya
Member Author

viirya commented Nov 16, 2017

What if one variable is used as input to many expressions?

@cloud-fan
Contributor

For example, spark.range(10).select('id + 1 as 'i).filter('i + 'i < 4). When the filter operator consumes its input, it has already pre-evaluated i, so 'id + 1 is evaluated only once, IIUC.

@viirya
Member Author

viirya commented Nov 17, 2017

It is correct that we should always evaluate the used variables before generating the expression code. Once evaluated, the variables' code is cleared, so it won't be evaluated twice.

The change here is a safety guard for cases where that evaluation might be missed, I think.

@cloud-fan
Contributor

This is different from evaluateRequiredVariables: evaluateRequiredVariables pulls out the code to be evaluated and puts it at the beginning of the generated code. Here, however, we just clear the code, which looks unsafe. If we can't come up with a real case, shall we revert this?
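
One way to read the distinction is sketched below. This is a toy model under stated assumptions, not Spark's implementation: `Variable`, `evaluateThenClear`, and `clearOnly` are hypothetical names, and the Java snippets are plain strings. The first helper emits the pending code before clearing it; the second drops it, so the generated text would refer to a variable that was never declared.

```scala
// Toy stand-in for a codegen variable; NOT Spark's real ExprCode.
final class Variable(var code: String, val value: String)

object EvaluateVsClear {
  // Hypothetical helper: collect pending code so the caller can emit it
  // ahead of the expressions that use the variables, then clear it.
  def evaluateThenClear(vars: Seq[Variable]): String = {
    val pending = vars.map(_.code).filter(_.nonEmpty).mkString("\n")
    vars.foreach(v => v.code = "")
    pending
  }

  // Hypothetical helper: drop pending code without emitting it anywhere.
  def clearOnly(vars: Seq[Variable]): Unit =
    vars.foreach(v => v.code = "")

  def newVar(): Variable = new Variable("long i = id + 1;", "i")

  def main(args: Array[String]): Unit = {
    val a = newVar()
    // The declaration of `i` is emitted once, before its two uses.
    val safe = evaluateThenClear(Seq(a)) +
      s"\nboolean keep = ${a.value} + ${a.value} < 4;"

    val b = newVar()
    clearOnly(Seq(b))
    // The declaration was discarded, so `i` is used but never declared.
    val unsafe = b.code + s"boolean keep = ${b.value} + ${b.value} < 4;"

    println(safe)
    println(unsafe)
  }
}
```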

@viirya
Member Author

viirya commented Nov 17, 2017

Ok. I think it should be safe to revert this.

@cloud-fan
Contributor

Thanks! Since it's a small change, we can do it in future whole-stage-codegen-related PRs.

@viirya
Member Author

viirya commented Nov 17, 2017

Oh. I see. After looking at the source file at that time:

https://github.com/viirya/spark-1/blob/0068ff81bf9e90194ba9dc5631ac85683b9606f2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala#L68-L70

The genCode method directly returned the code of the used currentVars. I guess the returned code might previously have been pasted into the generated code, so it was better to clear the variable's code.

After several rounds of refactoring, this is no longer the case.
