[SPARK-48719][SQL][3.5] Fix the calculation bug of RegrSlope & RegrIntercept when the first parameter is null #47230

Closed
wants to merge 1 commit into from

Conversation

wayneguow
Contributor

What changes were proposed in this pull request?

This PR fixes a calculation bug in RegrSlope and RegrIntercept when the first parameter is null. A tuple should be filtered out whenever either the first parameter (y) or the second parameter (x) is null.
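
For illustration only, the intended semantics can be sketched in plain Python. This is not Spark's implementation: the helper names (`complete_pairs`, `regr_slope`, `regr_intercept`) are hypothetical, and returning `None` when there are no complete pairs or when x has zero variance is an assumption made for this sketch. The point it shows is that a (y, x) pair is dropped when either side is null, before any statistics are computed.

```python
from typing import List, Optional, Tuple

# A (y, x) pair, matching the argument order regr_slope(y, x).
Pair = Tuple[Optional[float], Optional[float]]


def complete_pairs(pairs: List[Pair]) -> List[Tuple[float, float]]:
    """Keep only pairs where BOTH y and x are non-null."""
    return [(y, x) for y, x in pairs if y is not None and x is not None]


def regr_slope(pairs: List[Pair]) -> Optional[float]:
    """Least-squares slope over the complete pairs (hypothetical helper)."""
    rows = complete_pairs(pairs)
    n = len(rows)
    if n == 0:
        return None  # assumption: no usable rows yields null
    mean_y = sum(y for y, _ in rows) / n
    mean_x = sum(x for _, x in rows) / n
    var_x = sum((x - mean_x) ** 2 for _, x in rows)
    if var_x == 0:
        return None  # assumption: slope is undefined when x is constant
    cov_xy = sum((x - mean_x) * (y - mean_y) for y, x in rows)
    return cov_xy / var_x


def regr_intercept(pairs: List[Pair]) -> Optional[float]:
    """Least-squares intercept over the same complete pairs."""
    slope = regr_slope(pairs)
    if slope is None:
        return None
    rows = complete_pairs(pairs)
    n = len(rows)
    mean_y = sum(y for y, _ in rows) / n
    mean_x = sum(x for _, x in rows) / n
    return mean_y - slope * mean_x


# Rows with a null y or a null x must not influence the result.
data: List[Pair] = [(1.0, 1.0), (3.0, 2.0), (None, 10.0), (5.0, 3.0), (7.0, None)]
slope = regr_slope(data)
intercept = regr_intercept(data)
```

With the sample data, only (1, 1), (3, 2) and (5, 3) survive the filter, so the fitted line is y = 2x - 1. If the row with a null y were kept, its x value of 10 would shift the mean of x and distort both results.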

Why are the changes needed?

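To fix a bug: tuples whose first parameter (y) is null were not being filtered out, which led to incorrect results.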

Does this PR introduce any user-facing change?

Yes. Results change when the first value of a tuple is null; the new results are the correct ones.

How was this patch tested?

Passes GA; also tested with build/sbt "~sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z linear-regression.sql"

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jul 5, 2024
@wayneguow
Contributor Author

wayneguow commented Jul 5, 2024

cc @cloud-fan, all checks have passed

@cloud-fan
Contributor

thanks, merging to 3.5!

cloud-fan pushed a commit that referenced this pull request Jul 8, 2024
…tercept when the first parameter is null

Closes #47230 from wayneguow/SPARK-48719_3_5.

Authored-by: Wei Guo <guow93@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan cloud-fan closed this Jul 8, 2024