SPARK-1325. The maven build error for Spark Tools #240

Closed
wants to merge 1 commit into from

srowen
Member

@srowen srowen commented Mar 26, 2014

This is just a slight variation on #234 and an alternative suggestion for SPARK-1325. `scala-actors` is not necessary. `SparkBuild.scala` should be updated to reflect the direct dependency on `scala-reflect` and `scala-compiler`. The `repl` build, which has the same dependencies, should also be consistent between Maven and SBT.
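
A minimal sketch of what a direct dependency on `scala-reflect` and `scala-compiler` could look like on the SBT side. This is not the actual change to `SparkBuild.scala`; it assumes sbt 0.13+ `build.sbt` syntax and pins both artifacts to the project's own `scalaVersion`.

```scala
// build.sbt fragment (illustrative only, not the patch itself).
// Declares scala-reflect and scala-compiler as direct dependencies,
// pinned to the Scala version the project is built with.
libraryDependencies ++= Seq(
  "org.scala-lang" % "scala-reflect"  % scalaVersion.value,
  "org.scala-lang" % "scala-compiler" % scalaVersion.value
)
```

Declaring the artifacts directly, rather than relying on another dependency to pull them in transitively, is one way to keep the SBT and Maven builds describing the same dependency set.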

… Update repl dependencies, which are similar, to be consistent between Maven / SBT in this regard too.
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

One or more automated tests failed
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13478/

@pwendell
Contributor

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13480/

@pwendell
Contributor

Thanks, I merged this. I cited @witgo as an author in the commit, since his original patch covered a lot of this.

@srowen
Member Author

srowen commented Mar 27, 2014

Yes @witgo deserves the credit of course.
I agree there is a version inconsistency here, although it's a separate issue. I was actually preparing a different PR to address a few things like that, so I can include this change there as well?

@witgo
Contributor

witgo commented Mar 27, 2014

Uh, creating a different PR is a good idea.

@srowen srowen deleted the SPARK-1325 branch April 3, 2014 13:13
jhartlaub referenced this pull request in jhartlaub/spark May 27, 2014
SPARK-917 Improve API links in nav bar
(cherry picked from commit 6494d62)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
This is just a slight variation on apache#234 and an alternative suggestion for SPARK-1325. `scala-actors` is not necessary. `SparkBuild.scala` should be updated to reflect the direct dependency on `scala-reflect` and `scala-compiler`. The `repl` build, which has the same dependencies, should also be consistent between Maven and SBT.

Author: Sean Owen <sowen@cloudera.com>
Author: witgo <witgo@qq.com>

Closes apache#240 from srowen/SPARK-1325 and squashes the following commits:

25bd7db [Sean Owen] Add necessary dependencies scala-reflect and scala-compiler to tools. Update repl dependencies, which are similar, to be consistent between Maven / SBT in this regard too.
liancheng pushed a commit to liancheng/spark that referenced this pull request Mar 17, 2017
…DatabricksSQLConf

## What changes were proposed in this pull request?

There were a number of hard-coded config params used for the _Directory Atomic Commit_ protocol implementation. We're hereby moving them to the newly created `DatabricksSQLConf` file, for better visibility and long-term maintainability.

## How was this patch tested?

`testOnly *DatabricksAtomicCommitProtocolSuite`

Author: Adrian Ionescu <adrian@databricks.com>

Closes apache#240 from adrian-ionescu/SC-5834.
mccheah pushed a commit to mccheah/spark that referenced this pull request Oct 12, 2017
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Add a task to build/push images
cloud-fan added a commit that referenced this pull request Jul 20, 2023
…n't be optimized

### What changes were proposed in this pull request?

For example:
```scala
sql("create view t(c1, c2) as values (0, 1), (0, 2), (1, 2)")

sql("select c1, c2, (select count(*) cnt from t t2 where t1.c1 = t2.c1 " +
"having cnt = 0) from t t1").show()
```
This throws the following error:
```
[PLAN_VALIDATION_FAILED_RULE_IN_BATCH] Rule org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery in batch Operator Optimization before Inferring Filters generated an invalid plan: The plan becomes unresolved: 'Project [toprettystring(c1#224, Some(America/Los_Angeles)) AS toprettystring(c1)#238, toprettystring(c2#225, Some(America/Los_Angeles)) AS toprettystring(c2)#239, toprettystring(cnt#246L, Some(America/Los_Angeles)) AS toprettystring(scalarsubquery(c1))#240]
+- 'Project [c1#224, c2#225, CASE WHEN isnull(alwaysTrue#245) THEN 0 WHEN NOT (cnt#222L = 0) THEN null ELSE cnt#222L END AS cnt#246L]
   +- 'Join LeftOuter, (c1#224 = c1#224#244)
      :- Project [col1#226 AS c1#224, col2#227 AS c2#225]
      :  +- LocalRelation [col1#226, col2#227]
      +- Project [cnt#222L, c1#224#244, cnt#222L, c1#224, true AS alwaysTrue#245]
         +- Project [cnt#222L, c1#224 AS c1#224#244, cnt#222L, c1#224]
            +- Aggregate [c1#224], [count(1) AS cnt#222L, c1#224]
               +- Project [col1#228 AS c1#224]
                  +- LocalRelation [col1#228, col2#229]
The previous plan: Project [toprettystring(c1#224, Some(America/Los_Angeles)) AS toprettystring(c1)#238, toprettystring(c2#225, Some(America/Los_Angeles)) AS toprettystring(c2)#239, toprettystring(scalar-subquery#223 [c1#224 && (c1#224 = c1#224#244)], Some(America/Los_Angeles)) AS toprettystring(scalarsubquery(c1))#240]
:  +- Project [cnt#222L, c1#224 AS c1#224#244]
:     +- Filter (cnt#222L = 0)
:        +- Aggregate [c1#224], [count(1) AS cnt#222L, c1#224]
:           +- Project [col1#228 AS c1#224]
:              +- LocalRelation [col1#228, col2#229]
+- Project [col1#226 AS c1#224, col2#227 AS c2#225]
   +- LocalRelation [col1#226, col2#227]
```

The error is caused by an unresolved expression in the `Join` node that subquery decorrelation generates. `duplicateResolved` on the `Join` node is false, which means the left and right sides of the `Join` share the same `Attribute`, in this example `c1#224`. The right-side `c1#224` `Attribute` is produced by the having inputs, because the having inputs are wrong.

This problem only occurs when the subquery contains a having clause.

This change also includes some code formatting fixes.
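
As a rough illustration of the result the example query is presumably meant to produce, the following plain-Scala sketch evaluates it by hand over the same three rows. It does not use Spark; `HavingSubquerySketch` and `scalarSubquery` are hypothetical names introduced here, and `None` stands in for SQL `NULL`.

```scala
// Plain-Scala sketch (no Spark involved) of the example query's semantics.
object HavingSubquerySketch {
  // Rows of view t from the example: (c1, c2)
  val t: Seq[(Int, Int)] = Seq((0, 1), (0, 2), (1, 2))

  // Correlated scalar subquery: count the rows of t whose c1 equals the
  // outer row's c1, then keep the count only if it passes `having cnt = 0`.
  def scalarSubquery(outerC1: Int): Option[Long] = {
    val cnt = t.count { case (c1, _) => c1 == outerC1 }.toLong
    if (cnt == 0L) Some(cnt) else None // None models SQL NULL
  }

  def main(args: Array[String]): Unit = {
    // Outer query: select c1, c2, (subquery) from t t1
    t.foreach { case (c1, c2) =>
      val sub = scalarSubquery(c1).map(_.toString).getOrElse("null")
      println(s"$c1, $c2, $sub")
    }
  }
}
```

Because every outer row matches at least itself, the count is never 0 for this data and the having filter rejects every group, so under this reading each row gets `null` in the third column. An outer value with no matching rows would yield 0, which lines up with the `CASE WHEN isnull(alwaysTrue#245) THEN 0 ...` branch in the plan above.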

### Why are the changes needed?
Fixes a subquery bug on a single table when a having clause is used.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added a new test.

Closes #41347 from Hisoka-X/SPARK-43838_subquery_having.

Lead-authored-by: Jia Fan <fanjiaeminem@qq.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
ragnarok56 pushed a commit to ragnarok56/spark that referenced this pull request Mar 2, 2024