Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4292][SQL] Result set iterator bug in JDBC/ODBC #3149

Closed
wants to merge 3 commits into from

Conversation

scwf
Copy link
Contributor

@scwf scwf commented Nov 7, 2014

select * from src, get the wrong result set as follows:

...
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
...

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23039 has started for PR 3149 at commit f64eddf.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23039 has finished for PR 3149 at commit f64eddf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23039/
Test PASSed.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

To add a regression test for this.

@OopsOutOfMemory
Copy link
Contributor

Hi, @scwf
The query select * from src seems works fine in my jdbc thrifserver testing.
Also,
I don't understand why add a map operation here to return just the same copy of each element in the resultRdd can resolve that problem?
Could u explain the whole process and environment in detail that lead to the incoreect result?
Thanks! :)

@liancheng
Copy link
Contributor

@OopsOutOfMemory The map(_.copy()) makes sense, because HiveTableScan uses single a mutable row object to traverse the underlying table for optimization purposes (reducing object number and GC pressure). If you collect the RDD without copying, all row object reference in a single RDD partition point to the same mutable row object. However, this situation should have been dealt with properly before. I'm tracking how and when this bug was introduced.

And you should be able to reproduce this issue with the current master branch. Note that only SBT build can be used until #3105 and #3103 are merged.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

yes, @OopsOutOfMemory, you can test this with master branch.

@OopsOutOfMemory
Copy link
Contributor

Thanks @liancheng @scwf for explanation. :)
Previous testing I tested is using the release version but not the newest master branch.
Sorry for that, I can reproduce this issue in the current master branch now.

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23048 has started for PR 3149 at commit 8b2d845.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23048 has finished for PR 3149 at commit 8b2d845.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class RetryingBlockFetcher

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23048/
Test PASSed.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

Actually, this is caused by marmbrus@85872f6#diff-1 and marmbrus@982c035#diff-5 ( which is in #3063 )
@marmbrus is there a reason you remove override lazy val toRdd there? I think we should keep override lazy val toRdd: RDD[Row] = executedPlan.execute().map(_.copy()) in HiveContext's QueryExecution to avoid this issue.

@liancheng
Copy link
Contributor

In #3063, HiveContext.toRdd was removed (line 377) , and the copy operation was moved to HiveContext.stringResult (line 436). However, the Thrift server relies on HiveContext.toRdd to retrieve result RDD, thus causes this bug.

@marmbrus I'm a bit confused here, could you please elaborate on the reason behind this change? Reverting this change should fix this bug, but I'm not sure whether this breaks any other contracts introduced in #3063.

@liancheng
Copy link
Contributor

@scwf Oh, didn't notice you've already pointed this out :)

@marmbrus
Copy link
Contributor

marmbrus commented Nov 7, 2014

Good catch guys, and thanks for adding a test.

The comment on toRdd has always been /** Internal version of the RDD. Avoids copies and has no schema */ so it was kind of confusing that this was different for Hive.

I think the right solution here is to avoid using the internal queryExecution API from the thrift server and instead just call .collect() on resultRdd.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

@marmbrus, i think you mean .collect() on result, not resultRdd, right?

@marmbrus
Copy link
Contributor

marmbrus commented Nov 7, 2014

Yes, correct.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

Hmm, i think result.collect is ok, but result.toLocalIterator can get the right answer?

@marmbrus
Copy link
Contributor

marmbrus commented Nov 7, 2014

If the .toLocalIterator method of SchemaRDD is returning the wrong answer then that is a bug that should be fixed separately.

@scwf
Copy link
Contributor Author

scwf commented Nov 7, 2014

Yeah, i think it's ok since in ``SchemaRDD` compute method deal with this. https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala#L117

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23059 has started for PR 3149 at commit 1574a43.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 7, 2014

Test build #23059 has finished for PR 3149 at commit 1574a43.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23059/
Test PASSed.

@marmbrus
Copy link
Contributor

marmbrus commented Nov 7, 2014

Thanks! I'm merging this into master and 1.2.

@asfgit asfgit closed this in d6e5552 Nov 7, 2014
asfgit pushed a commit that referenced this pull request Nov 7, 2014
select * from src, get the wrong result set as follows:
```
...
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
...

```

Author: wangfei <wangfei1@huawei.com>

Closes #3149 from scwf/SPARK-4292 and squashes the following commits:

1574a43 [wangfei] using result.collect
8b2d845 [wangfei] adding test
f64eddf [wangfei] result set iter bug

(cherry picked from commit d6e5552)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@scwf scwf deleted the SPARK-4292 branch November 8, 2014 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants