[SPARK-4292][SQL] Result set iterator bug in JDBC/ODBC #3149

scwf · 2014-11-07T05:34:21Z

Running select * from src returns the wrong result set, as follows:

...
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
...

SparkQA · 2014-11-07T05:37:39Z

Test build #23039 has started for PR 3149 at commit f64eddf.

This patch merges cleanly.

SparkQA · 2014-11-07T06:45:05Z

Test build #23039 has finished for PR 3149 at commit f64eddf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-07T06:45:08Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23039/
Test PASSed.

scwf · 2014-11-07T07:00:40Z

I will add a regression test for this.

OopsOutOfMemory · 2014-11-07T07:08:39Z

Hi @scwf,
The query select * from src seems to work fine in my JDBC Thrift server testing.
Also, I don't understand why adding a map operation here, which just returns a copy of each element in the resultRdd, resolves the problem.
Could you explain in detail the whole process and environment that lead to the incorrect result?
Thanks! :)

liancheng · 2014-11-07T07:19:30Z

@OopsOutOfMemory The map(_.copy()) makes sense, because HiveTableScan uses a single mutable row object to traverse the underlying table for optimization purposes (reducing the number of objects and GC pressure). If you collect the RDD without copying, all row object references in a single RDD partition point to the same mutable row object. However, this situation should already have been handled properly before. I'm tracking down how and when this bug was introduced.

And you should be able to reproduce this issue with the current master branch. Note that only the SBT build can be used until #3105 and #3103 are merged.
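
The aliasing described above can be reproduced without Spark. The following is only a minimal sketch: MutableRow and RowReuseDemo.scan are hypothetical stand-ins invented for illustration, not Spark or Hive classes, and the sample keys are made up. It assumes a scan that reuses one row object per partition, as the comment above describes.

```scala
// Toy stand-in for a mutable row; NOT Spark's actual row class (assumption).
final class MutableRow(var key: Int, var value: String) {
  def copy(): MutableRow = new MutableRow(key, value)
  override def toString: String = s"| $key | $value |"
}

object RowReuseDemo {
  // Mimics a table scan that reuses a single row object for every record.
  def scan(data: Seq[(Int, String)]): Iterator[MutableRow] = {
    val row = new MutableRow(0, "")
    data.iterator.map { case (k, v) =>
      row.key = k
      row.value = v
      row // the same reference is returned for every record
    }
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(238 -> "val_238", 86 -> "val_86", 309 -> "val_309")

    // Collecting without copying: every array slot points at the same object,
    // so every slot shows the last record that was written into it.
    val aliased = scan(data).toArray
    println("without copy:")
    aliased.foreach(println)

    // Copying each row before collecting keeps the records distinct.
    val copied = scan(data).map(_.copy()).toArray
    println("with copy:")
    copied.foreach(println)
  }
}
```

In this toy model the uncopied array holds three references to one object, which is consistent with the repeated rows in the result set reported at the top of this PR.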

scwf · 2014-11-07T07:30:23Z

Yes, @OopsOutOfMemory, you can test this with the master branch.

OopsOutOfMemory · 2014-11-07T07:47:50Z

Thanks @liancheng and @scwf for the explanation. :)
My previous testing used the release version, not the newest master branch.
Sorry about that. I can now reproduce this issue on the current master branch.

SparkQA · 2014-11-07T08:09:49Z

Test build #23048 has started for PR 3149 at commit 8b2d845.

This patch merges cleanly.

SparkQA · 2014-11-07T09:20:32Z

Test build #23048 has finished for PR 3149 at commit 8b2d845.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- public class RetryingBlockFetcher

AmplabJenkins · 2014-11-07T09:20:35Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23048/
Test PASSed.

scwf · 2014-11-07T09:47:29Z

Actually, this is caused by marmbrus@85872f6#diff-1 and marmbrus@982c035#diff-5 (which are in #3063).
@marmbrus is there a reason you removed override lazy val toRdd there? I think we should keep override lazy val toRdd: RDD[Row] = executedPlan.execute().map(_.copy()) in HiveContext's QueryExecution to avoid this issue.

liancheng · 2014-11-07T09:57:38Z

In #3063, HiveContext.toRdd was removed (line 377), and the copy operation was moved to HiveContext.stringResult (line 436). However, the Thrift server relies on HiveContext.toRdd to retrieve the result RDD, which causes this bug.

@marmbrus I'm a bit confused here. Could you please elaborate on the reason behind this change? Reverting this change should fix this bug, but I'm not sure whether that breaks any other contracts introduced in #3063.

liancheng · 2014-11-07T10:02:13Z

@scwf Oh, didn't notice you've already pointed this out :)

marmbrus · 2014-11-07T18:27:04Z

Good catch guys, and thanks for adding a test.

The comment on toRdd has always been /** Internal version of the RDD. Avoids copies and has no schema */ so it was kind of confusing that this was different for Hive.

I think the right solution here is to avoid using the internal queryExecution API from the thrift server and instead just call .collect() on resultRdd.

scwf · 2014-11-07T18:32:24Z

@marmbrus, I think you mean .collect() on result, not resultRdd, right?

marmbrus · 2014-11-07T18:39:29Z

Yes, correct.

scwf · 2014-11-07T18:47:05Z

Hmm, I think result.collect is OK, but can result.toLocalIterator also return the right answer?

marmbrus · 2014-11-07T18:59:49Z

If the .toLocalIterator method of SchemaRDD is returning the wrong answer then that is a bug that should be fixed separately.

scwf · 2014-11-07T19:00:12Z

Yeah, I think it's OK, since the SchemaRDD compute method deals with this. https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala#L117
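
The distinction discussed above, between an internal API that avoids copies and a public one that copies in its compute step, can be sketched with a toy model. Every name below (ToyRow, ToyQueryExecution, ToyResult, PublicApiDemo) is hypothetical and made up for illustration; none of it is Spark's real code, and it only assumes the behaviour described in this thread.

```scala
// Toy mutable row; a hypothetical stand-in, not a Spark class.
final class ToyRow(var key: Int, var value: String) {
  def copy(): ToyRow = new ToyRow(key, value)
}

// Hypothetical internal API: avoids copies by reusing one row object.
class ToyQueryExecution(data: Seq[(Int, String)]) {
  def toRdd: Iterator[ToyRow] = {
    val row = new ToyRow(0, "")
    data.iterator.map { case (k, v) =>
      row.key = k
      row.value = v
      row
    }
  }
}

// Hypothetical public wrapper: copies each row inside its compute step,
// so both collect() and toLocalIterator hand out independent rows.
class ToyResult(qe: ToyQueryExecution) {
  private def compute(): Iterator[ToyRow] = qe.toRdd.map(_.copy())
  def collect(): Array[ToyRow] = compute().toArray
  def toLocalIterator: Iterator[ToyRow] = compute()
}

object PublicApiDemo {
  // Materialize all rows first, then read the keys, as a client buffering rows would.
  def keys(rows: Iterator[ToyRow]): List[Int] = rows.toList.map(_.key)

  def main(args: Array[String]): Unit = {
    val qe = new ToyQueryExecution(Seq(309 -> "val_309", 97 -> "val_97"))
    val result = new ToyResult(qe)
    println(keys(result.collect().iterator)) // rows stay distinct
    println(keys(result.toLocalIterator))    // rows stay distinct
    println(keys(qe.toRdd))                  // buffered rows alias one object
  }
}
```

Under these assumptions a caller that sticks to the public wrapper never sees the shared row, whichever of the two access methods it uses.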

SparkQA · 2014-11-07T19:15:17Z

Test build #23059 has started for PR 3149 at commit 1574a43.

This patch merges cleanly.

SparkQA · 2014-11-07T20:25:04Z

Test build #23059 has finished for PR 3149 at commit 1574a43.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-07T20:25:07Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23059/
Test PASSed.

marmbrus · 2014-11-07T20:55:57Z

Thanks! I'm merging this into master and 1.2.

select * from src, get the wrong result set as follows:
```
...
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 309  | val_309  |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
| 97   | val_97   |
...
```

Author: wangfei <wangfei1@huawei.com>

Closes #3149 from scwf/SPARK-4292 and squashes the following commits:

1574a43 [wangfei] using result.collect
8b2d845 [wangfei] adding test
f64eddf [wangfei] result set iter bug

(cherry picked from commit d6e5552)
Signed-off-by: Michael Armbrust <michael@databricks.com>

result set iter bug

f64eddf

adding test

8b2d845

scwf force-pushed the SPARK-4292 branch from ad7e4a4 to 8b2d845 on November 7, 2014 08:05

using result.collect

1574a43

asfgit closed this in d6e5552 Nov 7, 2014

scwf deleted the SPARK-4292 branch November 8, 2014 00:01
