[SPARK-26069][TESTS]Fix flaky test: RpcIntegrationSuite.sendRpcWithStreamFailures #23041

Closed

Member

@zsxwing zsxwing commented Nov 15, 2018

What changes were proposed in this pull request?

The test fails because `assertErrorAndClosed` does not account for one possible error message: `java.nio.channels.ClosedChannelException`. This error occurs when the second `uploadStream` call is made after the channel has already been closed. To reproduce it, add `Thread.sleep(1000)` below this line:

client.uploadStream(meta, data, new RpcStreamCallback(stream, res, sem));

This PR fixes that issue and also improves the test failure messages of `assertErrorAndClosed`.
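As a rough illustration of the idea (a minimal sketch, not the actual patch), the check can treat several messages as meaning "the connection went away". The class and method names here (`CloseErrorCheck`, `isCloseError`) are hypothetical and do not exist in Spark; the accepted messages are assumed from the description above plus the existing `closed` / `Connection reset` checks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Spark: shows one way to accept several
// error messages that all indicate the channel is no longer usable.
public class CloseErrorCheck {
  // "closed" and "Connection reset" are the messages the test already accepted;
  // ClosedChannelException is the case this PR describes as missing.
  private static final Set<String> CLOSE_ERRORS = new HashSet<>(Arrays.asList(
      "closed",
      "Connection reset",
      "java.nio.channels.ClosedChannelException"));

  // Returns true if the error text contains any of the accepted messages.
  public static boolean isCloseError(String err) {
    for (String expected : CLOSE_ERRORS) {
      if (err.contains(expected)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(isCloseError("java.nio.channels.ClosedChannelException"));
    System.out.println(isCloseError("Connection reset by peer"));
    System.out.println(isCloseError("stream not found"));
  }
}
```

Matching with `contains` rather than `equals` is a design choice made for this sketch, so that wrapped or suffixed messages still match; whether the real assertion should be that lenient is a separate question.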

How was this patch tested?

Jenkins

@zsxwing
Member Author

zsxwing commented Nov 15, 2018

cc @squito

assertEquals(1, errorsNotFound.size());
String err = errorsNotFound.iterator().next();
assertTrue(err.equals("closed") || err.equals("Connection reset"));
assertTrue("Got a non-empty set " + r.getLeft(), r.getLeft().isEmpty());
Member Author

Moved this check here so that we can see which error causes the test failure.

@SparkQA

SparkQA commented Nov 15, 2018

Test build #98860 has finished for PR 23041 at commit 6bebcb5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member Author

zsxwing commented Nov 15, 2018

retest this please

@squito
Contributor

squito commented Nov 15, 2018

lgtm

@SparkQA

SparkQA commented Nov 15, 2018

Test build #98884 has finished for PR 23041 at commit 6bebcb5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Nov 15, 2018

The failure is from another flaky test, SPARK-24153.

@SparkQA

SparkQA commented Nov 16, 2018

Test build #4427 has finished for PR 23041 at commit 6bebcb5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member Author

zsxwing commented Nov 16, 2018

Thanks. Merging to master and 2.4.

asfgit pushed a commit that referenced this pull request Nov 16, 2018
…treamFailures

## What changes were proposed in this pull request?

The test failure is because `assertErrorAndClosed` misses one possible error message: `java.nio.channels.ClosedChannelException`. This happens when the second `uploadStream` is called after the channel has been closed. This can be reproduced by adding `Thread.sleep(1000)` below this line: https://github.com/apache/spark/blob/03306a6df39c9fd6cb581401c13c4dfc6bbd632e/common/network-common/src/test/java/org/apache/spark/network/RpcIntegrationSuite.java#L217

This PR fixes the above issue and also improves the test failure messages of `assertErrorAndClosed`.

## How was this patch tested?

Jenkins

Closes #23041 from zsxwing/SPARK-26069.

Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
(cherry picked from commit 99cbc51)
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
@asfgit asfgit closed this in 99cbc51 Nov 16, 2018
@zsxwing zsxwing deleted the SPARK-26069 branch November 16, 2018 18:04
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
3 participants