[SPARK-26275][PYTHON][ML] Increases timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test #23236

Closed
Conversation

HyukjinKwon
Member

What changes were proposed in this pull request?

Looks like this test is flaky:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99569/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99548/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99454/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99609/console

```
======================================================================
FAIL: test_training_and_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
Test that the model improves on toy data with no. of batches
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 367, in test_training_and_prediction
    self._eventually(condition)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 78, in _eventually
    % (timeout, lastValue))
AssertionError: Test failed due to timeout after 30 sec, with last condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74

----------------------------------------------------------------------
Ran 13 tests in 185.051s

FAILED (failures=1, skipped=1)
```

This seems to have started happening after the Jenkins parallelism was increased to speed up builds in #23111. I am able to reproduce it manually when resource usage is heavy (by manually decreasing the timeout).
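
For context, the timeout in question lives in a retry helper that keeps polling a condition until it returns True or a deadline passes. Below is a minimal, hypothetical sketch of that pattern; the real `_eventually` in test_streaming_algorithms.py may differ, e.g. in its polling interval and extra arguments.

```python
import time

def _eventually(condition, timeout=30.0):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    On failure, the last value returned by `condition` (here, the latest
    error rates) is included in the assertion message.
    """
    start = time.time()
    last_value = None
    while time.time() - start < timeout:
        last_value = condition()
        if last_value is True:
            return
        time.sleep(0.01)
    raise AssertionError(
        "Test failed due to timeout after %g sec, with last condition "
        "returning: %s" % (timeout, last_value))
```

This PR simply raises the timeout used for this test; the exact new value is in the diff.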

How was this patch tested?

Manually tested by

```
cd python
./run-tests --testnames 'pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction' --python-executables=python
```

@HyukjinKwon
Member Author

cc @BryanCutler and @viirya

@HyukjinKwon
Member Author

retest this please

@HyukjinKwon
Member Author

HyukjinKwon commented Dec 5, 2018

(cc @squito as well since it's a bit related to #23111)

@SparkQA

SparkQA commented Dec 5, 2018

Test build #99721 has finished for PR 23236 at commit 3c4ee75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 5, 2018

Test build #99724 has finished for PR 23236 at commit 3c4ee75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 5, 2018

Test build #99726 has finished for PR 23236 at commit 3c4ee75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 5, 2018

Test build #99727 has finished for PR 23236 at commit 3c4ee75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 5, 2018

Test build #99729 has finished for PR 23236 at commit 3c4ee75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Dec 5, 2018

lgtm

@BryanCutler
Member

Seems ok to me, but there are a few silly things with this test that, if fixed, might also help:

  • why is the stepSize so low at 0.01? I think it would be fine at 0.1, and even a conservative 0.03 would still help it converge a little faster

  • why is it comparing the current error with the second error value (errors[1] - errors[-1])? Maybe because errors[1] is bigger, but it should just use errors[0] and make the values more sensible, e.g. check that the improvement is > 0.2

  • having 2 if statements to check basically the same condition is a bit redundant; just assert len(errors) < len(predict_batches) before the last return

  • nit: true_predicted = [] on line 350 isn't used

If you don't feel comfortable changing any of these, just increasing the timeout is probably fine (see the sketch below for roughly what these suggestions amount to).
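
For reference, here is a hedged sketch of roughly what those suggestions could look like; this is not the actual code in test_streaming_algorithms.py, and `errors` / `predict_batches` are stand-ins for the state the real test maintains.

```python
from pyspark.mllib.classification import StreamingLogisticRegressionWithSGD

# Hypothetical stand-ins for the lists the real test keeps.
errors = []           # error rate observed after each prediction batch
predict_batches = []  # the queued prediction batches

# 1. A larger step size than 0.01 should converge a bit faster.
model = StreamingLogisticRegressionWithSGD(stepSize=0.1)

# 2. Compare against the first error value with a simple improvement
#    threshold, and keep a single sanity check instead of two similar ifs.
def condition():
    if len(errors) > 1 and errors[0] - errors[-1] > 0.2:
        return True
    assert len(errors) < len(predict_batches)
    return "Latest errors: " + ", ".join(map(str, errors))
```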

@HyukjinKwon
Member Author

Thanks for the close look, @BryanCutler. I just increased the timeout because I wanted to keep the test as is while fixing the flakiness. The test itself doesn't seem to take that long, but it looks like it can be flaky when resource usage is heavy. If you feel this can be resolved in other ways, would you mind taking a look at it? I'd be happy to close this one. Actually, I am not that familiar with the ML tests.

@BryanCutler
Member

True, the test is not that long under light resource usage. Locally, I saw a difference of a couple of seconds with the changes I mentioned. The weird thing is that the unmodified test completes after the 11th batch with these errors:

['0.67', '0.71', '0.78', '0.7', '0.75', '0.74', '0.73', '0.69', '0.62', '0.71', '0.31']

Compared to the error values from the test failures above, they match up through the 10th batch, but the failing runs then continue until the 16th batch, where the test times out:

0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74

I would expect the seed to produce the same values (or all different values if the random function differs), which makes me think something else is going on.

Anyway, I think it's fine to increase the timeout, and if it's still flaky, we can look at making changes.

@viirya
Member

viirya commented Dec 6, 2018

I agree with @BryanCutler's analysis; a few things about this test do look a bit weird. I also think it is fine to increase the timeout for now.

@HyukjinKwon
Member Author

Thanks, guys. I will try to take a look at it as well (although it will probably take a relatively long while). Let me get this in for now anyway.

@HyukjinKwon
Member Author

Merged to master.

@asfgit closed this in ab76900 on Dec 6, 2018
@viirya
Member

viirya commented Dec 6, 2018

Compared to the error values from the test failures above, they match up through the 10th batch, but the failing runs then continue until the 16th batch, where the test times out.

I suspect that might be because, when resource usage is heavy, StreamingLogisticRegressionWithSGD's training on the input batch stream can't always keep up with the prediction batch stream, so the model hasn't reached the expected improvement in error yet.
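
To make that concrete, here is a hedged, illustrative sketch (not the actual test code) of the two-stream setup: the model is updated from one DStream via trainOn while errors are measured on a separate DStream via predictOnValues, so under heavy load predictions can run against a model that has not caught up yet.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.mllib.classification import StreamingLogisticRegressionWithSGD
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext("local[2]", "streaming-lr-sketch")
ssc = StreamingContext(sc, batchDuration=1)

# Two independent queues of batches: one for training, one for prediction.
train_batches = [sc.parallelize(
    [LabeledPoint(1.0, [1.5]), LabeledPoint(0.0, [-1.5])])]
predict_batches = [sc.parallelize([(1.0, [1.2]), (0.0, [-1.2])])]

model = StreamingLogisticRegressionWithSGD(stepSize=0.1)
model.setInitialWeights([0.0])

# Model updates come from the training stream...
model.trainOn(ssc.queueStream(train_batches))
# ...while error is measured on a separate prediction stream; if the
# training batches are processed slowly, predictions use a stale model.
predictions = model.predictOnValues(ssc.queueStream(predict_batches))
predictions.pprint()

ssc.start()
ssc.awaitTerminationOrTimeout(5)
ssc.stop(stopSparkContext=True, stopGraceFully=True)
```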

@BryanCutler
Member

I suspect that might be because, when resource usage is heavy, StreamingLogisticRegressionWithSGD's training on the input batch stream can't always keep up with the prediction batch stream, so the model hasn't reached the expected improvement in error yet.

Yeah, that sounds likely, and the timeout increase should help with that.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
@HyukjinKwon deleted the SPARK-26275 branch on March 3, 2020