
SPARK-11295 Add packages to JUnit output for Python tests #9263

Closed · wants to merge 1 commit

Conversation

@gliptak (Contributor) commented Oct 24, 2015

SPARK-11295 Add packages to JUnit output for Python tests

This improves grouping/display of test case results.
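
The trick is to have each tests module, when run as __main__, re-import itself under its fully-qualified name so test IDs pick up the package prefix. A minimal sketch of the pattern, as it might look at the bottom of pyspark/mllib/tests.py (the runner wiring and output directory below are illustrative, not necessarily this patch verbatim):

    import unittest

    try:
        import xmlrunner  # from the unittest-xml-reporting package
    except ImportError:
        xmlrunner = None

    if __name__ == "__main__":
        # Re-import the test classes under their fully-qualified module name so
        # reports show pyspark.mllib.tests.FooTest instead of __main__.FooTest.
        from pyspark.mllib.tests import *
        if xmlrunner:
            unittest.main(
                testRunner=xmlrunner.XMLTestRunner(output='target/test-reports'),
                verbosity=2)
        else:
            unittest.main()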

@JoshRosen (Contributor)

Jenkins, this is ok to test.

@SparkQA commented Oct 24, 2015

Test build #44299 has finished for PR 9263 at commit e5e03b0.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 24, 2015

Test build #44300 has finished for PR 9263 at commit 52c3410.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Oct 24, 2015

pyspark.tests shows up

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44300/testReport/

I will work on the others.

@SparkQA commented Oct 24, 2015

Test build #44302 has finished for PR 9263 at commit 5a04363.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Oct 24, 2015

The tests fail with:

Finished test(python2.6): pyspark.mllib.util (9s)
Traceback (most recent call last):
  File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/mllib/tests.py", line 1543, in <module>
    from pyspark.mllib.tests import *
  File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/mllib/tests.py", line 79, in <module>
    sc = SparkContext('local[4]', "MLlib tests")
  File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/context.py", line 111, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/context.py", line 257, in _ensure_initialized
    callsite.function, callsite.file, callsite.linenum))
ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=MLlib tests, master=local[4]) created by <module> at /usr/lib64/python2.6/runpy.py:34 

Had test failures in pyspark.mllib.tests with python2.6; see logs.
[error] running /home/jenkins/workspace/SparkPullRequestBuilder@2/python/run-tests --modules=pyspark-core,pyspark-sql,pyspark-streaming,pyspark-mllib,pyspark-ml --parallelism=4 ; received return code 255

@gliptak (Contributor Author) commented Oct 25, 2015

@SparkQA commented Oct 25, 2015

Test build #44310 has finished for PR 9263 at commit b423482.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

    @@ -76,7 +76,8 @@
        pass

    ser = PickleSerializer()
    sc = SparkContext('local[4]', "MLlib tests")
    conf = SparkConf().set("spark.driver.allowMultipleContexts", "true")
Contributor:

Is it relevant to this PR?

Contributor Author:

@mengxr This was my (failing) attempt to correct the test errors (caused by --parallelism=4?). Maybe @JoshRosen could comment?

Contributor:

Yeah, you shouldn't set this. The parallelism in dev/run-tests will actually launch separate JVMs, so that's not the cause of this problem. In general, you should never set spark.driver.allowMultipleContexts (it was only added as an escape-hatch backwards-compatibility option for a feature that we never properly supported).

There must be some other problem in the tests, likely due to test cleanup or SparkContext teardown not being executed properly.

Contributor Author:

Reviewing the tests.py files,

https://github.com/apache/spark/blob/master/python/pyspark/streaming/tests.py

initializes the SparkContext differently:

    @classmethod
    def setUpClass(cls):
        class_name = cls.__name__
        conf = SparkConf().set("spark.default.parallelism", 1)
        cls.sc = SparkContext(appName=class_name, conf=conf)
        cls.sc.setCheckpointDir("/tmp")

    @classmethod
    def tearDownClass(cls):
        cls.sc.stop()
        # Clean up in the JVM just in case there has been some issues in Python API
        try:
            jSparkContextOption = SparkContext._jvm.SparkContext.get()
            if jSparkContextOption.nonEmpty():
                jSparkContextOption.get().stop()
        except:
            pass

Could this approach be retrofitted into https://github.com/apache/spark/blob/master/python/pyspark/mllib/tests.py to allow for concurrency?

@SparkQA commented Oct 28, 2015

Test build #44478 has finished for PR 9263 at commit 8dc37f8.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Oct 28, 2015

This last run had a different failure than the previous run with the same code ...

@gliptak (Contributor Author) commented Nov 3, 2015

@JoshRosen Would you have some pointers on how to move this forward? Thanks

@mengxr (Contributor) commented Nov 6, 2015

test this please

@mengxr (Contributor) commented Nov 6, 2015

add to whitelist

@SparkQA commented Nov 6, 2015

Test build #45227 has finished for PR 9263 at commit 8dc37f8.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr (Contributor) commented Nov 6, 2015

@gliptak I think this is the cause: https://github.com/apache/spark/blob/master/python/pyspark/mllib/tests.py#L79. We don't initialize the SparkContext in setUp, so import * creates a new SparkContext at import time. Could you move it into setUp and retry locally?
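
A minimal sketch of the suggested fix, moving the module-level context into per-test fixtures (the base-class name is a placeholder):

    import unittest
    from pyspark import SparkContext

    class MLlibTestCase(unittest.TestCase):
        def setUp(self):
            # Create the context per test rather than at import time, so that
            # "from pyspark.mllib.tests import *" no longer spawns a second one.
            self.sc = SparkContext('local[4]', "MLlib tests")

        def tearDown(self):
            self.sc.stop()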

@gliptak (Contributor Author) commented Nov 7, 2015

@mengxr Thank you for the pointer. This worked locally with

$ ./run-tests --python-executables=python --modules=pyspark-mllib

@gliptak (Contributor Author) commented Nov 8, 2015

@mengxr Could you trigger a build?

@SparkQA commented Nov 8, 2015

Test build #45309 has finished for PR 9263 at commit f253013.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 9, 2015

Test build #45325 has finished for PR 9263 at commit 40b1a65.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Nov 9, 2015

Failure for only one of the Python versions?

FAIL: test_trainOn_predictOn (pyspark.mllib.tests.StreamingKMeansTest)
Test that prediction happens on the updated model.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1198, in test_trainOn_predictOn
    self._eventually(condition, catch_assertions=True)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 128, in _eventually
    raise lastValue
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 119, in _eventually
    lastValue = condition()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1195, in condition
    self.assertEqual(predict_results, [[0, 1, 1], [1, 0, 1]])
AssertionError: Lists differ: [[0, 1, 1], [0, 0, 0]] != [[0, 1, 1], [1, 0, 1]]

The unit tests page shows no errors:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45325/testReport/

@gliptak (Contributor Author) commented Nov 10, 2015

Are JUnit test results separated based on the Java version they run under?

@gliptak (Contributor Author) commented Nov 12, 2015

#9669

@gliptak (Contributor Author) commented Nov 14, 2015

Rebased to current master

@SparkQA commented Nov 14, 2015

Test build #45936 has finished for PR 9263 at commit f733447.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Nov 14, 2015

Unit test timed out?

======================================================================
ERROR: test_kafka_direct_stream (pyspark.streaming.tests.KafkaStreamTests)
Test the Python direct Kafka stream API.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/streaming/tests.py", line 753, in setUp
    self._kafkaTestUtils.setup()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o10749.setup.
: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)

@zsxwing (Member) commented Dec 11, 2015

retest this please

@SparkQA commented Dec 11, 2015

Test build #47554 has finished for PR 9263 at commit f733447.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Dec 14, 2015

@gliptak (Contributor Author) commented Dec 20, 2015

@zsxwing Are there some changes you would like to see to this pull request?

@JoshRosen (Contributor)

Hey, sorry for forgetting about this. I'm going to trigger a retest now and will take a look at the test results.

@gliptak, could you please update the PR description to say something other than "WIP", since the PR description will become the commit message? Feel free to copy the description from JIRA if you'd like. It'd also be nice to add a sentence or two summarizing the changes that you needed to make to get this to work.

@JoshRosen (Contributor)

Jenkins, retest this please.

@gliptak (Contributor Author) commented Jan 13, 2016

@JoshRosen Please let me know if you would like to see other description changes.

@SparkQA commented Jan 13, 2016

Test build #49329 has finished for PR 9263 at commit f733447.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

    @@ -75,22 +75,23 @@
        # No SciPy, but that's okay, we'll skip those tests
        pass

    ser = PickleSerializer()
Contributor:

I can see why sc needs to be created on a per-test-case basis, but is it possible to leave ser as-is and keep it here?

Contributor:

I ask because it looks like this change conflicted with a recently-modified test, causing the test to break:

======================================================================
ERROR [7.323s]: test_als_ratings_id_long_error (pyspark.mllib.tests.ALSTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1571, in test_als_ratings_id_long_error
    self.assertRaises(Py4JJavaError, self.sc._jvm.SerDe.loads, bytearray(ser.dumps(r)))
NameError: global name 'ser' is not defined

======================================================================
ERROR [0.779s]: test_als_ratings_serialize (pyspark.mllib.tests.ALSTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1562, in test_als_ratings_serialize
    jr = self.sc._jvm.SerDe.loads(bytearray(ser.dumps(r)))
NameError: global name 'ser' is not defined
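
For reference, the layout being suggested keeps the serializer at module scope and moves only the context into the fixture; a sketch, not the patch itself:

    from pyspark.serializers import PickleSerializer

    # PickleSerializer is stateless, so a single module-level instance is safe
    # to share. Tests reference it as a bare global, which is why moving it
    # into the test class caused the NameError above.
    ser = PickleSerializer()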

Contributor Author:

I recall seeing some concurrency issues with ser too (it has been a while). I'm pushing up a rebase/update.

Contributor:

What do you mean by concurrency errors? AFAIK we run tests serially in these files.

Contributor Author:

Maybe I was running with the parallel flag locally. It has been a while ...

Contributor Author:

@JoshRosen From the log, the tests are running with --parallelism=4.

/home/jenkins/workspace/SparkPullRequestBuilder@2/python/run-tests --modules=pyspark-core,pyspark-sql,pyspark-streaming,pyspark-mllib,pyspark-ml --parallelism=4 

I will roll back the ser changes in a few.

@JoshRosen (Contributor)

By the way, aside from the ser issue this looks good to me.

For other reviewers: this change makes the test output much nicer in Jenkins (and thus nicer on https://spark-tests.appspot.com):

[screenshot: Jenkins test report with PySpark test results grouped by package]

One interesting thing: Jenkins seems to fail when trying to show package-level test information for the PySpark tests: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49329/testReport/pyspark.tests/. I wonder whether this could be related to the presence of multiple XML reports from concurrent PySpark test runs with different Python versions.

@gliptak (Contributor Author) commented Jan 13, 2016

Yes, the multiple reports might play into it. I have no visibility into Jenkins to assess further.

@SparkQA commented Jan 14, 2016

Test build #49349 has finished for PR 9263 at commit f23fb62.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 14, 2016

Test build #49357 has finished for PR 9263 at commit 99d63c5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gliptak (Contributor Author) commented Jan 14, 2016

FAIL: test_trainOn_predictOn (pyspark.mllib.tests.StreamingKMeansTest)
Test that prediction happens on the updated model.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1211, in test_trainOn_predictOn
    self._eventually(condition, catch_assertions=True)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 129, in _eventually
    raise lastValue
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 120, in _eventually
    lastValue = condition()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests.py", line 1208, in condition
    self.assertEqual(predict_results, [[0, 1, 1], [1, 0, 1]])
AssertionError: Lists differ: [[0, 1, 1], [0, 0, 0]] != [[0, 1, 1], [1, 0, 1]]

First differing element 1:
[0, 0, 0]
[1, 0, 1]

- [[0, 1, 1], [0, 0, 0]]
?                 ^^^^

+ [[0, 1, 1], [1, 0, 1]]
?              +++   ^

@gliptak (Contributor Author) commented Jan 14, 2016

@JoshRosen Would you like to see some other changes?

@JoshRosen (Contributor)

I'd like to see what happens if you roll back that ser change.

@mengxr (Contributor) commented Jan 15, 2016

@gliptak @JoshRosen I think we shouldn't block this feature because of some failed MLlib unit tests. Feel free to disable the tests in this PR and create JIRAs under components "MLlib" and "PySpark" to track them. We could fix them in a follow-up PR.

@JoshRosen (Contributor)

I still maintain that we should roll back the ser change if we don't know why it's necessary.

@SparkQA commented Jan 16, 2016

Test build #49531 has finished for PR 9263 at commit c524ab0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit closed this in c6f971b on Jan 19, 2016
@mengxr (Contributor) commented Jan 19, 2016

LGTM. Merged into master. Thanks for making this work! Btw, @gliptak please do not squash your commits into one when you push new changes. It is easier to see what you changed if you keep the old commits unmodified. Thanks!

@gliptak (Contributor Author) commented Jan 19, 2016

@mengxr I will keep that in mind. Do you usually squash commits before committing to master? Thanks

@JoshRosen (Contributor)

@gliptak, our merge script automatically squashes at commit time, so there's no need for you to do it yourself.

@gliptak (Contributor Author) commented Jan 19, 2016

@JoshRosen I see. Thanks

@mengxr (Contributor) commented Jan 20, 2016

@gliptak I reverted the change because 0ddba6d was merged before this one and didn't use self.sc, which caused Jenkins failures. Could you reopen this PR and update your patch?

Btw, this line needs a patch: 0ddba6d#diff-ce16909e38fc8bee429dc638b2b2dde2R426.

mengxr pushed a commit to mengxr/spark that referenced this pull request Jan 20, 2016
SPARK-11295 Add packages to JUnit output for Python tests

This improves grouping/display of test case results.

Author: Gábor Lipták <gliptak@gmail.com>

Closes apache#9263 from gliptak/SPARK-11295.
@mengxr (Contributor) commented Jan 20, 2016

I made a new PR with fix: #10850

asfgit pushed a commit that referenced this pull request Jan 20, 2016
This is #9263 from gliptak (improving grouping/display of test case results) with a small fix of bisecting k-means unit test.

Author: Gábor Lipták <gliptak@gmail.com>
Author: Xiangrui Meng <meng@databricks.com>

Closes #10850 from mengxr/SPARK-11295.
@gliptak (Contributor Author) commented Jan 20, 2016

@mengxr Thank you (I didn't get to this last night)
