[SPARK-19955][PySpark] Jenkins Python Conda based test. #17355
…thon2.7 right now
Test build #74853 has started for PR 17355 at commit
Jenkins retest this please
Test build #74871 has finished for PR 17355 at commit
Jenkins retest this please
Test build #74998 has finished for PR 17355 at commit
Test build #75025 has finished for PR 17355 at commit
Jenkins retest this please.
Test build #75033 has started for PR 17355 at commit
…ning version for debugging
Test build #75057 has finished for PR 17355 at commit
Test build #75063 has finished for PR 17355 at commit
Test build #75073 has finished for PR 17355 at commit
… a current setuptools
Test build #75081 has started for PR 17355 at commit
Jenkins retest this please
Test build #75090 has finished for PR 17355 at commit
@holdenk, I ran `run-tests`, which worked fine, and tried out `run-pip-tests` with `USE_CONDA` set. I ran into some of the above issues and then finally got this error: `error: package directory 'pyspark/ml/stat' does not exist`
It looks like this comes from https://github.com/apache/spark/blob/master/python/setup.py#L170, and I don't see that module in the tree. Is that right?
After I removed the `ml.stat` module entry from there, the tests ran.
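A quick way to catch this class of error before `setup.py` runs is to check that every listed package maps to a real directory. The sketch below is illustrative only: `check_package_dirs` is a hypothetical helper that does not exist in the Spark repo, and the package names and source root in the example call are assumptions.

```shell
# Hypothetical helper (not part of Spark): verify that each dotted package
# name maps to an existing directory under a source root.
check_package_dirs() {
  local root="$1"
  shift
  local missing=0 pkg dir
  for pkg in "$@"; do
    # Translate a dotted name to a path, e.g. pyspark.ml.stat -> pyspark/ml/stat
    dir="$root/$(printf '%s' "$pkg" | tr '.' '/')"
    if [ ! -d "$dir" ]; then
      echo "package directory '$dir' does not exist" >&2
      missing=1
    fi
  done
  # Non-zero status if at least one directory was missing
  return "$missing"
}

# Example call (source root and package list are assumptions):
#   check_package_dirs python pyspark pyspark.ml pyspark.ml.stat
```

The function returns a non-zero status when any directory is missing, so it can gate a packaging step with `check_package_dirs ... || exit 1`.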
pip install --upgrade pip pypandoc wheel
pip install numpy # Needed so we can verify mllib imports
if [ -n "$USE_CONDA" ]; then
  conda create -y -p "$VIRTUALENV_PATH" python=$python numpy pandas pip setuptools
Setting `python=$python` led to "python=3", which then tried to install Python 3.6:
+ conda create -y -p /tmp/tmp.OymEZOKFzo/3 python=3 numpy pandas pip setuptools
Fetching package metadata .........
Solving package specifications: .
Package plan for installation in environment /tmp/tmp.OymEZOKFzo/3:
The following NEW packages will be INSTALLED:
mkl: 2017.0.1-0
numpy: 1.12.1-py36_0
openssl: 1.0.2k-1
pandas: 0.19.2-np112py36_1
pip: 9.0.1-py36_1
python: 3.6.1-0
...
And that led to a conflict with pypandoc:
UnsatisfiableError: The following specifications were found to be in conflict:
- pypandoc -> python 3.5* -> sqlite 3.9.*
- pypandoc -> python 3.5* -> xz 5.0.*
- python 3.6*
Manually setting "python=3.5" seemed to clear things up, so the test could complete.
sounds reasonable, for packaging I've made it explicitly request Python 3.5 (at some point if PyPandoc doesn't make it into 3.6 on conda forge we should ping them but no rush).
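To illustrate the pin discussed above: a bare `python=3` lets conda pick the newest 3.x (3.6.1 in the log), so the spec has to name the minor version. This is a minimal sketch, assuming a hypothetical `conda_python_spec` helper and interpreter names of the form `python2` / `python3`. Only the 3.5 pin comes from this thread; the 2.7 mapping is an assumption.

```shell
# Hypothetical helper: turn an interpreter name into a conda version spec.
conda_python_spec() {
  case "$1" in
    # Assumed mapping for Python 2
    python2|python2.7) echo "python=2.7" ;;
    # Pin 3.5: pypandoc conflicted with python 3.6 in the log above
    python3|python3.5) echo "python=3.5" ;;
    *)
      echo "no conda spec known for '$1'" >&2
      return 1
      ;;
  esac
}

# Usage sketch (VIRTUALENV_PATH as in the script under review):
#   conda create -y -p "$VIRTUALENV_PATH" "$(conda_python_spec python3)" numpy pandas pip setuptools
```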
pip install numpy # Needed so we can verify mllib imports
if [ -n "$USE_CONDA" ]; then
  conda create -y -p "$VIRTUALENV_PATH" python=$python numpy pandas pip setuptools
  source activate "$VIRTUALENV_PATH"
I had to add this line after `source activate ..` to get pypandoc installed:
conda install -y -c conda-forge pypandoc
Otherwise I got this error:
Could not import pypandoc - required to package PySpark
So it's not a hard error, and since the workers don't have pandoc installed (a separate binary), leaving it out for now seems like the easiest path. Once we're all dockerized and happy we can add pandoc & pypandoc to the docker image.
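For comparison, here is one way "optional" could be expressed in the script itself. This is not what the PR does (it leaves pypandoc out), just a sketch under assumptions: attempt the conda-forge install only when the `pandoc` binary is on the PATH, and otherwise carry on with a warning. `maybe_install_pypandoc` is a hypothetical name; the `conda install` line is the one quoted above.

```shell
# Hypothetical helper: install pypandoc only where the pandoc binary exists.
maybe_install_pypandoc() {
  if command -v pandoc >/dev/null 2>&1; then
    # Same command as quoted earlier in this thread
    conda install -y -c conda-forge pypandoc
  else
    # Packaging only warns when pypandoc is missing, so keep going
    echo "pandoc not found; skipping pypandoc install" >&2
  fi
}
```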
python_execs = [x for x in ["python2.6", "python3.4", "pypy"] if which(x)]
if "python2.6" not in python_execs:
    LOGGER.warning("Not testing against `python2.6` because it could not be found; falling"
python_execs = [x for x in ["python2.7", "python3.4", "pypy"] if which(x)]
Does this mean we are not supporting 2.6 anymore!?!
Indeed we've been talking about removing it but it's been blocked on Jenkins work.
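The list comprehension in the diff keeps only the interpreters that are actually installed. The same filter can be written in shell for `dev/run-pip-tests`-style scripts. This is a rough sketch, not code from the PR: `find_python_execs` is a hypothetical name, and the candidate list is simply the one from the diff.

```shell
# Hypothetical helper: print each candidate interpreter that is on the PATH.
find_python_execs() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
    fi
  done
}

# Candidates taken from the diff; what gets printed depends on the machine.
find_python_execs python2.7 python3.4 pypy
```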
Test build #75138 has finished for PR 17355 at commit
Test build #75224 has finished for PR 17355 at commit
.@bryanxutler so I left out pypandoc because there isn't pandoc on the machines and it's optional (prints a warning to stderr - but should work fine). I get back from vacation next week so let's chat then :)
Oops @BryanCutler damn phone keyboard.
…changed shellscripts
cc @JoshRosen & @shaneknapp: this PR allows us to keep our existing Jenkins worker setup while still moving away from 2.6 to 2.7 & enables pip packaging tests in Jenkins.
Test build #75274 has finished for PR 17355 at commit
changes look good. i'll give it a closer look tomorrow... i've been out
of town and down w/bronchitis for the past week and a half.
…On Mon, Mar 27, 2017 at 12:19 PM, Holden Karau ***@***.***> wrote:
cc @JoshRosen <https://github.com/joshrosen> & @shaneknapp
<https://github.com/shaneknapp> : this PR allows us to keep our existing
Jenkins worker setup while still moving away from 2.6 to 2.7 & enables pip
packaging tests in Jenkins.
Hope you feel better soon @shaneknapp :)
dev/run-pip-tests
  PYTHON_EXECS+=('python3')
fi

set -x
is this just there for debugging? if so, pls remove before merging. otherwise, consider sticking it at the beginning of the script.
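If the tracing is worth keeping, a third option (different from either suggestion above) is to make it opt-in near the top of the script. This is a minimal sketch, assuming a hypothetical `DEBUG` environment variable that the script does not currently read; `maybe_trace` is likewise an illustrative name.

```shell
# Enable command tracing only when DEBUG is set to a non-empty value.
# DEBUG is an assumed variable name, not one the script defines today.
maybe_trace() {
  if [ -n "${DEBUG:-}" ]; then
    set -x
  fi
}

# Call once at the start of the script, before any real work
maybe_trace
```

Running the script as `DEBUG=1 ./dev/run-pip-tests` would then print each command, while a plain invocation stays quiet.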
Test build #75333 has finished for PR 17355 at commit
lgtm++
Great, yay 2.6 deprecation adventures :)
Merged to master. Please do not backport.
What changes were proposed in this pull request?
Allow Jenkins Python tests to use the installed conda to test Python 2.7 support & test pip installability.
How was this patch tested?
Updated shell scripts, ran tests locally with installed conda, ran tests in Jenkins.