[rabit harden] include osx in tests, address time_wait on port assignment, enable all tests #90
enable xgb-tests; use chenqin/xgboost:master with updated path; port packages from xgb; enable test on osx
have we fixed the test mentioned in #86 (comment)?
That should be another PR after we update xgboost master. The rationale is that neither the previous PR nor this one changes how rabit functions, other than reshuffling parameters and deleting redundant code. I already have some ideas for fixing the flaky test by introducing an extra check in allreduce_robust before it resets links indefinitely. But yeah, I think that should be a separate thing after we have a good baseline: dmlc/xgboost#4352
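To make the "extra check before resetting links indefinitely" idea concrete, here is a minimal Python sketch of a bounded recovery loop. It is not rabit's allreduce_robust code and not necessarily the fix proposed above; `run_with_bounded_recovery`, `reset_links`, `max_resets`, and `RecoveryFailed` are hypothetical names used only for illustration.

```python
class RecoveryFailed(RuntimeError):
    """Raised when the operation still fails after the allowed number of resets."""


def run_with_bounded_recovery(operation, reset_links, max_resets=5):
    """Run operation(); on ConnectionError, reset links and retry.

    Hypothetical helper: gives up after max_resets resets instead of
    looping forever, so a broken peer surfaces as an error, not a hang.
    """
    resets = 0
    while True:
        try:
            return operation()
        except ConnectionError as err:
            if resets >= max_resets:
                # The extra check: stop instead of resetting links again.
                raise RecoveryFailed(
                    f"gave up after {resets} link resets"
                ) from err
            reset_links()
            resets += 1
```

With a cap like this, a test that would otherwise hang until the CI timeout fails quickly with an explicit error, which is easier to diagnose.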
remove xgb java tests
For some reason, trybind actually allows two processes to bind to the same port.
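For context, a minimal Python sketch of one way to probe for a free port without leaving a window for a second process to grab it. This is an assumption-laden illustration, not the trybind implementation discussed here: `try_bind` and `find_free_port` are hypothetical helpers, and the port range is arbitrary. The idea is to keep the bound, listening socket open and hand it to the caller, rather than closing it and re-binding later.

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Return a listening socket bound to (host, port), or None if taken.

    Hypothetical helper: SO_REUSEADDR / SO_REUSEPORT are deliberately not
    set, so a second bind to the same address fails with an OSError.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        sock.close()
        return None


def find_free_port(start, end, host="127.0.0.1"):
    """Scan [start, end) and return (port, socket) for the first free port.

    The socket stays open so the port remains reserved for the caller.
    """
    for port in range(start, end):
        sock = try_bind(port, host)
        if sock is not None:
            return port, sock
    raise RuntimeError(f"no free port in range {start}-{end - 1}")
```

A caller would use the returned socket directly and close it only when done, for example `port, sock = find_free_port(20000, 20200)`.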
Done
The osx build error was caused by a package download failing; it should be fine on rerun. local build here.
@chenqin thanks, restarted the test.
all tests green @CodingCat @trivialfis
check my comments about mpi in osx
Hmm... MPI on Mac doesn't compile. Did we miss some dependency? The test itself is also still flaky: I restarted the Linux one (I thought it was a Travis issue) but later found the Mac one stuck as well (I didn't restart it, so you can investigate further).
It seems the brew-installed MPI doesn't come with the cxx binding, so it needs to be recompiled from source every time. Hopefully that can be done within the 10 minutes before Travis times out. Found this blog; going to try this path.
Will take a look, this time it is mac
isn't the test still hanging?
We need to fix dmlc-core first.
looks like we have very consistent behavior in linux with pip3
friendly ping
Thanks for cleaning the MPI tests. We should create an example with MPI in the future.
Merge. I hope you don't mind me changing the commit message. ;-) I'm more worried about the dependency on
Follow-up PR to clean up the dmlc-core header file copy: https://github.com/chenqin/rabit/commit/ecd4bf7aaee477cea6ac14aa52f8276192f085f6
More detail on the deadlock root cause can be found in "[rabit harden] address subprocess exit deadlock" (dmlc-core#524).