-
-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated ANTs binaries to solve slow processing in some systems #2642
Conversation
In response to the further report I've rebuilt the Linux binaries with static linking:
The files ended up in ANTS-build/Examples (not in bin/ like
It took over 2 hours to build all of ANTs, and I did it twice to make sure I'd done it right. There is a way to only build the programs we need, but so much of the build time was from ANTs compiling the ITK and VTK libraries from source and I doubt that can be avoided without also reworking its build script. Anyway, my binaries are at https://osf.io/r59va. I will ask our user to give them a try :) By the way, I did not include any of {isct_dice_coefficient,isct_propseg,isct_spine_detect,isct_train_svm} because those files weren't actually rebuilt in the version from https://osf.io/568y4/; you can check by comparing file hashes from that upload against a fresh SCT install: From @jcohenadad's upload at https://osf.io/568y4/:
And from https://osf.io/bt58d/:
You can see that these hashes haven't changed:
|
good decision-- there was no issue with these files so i decided to not recompile them. |
Where do |
I said this on Slack but for the record I don't think static building is a good long term solution to being cross-platform. Libraries only form one part of a platform; there's many other details like syscalls and endianness and word sizes to account for. The most reliable way to build for a particular platform is to build on that platform -- notwithstanding recent advances in cross-compiling that have made this easier but still not trivial. I am trying to prototype a cross-platform builder at kousu/ANTs#1. If I get it working, we should be able to cover all versions of Linux, OS X and Windows with the same build script, and have it done online where the whole team can watch the progress. |
Ah! great question. I've just added a new page on the wiki: https://github.com/neuropoly/spinalcordtoolbox/wiki/Compile-binaries |
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642
2774180
to
371dcac
Compare
I've updated the PR with new binaries, that are compiled properly, to the best of my knowledge. I made them using Github's CI, which makes them reproducible and conveniently downloadable. I still had to do a bunch of manual work to get them packed and distributed; that can be automated in a later PR. The ANTS binaries included in the updated links are from
The Linux build was done on centos7, even though my build script actually did a build for several distros. I've tried them (using Docker) on Ubuntu 14.04 through 18.04, Arch, and centos 7 and 8. They run everywhere. I learned this trick from python which considers "linux" binaries to be something built on centos 5, 6 or 7 with minimal dependencies -- basically just libc and everything else statically linked in -- which the ANTS programs are. The idea is that centos changes the slowest and forward-compatibility is easier than backwards-compatibility. I haven't tested their numerical stability; I haven't discovered how to do that directly, but hopefully the tests in batch_processing.sh will be able to tell us if anything changed drastically. This also has a couple of commits of related clean-up. TODO:
How The Files Were BuiltFirst I got into SCT's venv and made an empty workspace:
I found the latest binary URLs in the source code: and then downloaded them directly.
Do the same for my build:
Now I have these packages to merge:
Here's the unpacked 2019-09-30 version:
Now here's what our package looks like, notice that the dates have been updated:
Verify that they are sensible:
Finally, repack them:
|
I didn't set the permissions on OSF.io right. Reopening to retrigger Travis. |
e951d06
to
597ef72
Compare
Drat. On Linux and OS X 10.14 all the tests passed but for testing/test_sct_dmri_moco.py. It's giving these errors:
1 and 2 are, I'm guessing, a numerical stability issue (maybe we should be using
On OS X 10.13 and 10.12 it's failing with
which apparently is fixable by recompiling with |
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642
ac15dc2
to
982767b
Compare
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642
9715e96
to
acfb78a
Compare
hum, the GT and results are quite different (some rows have sign differences). I would like to visually inspect the quality of the registration with these new binaries.
this is likely related to a lack of overlap between the source and dest. Again, possibly related to poor registration performance, which we need to assess visually. Do you experience this issue in all OSs? (except OSX 10.13 and 10.12 as you mentioned) |
The mismatch with "ground truth" could be due to changes in ANTs that tweaked the default parameters and randomized some others: ANTsX/ANTs#976. We could try in particular setting the sampling density to "full" and seeing if that repairs the issue. |
Excellent, compiling the Mac build with
It's notable that the results change between each OS, though they are similar; I suspect this is actually that the results changing each run due to the randomization. When this run is done I'll trigger a re-run and record the results again. and then we will be able to see that, probably, the new results come out similarly but not identically. |
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642 and Issue #2669.
Upon upgrading our ANTS build we found that motion correction was no longer deterministic (#2642 (comment)). The ANTS maintainers suggested setting samplingStrategy=None (ANTsX/ANTs#976 (comment)) to get deterministic output, but they also warned that this will induce a variance-for-bias tradeoff: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTsR#210 (comment) * Thévenaz, P., Bierlaire, M., & Unser, M. (2008). Halt on sampling for image registration based on mutual information. Sampling Theory in Signal and Image Processing, 7(2). <http://bigwww.epfl.ch/preprints/thevenaz0602p.pdf> Because we are hard-coding the sampling rate I removed it from ParamMoco entirely. If we want to support it properly we would have to pass .sampling_strategy and .sampling_percentage separately.
This is also to make the output deterministic, at the cost of running slow. It turned out that using dense sampling wasn't enough; there was still some numerical instability that came from the order of addition: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues#variance-due-to-floating-point-precision-errors For some reason it only appeared on OS X, and only about 10% of the time, and never on Linux. I [showed](#2642 (comment)) that the instability in isct_antsSliceRegularizedRegistration did exist on Linux, so something still unknown about how we call it was hiding it there. This was actually supposed to be in place already but the code had atrophied, so all this does is fix it up. See: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTs#444 (comment) * ANTsX/ANTsR#210 (comment)
016882d
to
ea2ddb1
Compare
Previously we treated linux as two kinds of beasts: Debian and CentOS 6. But we conflated Debian with all other linuxes, so really it was "linux" and "centos6". Going forward, I want to do builds the way PyPI does with manylinux [1]: the oldest CentOS is considered linux full stop. With my recent rebuild, that is CentOS 7. Update the package name to emphasize this clearly. I picked CentOS 7 because that's the oldest version that ANTS will build on, but I'm temporarily keeping a special case for CentOS 6 for the next ~6 months until it end-of-lifes. [1] https://github.com/pypa/manylinux
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642 and Issue #2669.
Upon upgrading our ANTS build we found that motion correction was no longer deterministic (#2642 (comment)). The ANTS maintainers suggested setting samplingStrategy=None (ANTsX/ANTs#976 (comment)) to get deterministic output, but they also warned that this will induce a variance-for-bias tradeoff: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTsR#210 (comment) * Thévenaz, P., Bierlaire, M., & Unser, M. (2008). Halt on sampling for image registration based on mutual information. Sampling Theory in Signal and Image Processing, 7(2). <http://bigwww.epfl.ch/preprints/thevenaz0602p.pdf> Because we are hard-coding the sampling rate I removed it from ParamMoco entirely. If we want to support it properly we would have to pass .sampling_strategy and .sampling_percentage separately.
This is also to make the output deterministic, at the cost of running slow. It turned out that using dense sampling wasn't enough; there was still some numerical instability that came from the order of addition: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues#variance-due-to-floating-point-precision-errors For some reason it only appeared on OS X, and only about 10% of the time, and never on Linux. I [showed](#2642 (comment)) that the instability in isct_antsSliceRegularizedRegistration did exist on Linux, so something still unknown about how we call it was hiding it there. This was actually supposed to be in place already but the code had atrophied, so all this does is fix it up. See: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTs#444 (comment) * ANTsX/ANTsR#210 (comment)
In updating antsSliceRegularizedRegistration, and its samplingStrategy to get the new version to be as deterministic as the 2016 version, the estimates it gives on the test data have shifted. This updates the snapshots we had of the values to the new values. I am unsure if this is right. Maybe the old estimates were better?
ea2ddb1
to
d2c63fa
Compare
This reverts commit d77b77a.
Upon upgrading our ANTS build we found that motion correction was no longer deterministic (#2642 (comment)). The ANTS maintainers suggested setting samplingStrategy=None (ANTsX/ANTs#976 (comment)) to get deterministic output, but they also warned that this will induce a variance-for-bias tradeoff: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTsR#210 (comment) * Thévenaz, P., Bierlaire, M., & Unser, M. (2008). Halt on sampling for image registration based on mutual information. Sampling Theory in Signal and Image Processing, 7(2). <http://bigwww.epfl.ch/preprints/thevenaz0602p.pdf> I think ideally moco.py would thinly wrap antsRegistration, and expose both .sampling_strategy and .sampling_percentage directly, but until now we've always hardcoded 'samplingStrategy=Regular' so in the interest of not rocking the boat too much all I did was add a 'sampling=None' case, *and set it as the default* Updated version re feedback: #2642 (comment)
Performance/eyeballing comparisons: Macbook progit-master-6f0c8271d9bf842a61d1325732ec997c74d9aac3
git-jca/947-new-ants-binaries-4ea484182fd7a7f334ff946f28fd1715ae385ade
Visual inspection of the registrations (for moco, template registration and inter-modal registration): I see negligible differences. Ubuntu (joplin)git-master-6f0c8271d9bf842a61d1325732ec997c74d9aac3
git-jca/947-new-ants-binaries-4ea484182fd7a7f334ff946f28fd1715ae385ade
awesome fix @kousu! Can we merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please. 🚀
These binaries were built via a Github Workflow. The Linux binaries were built on centos7, which is the oldest Linux that ANTS will still compile on. The MacOS binaries were built on the Github MacOS VMs. The other binaries were copied unchanged from the previous binaries. For full details see Github PR #2642 and Issue #2669.
Upon upgrading our ANTS build we found that motion correction was no longer deterministic (#2642 (comment)). The ANTS maintainers suggested setting samplingStrategy=None (ANTsX/ANTs#976 (comment)) to get deterministic output, but they also warned that this will induce a variance-for-bias tradeoff: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTsR#210 (comment) * Thévenaz, P., Bierlaire, M., & Unser, M. (2008). Halt on sampling for image registration based on mutual information. Sampling Theory in Signal and Image Processing, 7(2). <http://bigwww.epfl.ch/preprints/thevenaz0602p.pdf> Because we are hard-coding the sampling rate I removed it from ParamMoco entirely. If we want to support it properly we would have to pass .sampling_strategy and .sampling_percentage separately.
This is also to make the output deterministic, at the cost of running slow. It turned out that using dense sampling wasn't enough; there was still some numerical instability that came from the order of addition: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues#variance-due-to-floating-point-precision-errors For some reason it only appeared on OS X, and only about 10% of the time, and never on Linux. I [showed](#2642 (comment)) that the instability in isct_antsSliceRegularizedRegistration did exist on Linux, so something still unknown about how we call it was hiding it there. This was actually supposed to be in place already but the code had atrophied, so all this does is fix it up. See: * https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues * ANTsX/ANTs#444 (comment) * ANTsX/ANTsR#210 (comment)
When running it in some systems, motion correction is very slow. The solution was to re-build ANTs binaries on Ubuntu 16.04 and OSX. Fixes #947
In this PR, we also take the opportunity to review the various distros that cause problems with some of the Python's libraries, causing problems such as #2523.
TODO:
Compatibility of new Linux binaries (compiled on Ubuntu 16.04) with CentOS:
CentOS8 --> OK
CentOS6 --> FAIL
Compatibility of Python's libraries
CentOS8 --> OK
CentOS6 --> FAIL
testing
data + syntax:
sct_download_data -d sct_example_data cd sct_example_data/mt sct_register_multimodal -i mt1.nii.gz -d mt0.nii.gz -param step=1,type=im,algo=slicereg,iter=50 -v 2
before merging
Please run
batch_processing.sh
and report output numbers, as done for each release or PRs that likely impact those numbers. If numbers are different it would be nice to report it with the label "compatibility API/breakage"Depends on: #2699, and #2690.