add patches to fix PyTorch 1.12 (CPU-only) w/ foss/2022a on POWER #18490

Merged

@Flamefire
Contributor

Flamefire commented Aug 8, 2023

(created using eb --new-pr)

This adds patches to fix failures that occur when FBGEMM isn't used, which AFAIK is only the case on PPC. In particular, it includes the forward-ported patch from #18489 and a new patch for a bug in a feature introduced in PyTorch 1.12.0.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusi8024 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/Flamefire/a1607ccafcc628f7f842e6d360687fc8 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 3 out of 4 (4 easyconfigs in total)
taurusml5 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/da380ec41c9a25320533198c9f81311f for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusml5 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/133c1119453657bbc43422e4c24bf0b4 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusml24 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/21b3c8dd4a03817b64e50d524b816eb3 for a full test report.

@branfosj
Member

Test report by @branfosj
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2983
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
bear-pg0105u03a.bear.cluster - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/f93fc9d14a3b42dd6b9d2bd54c26f1ab for a full test report.

@boegel boegel changed the title from "Fix PyTorch 1.12 (CPU version) on POWER" to "add patches to fix PyTorch 1.12 (CPU-only) w/ foss/2022a on POWER" Aug 15, 2023
@boegel boegel added the bug fix label Aug 15, 2023
@boegel boegel added this to the next release (4.8.1?) milestone Aug 15, 2023
@boegel
Member

boegel commented Aug 17, 2023

Going in, thanks @Flamefire!

@boegel boegel merged commit 4727995 into easybuilders:develop Aug 17, 2023
@Flamefire Flamefire deleted the 20230808142148_new_pr_PyTorch1120 branch August 22, 2023 16:16