{lib}[fosscuda/2019b] TensorFlow v2.4.1 w/ Python 3.7.4 #11637

Merged

Flamefire
Contributor

@Flamefire Flamefire commented Nov 9, 2020

(created using eb --new-pr)

The new Bazel version is not strictly required, but this build does require a statically built Bazel, and the easiest way to enforce that is to require a new version built with easybuilders/easybuild-easyblocks#2285.
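
For illustration, a minimal sketch of how such a requirement is usually expressed in an easyconfig (easyconfigs use Python syntax). The version string below is a placeholder and is not taken from this PR; the point is only that pinning a Bazel version with no existing installation forces a fresh build with the updated easyblock.

```python
# Fragment of a hypothetical TensorFlow easyconfig, not the file from this PR.
# 'X.Y.Z' is a placeholder: use a Bazel version that has not been installed yet,
# so it gets (re)built, statically, by the updated Bazel easyblock.
builddependencies = [
    ('Bazel', 'X.Y.Z'),
]
```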

@Flamefire Flamefire marked this pull request as draft November 9, 2020 08:17
@Micket Micket added the update label Nov 12, 2020
@Micket Micket added this to the 4.3.2 milestone Nov 12, 2020
@boegel boegel modified the milestones: 4.3.2 (next release), 4.4.0 Dec 8, 2020
@Flamefire Flamefire marked this pull request as ready for review December 22, 2020 12:19
…nd patches: TensorFlow-2.4.0_add-default-shell-env.patch, TensorFlow-2.4.0_add-protobuf-deps.patch, TensorFlow-2.4.0_downgrade-required-versions.patch, TensorFlow-2.4.0_fix-eigen-on-power.patch, TensorFlow-2.4.0_replace-exectools-to-tools.patch
@Flamefire Flamefire force-pushed the 20201109091652_new_pr_TensorFlow240 branch from d911285 to b9af6d4 Compare December 22, 2020 16:37
@easybuilders easybuilders deleted a comment from boegelbot Dec 28, 2020
@Flamefire
Contributor Author

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusml1 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/16b267207e3c3348ce78dbb366334337 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusa14 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, Python 2.7.5
See https://gist.github.com/02c3f1718f75b1f08012d37e9b8f293d for a full test report.

@Flamefire
Contributor Author

Just got a notification that TF 2.4.1 has been released. It seems to only improve compatibility, so I'd update this PR to it if that works: https://github.com/tensorflow/tensorflow/releases/tag/v2.4.1
Some easyblock changes are still pending so I guess that is fine?

@Flamefire Flamefire changed the title from "{lib}[fosscuda/2019b] TensorFlow v2.4.0 w/ Python 3.7.4" to "{lib}[fosscuda/2019b] TensorFlow v2.4.1 w/ Python 3.7.4" Jan 22, 2021
@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusml14 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/924ce7387a08acb08d69b4df85a4f535 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (4 easyconfigs in total)
taurusa6 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, Python 2.7.5
See https://gist.github.com/24f8a260a9a198e7f0dbfd751bfaa1f2 for a full test report.

@Flamefire Flamefire force-pushed the 20201109091652_new_pr_TensorFlow240 branch from f3424c7 to 5ee669b Compare February 5, 2021 17:45
@Flamefire Flamefire force-pushed the 20201109091652_new_pr_TensorFlow240 branch from 5ee669b to 229e342 Compare February 5, 2021 17:46
@boegel
Member

boegel commented Feb 12, 2021

@boegelbot please test @ generoso
EB_ARGS="--include-easyblocks-from-pr 2312"
CORE_CNT=16

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11637 EB_ARGS="--include-easyblocks-from-pr 2312" /apps/slurm/default/bin/sbatch --job-name test_PR_11637 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13808

Test results coming soon (I hope)...

- notification for comment with ID 778280781 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).
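
For readers unfamiliar with the request format used above, here is a rough sketch of how a test-request comment could be split into its target host and options. This is not boegelbot's actual implementation; the function name `parse_test_request` and the returned dictionary layout are assumptions made for illustration only.

```python
import re
import shlex


def parse_test_request(comment):
    """Split a '@boegelbot please test @ <host>' comment into host and options.

    Hypothetical helper: it only mirrors the comment format seen in this thread.
    """
    lines = [ln.strip() for ln in comment.strip().splitlines() if ln.strip()]
    match = re.match(r"@boegelbot please test @ (\S+)$", lines[0])
    if match is None:
        raise ValueError("not a test request")
    request = {"host": match.group(1), "EB_ARGS": "", "CORE_CNT": ""}
    for line in lines[1:]:
        key, sep, value = line.partition("=")
        if sep and key in ("EB_ARGS", "CORE_CNT"):
            # shlex.split drops the surrounding quotes from values like "--foo 1"
            request[key] = " ".join(shlex.split(value))
    return request


comment = (
    "@boegelbot please test @ generoso\n"
    'EB_ARGS="--include-easyblocks-from-pr 2312"\n'
    "CORE_CNT=16"
)
request = parse_test_request(comment)
print(request["host"], request["EB_ARGS"], request["CORE_CNT"])
```

The parsed values correspond to the pieces visible in the submitted command: the host selects the upload script, `EB_ARGS` is passed through to `eb`, and `CORE_CNT` becomes the `--ntasks` value.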

@boegel
Member

boegel commented Feb 12, 2021

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3501.doduo.os - Linux RHEL 8.2, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/2b9f63e3ba99020141c068774cc93ca0 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
taurusml2 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), Python 2.7.5
See https://gist.github.com/b61ca5accc98def0a15b0856f9aed018 for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/fd80d4d1d4283ec2290036a78ceccd96 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
taurusa6 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz (broadwell), Python 2.7.5
See https://gist.github.com/9d8e98ff6f743385727fcf64be847e14 for a full test report.

@boegel
Member

boegel commented Feb 13, 2021

Test report by @boegel
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
FAILED
Build succeeded for 3 out of 4 (3 easyconfigs in total)
node3112.skitty.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, Python 3.6.8
See https://gist.github.com/6c3578797ca627336604563119bad2aa for a full test report.

@branfosj
Member

Test report by @branfosj
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#2312
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
bear-pg0212u15b.bear.cluster - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (broadwell), Python 3.6.8
See https://gist.github.com/f780b16ac0044882a32b86b38ac6e490 for a full test report.

@boegel
Member

boegel commented Feb 13, 2021

Still several problematic tests, it seems...

All this testing does show that easybuilders/easybuild-easyblocks#2312 is working as intended though, so I'll go ahead and merge that.

@Flamefire
Contributor Author

@branfosj The error from your run is:

Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES

This could be fixed by enabling permissions, see tensorflow/tensorflow#35860 (comment)

Maybe we should exclude that test?
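
To tell this kind of environment-related failure apart from a genuine test failure, one could scan the test output for the CUPTI error code. The sketch below is only an illustration: the helper name `cupti_privilege_errors` is made up here, and the sample log reuses the error line quoted above.

```python
from typing import List

# Error code reported when the driver restricts GPU profiling to privileged users.
CUPTI_MARKER = "CUPTI_ERROR_INSUFFICIENT_PRIVILEGES"


def cupti_privilege_errors(log_text: str) -> List[str]:
    """Return the log lines that mention the CUPTI privilege error (hypothetical helper)."""
    return [line.strip() for line in log_text.splitlines() if CUPTI_MARKER in line]


sample_log = """\
some unrelated test output
Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
more unrelated test output
"""

hits = cupti_privilege_errors(sample_log)
if hits:
    print("Failure is caused by restricted profiling permissions on this host:")
    for line in hits:
        print("  " + line)
```

A test that fails only with this error depends on how the host is configured rather than on the build itself, which is the argument for excluding it.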

@branfosj
Member

Yes, we should exclude that test.

@Flamefire
Contributor Author

Done. Could you rerun the test?

@branfosj
Member

Test report by @branfosj
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
bear-pg0212u17a.bear.cluster - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (broadwell), Python 3.6.8
See https://gist.github.com/859596eabe55031041f29db1b232277a for a full test report.

@boegel
Member

boegel commented Feb 15, 2021

Test report by @boegel
FAILED
Build succeeded for 2 out of 3 (3 easyconfigs in total)
node3309.joltik.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), Python 3.6.8
See https://gist.github.com/2867327332eb6f18ba5dc785be7a1d43 for a full test report.

@boegel
Member

boegel commented Feb 15, 2021

@boegelbot please test @ generoso
CORE_CNT=16

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11637 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11637 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13831

Test results coming soon (I hope)...

- notification for comment with ID 779390737 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegel
Member

boegel commented Feb 15, 2021

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3501.doduo.os - Linux RHEL 8.2, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.6.8
See https://gist.github.com/8e9e073349b93d15b4bb182c358bc852 for a full test report.

@boegel
Member

boegel commented Feb 15, 2021

Test report by @boegel
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
node3309.joltik.os - Linux centos linux 7.9.2009, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), Python 3.6.8
See https://gist.github.com/fce43524a9fa9e5984e0642992783ca8 for a full test report.

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
generoso-x-2 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/20723cd20cad65250f8e1d6c40f40311 for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Feb 16, 2021

Going in, thanks @Flamefire!

@boegel boegel merged commit 67278ae into easybuilders:develop Feb 16, 2021
@Flamefire Flamefire deleted the 20201109091652_new_pr_TensorFlow240 branch February 16, 2021 17:43
@boegel boegel mentioned this pull request Feb 16, 2021