{lib}[fosscuda/2019b] TensorFlow v2.2.0 w/ Python 3.7.4 #10600

Merged

Flamefire
Contributor

@Flamefire Flamefire commented May 8, 2020

(created using eb --new-pr)

Can be used as a starting point for fosscuda/2020a once CUDA is out. Not sure if another TF version is acceptable for fosscuda/2019b but it has multiple already and it seems fosscuda/2020a will take a while...

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in this PR)
taurusml30 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/4b97825babefb2309a358ac8b46c082d for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusa6 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, Python 2.7.5
See https://gist.github.com/90fd5878352f360eeeeef71c3706dc8b for a full test report.

@Micket
Contributor

Micket commented May 8, 2020

Test report by @Micket
SUCCESS
Build succeeded for 3 out of 3 (2 easyconfigs in this PR)
vera-c1 - Linux centos linux 7.7.1908, Intel Xeon Processor (Skylake), Python 2.7.5
See https://gist.github.com/7392c8578c72373d0dd82c5246117d3a for a full test report.

@JackPerdue
Contributor

JackPerdue commented May 12, 2020

Might I suggest adding an example of how to set cuda_compute_capabilities?

  1. at present it uses a default value
  2. it isn't at all clear that the TensorFlow extension uses the tensorflow.py easyblock
  3. it isn't at all clear (without a long search of readthedocs) how to add it

Here, I just did a test by adding the setting to the TensorFlow extension block (at the bottom):
'cuda_compute_capabilities': ['3.5', '3.7'], # for Tesla K20 (ada) and K80 (terra)

For your easyconfig, I'd recommend the usual EB default of something like:
'cuda_compute_capabilities': ['3.0', '3.2', '3.5', '3.7', '5.0', '5.2', '6.0', '6.1', '7.0']

Now then... if I had my druthers, I'd explicitly indicate the easyblock it is using. Of course, I prefer that in all easyconfigs. Alas, the EB team disagrees and won't accept easyconfigs that explicitly list the easyblock used, relying instead on some "magic" that always infers the easyblock from whatever is being built. But maybe they will allow it in things like PythonBundle (I just assumed it was using PythonPackage for extensions).

In any case, providing cuda_compute_capabilities would be a good idea.
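
For readers unfamiliar with the layout, the test described above (adding the setting to the TensorFlow extension's options) might look roughly like the following easyconfig fragment. This is a minimal sketch: it assumes the extension is listed in `exts_list` as a `(name, version, options)` tuple, and the other options of the real entry (sources, patches, checksums) are deliberately left out.

```python
# Sketch of an easyconfig fragment (easyconfigs use Python syntax).
# Only the extension entry relevant to this discussion is shown; the
# real entry carries further options that are omitted here.
exts_list = [
    ('TensorFlow', '2.2.0', {
        # Site-specific values: Tesla K20 (3.5) and K80 (3.7) in this example.
        'cuda_compute_capabilities': ['3.5', '3.7'],
    }),
]
```

The values are specific to the GPUs at one site, so other sites would substitute the compute capabilities of their own hardware.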

@Flamefire
Contributor Author

@JackPerdue

  1. Using a default value is fine
  2. It is clear: no EasyBlock is explicitly specified, which means "tensorflow" uses the EasyBlock "TensorFlow"
  3. eb --help

> For your easyconfig, I'd recommend the usual EB default of something like:
> 'cuda_compute_capabilities': ['3.0', '3.2', '3.5', '3.7', '5.0', '5.2', '6.0', '6.1', '7.0']

That would be extremely wasteful. Each site has to configure this for its own hardware, just like any other site-specific config variable. This is not for the EasyConfig to decide.

> Alas, the EB team disagrees and won't accept easyconfigs that explicitly list the easyblock used, relying instead on some "magic" that always infers the easyblock from whatever is being built.

What's wrong with that? If nothing is specified, the name is used. It couldn't get simpler.
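
The name-based rule mentioned here can be illustrated with a small sketch. It is a simplification written for this thread, not EasyBuild's actual code: the helper names `infer_easyblock` and `easyblock_module_path` are hypothetical, only plain alphanumeric software names are handled, and the `EB_` class prefix and per-letter module layout are stated as I understand the convention.

```python
def infer_easyblock(name, easyblock=None):
    """Return the easyblock class name for a piece of software.

    Hypothetical helper: an explicitly given easyblock wins; otherwise
    the class name is derived from the software name.
    """
    if easyblock:
        return easyblock
    if not name.isalnum():
        # Names with special characters need extra encoding, which this
        # simplified sketch does not attempt.
        raise ValueError("only alphanumeric names are handled in this sketch")
    return "EB_" + name


def easyblock_module_path(name):
    """Return the assumed location of a software-specific easyblock module."""
    lower = name.lower()
    return "easybuild/easyblocks/%s/%s.py" % (lower[0], lower)


print(infer_easyblock("TensorFlow"))
print(easyblock_module_path("TensorFlow"))
print(infer_easyblock("SomePackage", easyblock="PythonPackage"))
```

Under these assumptions, an entry named TensorFlow with no easyblock specified maps to the class in tensorflow.py, which is the behaviour point 2 above describes.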

> In any case, providing cuda_compute_capabilities would be a good idea.

If you don't override it, TF uses a default, which isn't that bad. Not sure what it was exactly, but IIRC it is something like 3.5, 7.0.
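
To make the override concrete: TensorFlow's configure step reads the target compute capabilities from the `TF_CUDA_COMPUTE_CAPABILITIES` environment variable as a comma-separated string. The sketch below shows how a list such as the one discussed above could be turned into that string. The helper name `tf_capabilities_value` is hypothetical, and whether the easyblock performs exactly this conversion is an assumption.

```python
import re


def tf_capabilities_value(capabilities):
    """Join compute capabilities into a comma-separated string.

    Hypothetical helper: each entry must look like '<major>.<minor>'.
    """
    for cap in capabilities:
        if not re.fullmatch(r"\d+\.\d+", cap):
            raise ValueError("not a valid compute capability: %r" % cap)
    return ",".join(capabilities)


# Example: the value a site with K20/K80 GPUs might export as
# TF_CUDA_COMPUTE_CAPABILITIES before building.
print(tf_capabilities_value(["3.5", "3.7"]))
```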

@Micket Micket added the update label May 14, 2020
@Micket Micket added this to the next release (4.2.1?) milestone May 14, 2020
@Micket
Contributor

Micket commented May 14, 2020

Regarding cuda_compute_capabilities: if it hasn't been set, a verbose warning should be printed to the terminal and to the log, with instructions on what to do.
There was a confusing typo in the warning, though, which will be fixed in the next release:
easybuilders/easybuild-easyblocks#2057

@Micket
Contributor

Micket commented May 14, 2020

Test report by @Micket
SUCCESS
Build succeeded for 3 out of 3 (2 easyconfigs in this PR)
hebbe-c1 - Linux centos linux 7.7.1908, Intel Core Processor (Haswell, no TSX), Python 2.7.5
See https://gist.github.com/63712b8fd24eb871ac805bd9e26af3c8 for a full test report.

Contributor

@Micket Micket left a comment

lgtm

@Micket
Contributor

Micket commented May 14, 2020

Going in, thanks @Flamefire!

@Micket Micket merged commit 40869de into easybuilders:develop May 14, 2020
@Flamefire Flamefire deleted the 20200508100357_new_pr_TensorFlow220 branch May 14, 2020 16:19
@easybuilders easybuilders deleted a comment from boegelbot Aug 3, 2020