{lib}[foss/2022a] TensorFlow v2.13.0 #18644

Flamefire
Contributor

(created using eb --new-pr)

…orFlow-2.13.0_add-default-shell-env.patch, TensorFlow-2.13.0_add-missing-snappy-function.patch, TensorFlow-2.13.0_add-missing-system-absl-py-target.patch, TensorFlow-2.13.0_add-missing-system-protobuf-targets.patch, TensorFlow-2.13.0_exclude-xnnpack-on-ppc.patch, TensorFlow-2.13.0_fix-protobuf-compatibility.patch, TensorFlow-2.13.0_fix-pybind11_protobuf.patch, TensorFlow-2.13.0_remove-io-gcs-filesystem-dep.patch, TensorFlow-2.13.0_remove-libclang-dep.patch, TensorFlow-2.13.0_revert-to-flatbuffers-2.0.6.patch, TensorFlow-2.13.0_unpin-gast-version.patch
@Flamefire Flamefire marked this pull request as draft August 25, 2023 07:57
@boegelbot
Collaborator

@Flamefire: Tests failed in GitHub Actions, see https://github.com/easybuilders/easybuild-easyconfigs/actions/runs/5976856309
Output from first failing test suite run:

FAIL: test_dep_versions_per_toolchain_generation (test.easyconfigs.easyconfigs.EasyConfigTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/easybuild-easyconfigs/easybuild-easyconfigs/test/easyconfigs/easyconfigs.py", line 888, in test_dep_versions_per_toolchain_generation
    self.assertFalse(multi_dep_vars, error_msg)
AssertionError: ['protobuf', 'protobuf-python'] is not false : No multi-variant deps found for '^.*-(?P<tc_gen>20(1[89]|[2-9][0-9])[ab]).*\.eb$' easyconfigs:

found 2 variants of 'protobuf' dependency in easyconfigs using '2022a' toolchain generation
* version: 21.9; versionsuffix:  as dep for {'TensorFlow-2.13.0-foss-2022a.eb'}
* version: 3.19.4; versionsuffix:  as dep for {'PyTorch-bundle-1.12.1-foss-2022a-CUDA-11.7.0.eb', 'tensorboardX-2.5.1-foss-2022a.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-PyTorch-1.12.0.eb', 'dorado-0.3.0-foss-2022a-CUDA-11.7.0.eb', 'dorado-0.3.1-foss-2022a-CUDA-11.7.0.eb', 'Cellpose-2.2.2-foss-2022a.eb', 'AlphaFold-2.3.1-foss-2022a-CUDA-11.7.0.eb', 'M3GNet-0.2.4-foss-2022a.eb', 'PyTorch-Geometric-2.1.0-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'PyTorch-1.12.1-foss-2022a.eb', 'RFdiffusion-1.1.0-foss-2022a.eb', 'torchsampler-0.1.2-foss-2022a.eb', 'MONAI-1.0.1-foss-2022a-CUDA-11.7.0.eb', 'captum-0.5.0-foss-2022a.eb', 'segment-anything-1.0-foss-2022a.eb', 'LayoutParser-0.3.4-foss-2022a-CUDA-11.7.0.eb', 'genomepy-0.15.0-foss-2022a.eb', 'torchvf-0.1.3-foss-2022a.eb', 'TensorFlow-2.11.0-foss-2022a.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-PyTorch-1.12.1.eb', 'e3nn-0.3.3-foss-2022a.eb', 'Cellpose-2.2.2-foss-2022a-CUDA-11.7.0.eb', 'chemprop-1.5.2-foss-2022a.eb', 'KerasTuner-1.3.5-foss-2022a.eb', 'rising-0.2.2-foss-2022a.eb', 'Ultralytics-8.0.92-foss-2022a-CUDA-11.7.0.eb', 'Ax-0.3.3-foss-2022a.eb', 'medaka-1.8.1-foss-2022a.eb', 'LayoutParser-0.3.4-foss-2022a.eb', 'chemprop-1.5.2-foss-2022a-CUDA-11.7.0.eb', 'CellOracle-0.12.0-foss-2022a.eb', 'torchvision-0.13.1-foss-2022a.eb', 'torchsampler-0.1.2-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-1.12.1-foss-2022a-CUDA-11.7.0.eb', 'Ray-project-2.2.0-foss-2022a.eb', 'rising-0.2.2-foss-2022a-CUDA-11.7.0.eb', 'pytorch-CycleGAN-pix2pix-20230314-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Lightning-1.7.7-foss-2022a-CUDA-11.7.0.eb', 'CLIP-20230220-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Lightning-1.8.4-foss-2022a-CUDA-11.7.0.eb', 'ColabFold-1.5.2-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-1.13.1-foss-2022a.eb', 'SignalP-6.0g-foss-2022a-fast-CUDA-11.7.0.eb', 'AlphaFold-2.3.4-foss-2022a-ColabFold.eb', 'PyTorch-1.12.0-foss-2022a.eb', 'ColabFold-1.5.2-foss-2022a.eb', 'GimmeMotifs-0.17.2-foss-2022a.eb', 'torchvision-0.13.1-foss-2022a-CUDA-11.7.0.eb', 
'POT-0.9.0-foss-2022a.eb', 'PyTorch-Lightning-1.8.4-foss-2022a.eb', 'MONAI-Label-0.5.2-foss-2022a-PyTorch-1.12.0.eb', 'MONAI-Label-0.5.2-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'OmegaFold-1.1.0-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.9.1-foss-2022a.eb', 'PyTorch-1.12.0-foss-2022a-CUDA-11.7.0.eb', 'MONAI-1.0.1-foss-2022a.eb', 'AlphaFold-2.3.4-foss-2022a-CUDA-11.7.0-ColabFold.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-TensorFlow-2.9.1.eb', 'torchaudio-0.12.0-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-TensorFlow-2.11.0.eb', 'PyTorch-Image-Models-0.9.2-foss-2022a-CUDA-11.7.0.eb', 'Omnipose-0.4.4-foss-2022a.eb', 'torchtext-0.14.1-foss-2022a-PyTorch-1.12.0.eb', 'fastai-2.7.10-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.9.1-foss-2022a-CUDA-11.7.0.eb', 'fastai-2.7.10-foss-2022a.eb', 'GenerativeModels-0.2.1-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Image-Models-0.9.2-foss-2022a.eb', 'pyro-ppl-1.8.4-foss-2022a.eb', 'tensorboard-2.10.0-foss-2022a.eb', 'Safetensors-0.3.1-foss-2022a.eb', 'synthcity-0.2.4-foss-2022a.eb', 'AlphaFold-2.3.1-foss-2022a.eb', 'Safetensors-0.3.1-foss-2022a-CUDA-11.7.0.eb', 'captum-0.5.0-foss-2022a-CUDA-11.7.0.eb', 'Casanovo-3.3.0-foss-2022a.eb', 'SignalP-6.0g-foss-2022a-fast.eb', 'Casanovo-3.3.0-foss-2022a-CUDA-11.7.0.eb', 'RLCard-1.0.9-foss-2022a.eb', 'dorado-0.1.1-foss-2022a-CUDA-11.7.0.eb', 'torchaudio-0.12.0-foss-2022a-PyTorch-1.12.0.eb', 'PyTorch-Lightning-1.7.7-foss-2022a.eb', 'timm-0.6.13-foss-2022a-CUDA-11.7.0.eb', 'tensorflow-probability-0.19.0-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.11.0-foss-2022a-CUDA-11.7.0.eb'}

found 2 variants of 'protobuf-python' dependency in easyconfigs using '2022a' toolchain generation
* version: 3.19.4; versionsuffix:  as dep for {'PyTorch-bundle-1.12.1-foss-2022a-CUDA-11.7.0.eb', 'tensorboardX-2.5.1-foss-2022a.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-PyTorch-1.12.0.eb', 'dorado-0.3.0-foss-2022a-CUDA-11.7.0.eb', 'dorado-0.3.1-foss-2022a-CUDA-11.7.0.eb', 'Cellpose-2.2.2-foss-2022a.eb', 'AlphaFold-2.3.1-foss-2022a-CUDA-11.7.0.eb', 'M3GNet-0.2.4-foss-2022a.eb', 'PyTorch-Geometric-2.1.0-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'PyTorch-1.12.1-foss-2022a.eb', 'RFdiffusion-1.1.0-foss-2022a.eb', 'torchsampler-0.1.2-foss-2022a.eb', 'MONAI-1.0.1-foss-2022a-CUDA-11.7.0.eb', 'captum-0.5.0-foss-2022a.eb', 'segment-anything-1.0-foss-2022a.eb', 'LayoutParser-0.3.4-foss-2022a-CUDA-11.7.0.eb', 'genomepy-0.15.0-foss-2022a.eb', 'torchvf-0.1.3-foss-2022a.eb', 'TensorFlow-2.11.0-foss-2022a.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-PyTorch-1.12.1.eb', 'e3nn-0.3.3-foss-2022a.eb', 'Cellpose-2.2.2-foss-2022a-CUDA-11.7.0.eb', 'chemprop-1.5.2-foss-2022a.eb', 'KerasTuner-1.3.5-foss-2022a.eb', 'rising-0.2.2-foss-2022a.eb', 'Ultralytics-8.0.92-foss-2022a-CUDA-11.7.0.eb', 'Ax-0.3.3-foss-2022a.eb', 'medaka-1.8.1-foss-2022a.eb', 'LayoutParser-0.3.4-foss-2022a.eb', 'chemprop-1.5.2-foss-2022a-CUDA-11.7.0.eb', 'CellOracle-0.12.0-foss-2022a.eb', 'torchvision-0.13.1-foss-2022a.eb', 'torchsampler-0.1.2-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-1.12.1-foss-2022a-CUDA-11.7.0.eb', 'Ray-project-2.2.0-foss-2022a.eb', 'rising-0.2.2-foss-2022a-CUDA-11.7.0.eb', 'pytorch-CycleGAN-pix2pix-20230314-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Lightning-1.7.7-foss-2022a-CUDA-11.7.0.eb', 'CLIP-20230220-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Lightning-1.8.4-foss-2022a-CUDA-11.7.0.eb', 'ColabFold-1.5.2-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-1.13.1-foss-2022a.eb', 'SignalP-6.0g-foss-2022a-fast-CUDA-11.7.0.eb', 'AlphaFold-2.3.4-foss-2022a-ColabFold.eb', 'PyTorch-1.12.0-foss-2022a.eb', 'ColabFold-1.5.2-foss-2022a.eb', 'GimmeMotifs-0.17.2-foss-2022a.eb', 'torchvision-0.13.1-foss-2022a-CUDA-11.7.0.eb', 
'POT-0.9.0-foss-2022a.eb', 'PyTorch-Lightning-1.8.4-foss-2022a.eb', 'MONAI-Label-0.5.2-foss-2022a-PyTorch-1.12.0.eb', 'MONAI-Label-0.5.2-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'OmegaFold-1.1.0-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.9.1-foss-2022a.eb', 'PyTorch-1.12.0-foss-2022a-CUDA-11.7.0.eb', 'MONAI-1.0.1-foss-2022a.eb', 'AlphaFold-2.3.4-foss-2022a-CUDA-11.7.0-ColabFold.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-TensorFlow-2.9.1.eb', 'torchaudio-0.12.0-foss-2022a-PyTorch-1.12.0-CUDA-11.7.0.eb', 'Horovod-0.28.1-foss-2022a-CUDA-11.7.0-TensorFlow-2.11.0.eb', 'PyTorch-Image-Models-0.9.2-foss-2022a-CUDA-11.7.0.eb', 'Omnipose-0.4.4-foss-2022a.eb', 'torchtext-0.14.1-foss-2022a-PyTorch-1.12.0.eb', 'fastai-2.7.10-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.9.1-foss-2022a-CUDA-11.7.0.eb', 'fastai-2.7.10-foss-2022a.eb', 'GenerativeModels-0.2.1-foss-2022a-CUDA-11.7.0.eb', 'PyTorch-Image-Models-0.9.2-foss-2022a.eb', 'pyro-ppl-1.8.4-foss-2022a.eb', 'tensorboard-2.10.0-foss-2022a.eb', 'Safetensors-0.3.1-foss-2022a.eb', 'synthcity-0.2.4-foss-2022a.eb', 'AlphaFold-2.3.1-foss-2022a.eb', 'Safetensors-0.3.1-foss-2022a-CUDA-11.7.0.eb', 'captum-0.5.0-foss-2022a-CUDA-11.7.0.eb', 'Casanovo-3.3.0-foss-2022a.eb', 'SignalP-6.0g-foss-2022a-fast.eb', 'Casanovo-3.3.0-foss-2022a-CUDA-11.7.0.eb', 'RLCard-1.0.9-foss-2022a.eb', 'dorado-0.1.1-foss-2022a-CUDA-11.7.0.eb', 'torchaudio-0.12.0-foss-2022a-PyTorch-1.12.0.eb', 'PyTorch-Lightning-1.7.7-foss-2022a.eb', 'timm-0.6.13-foss-2022a-CUDA-11.7.0.eb', 'tensorflow-probability-0.19.0-foss-2022a-CUDA-11.7.0.eb', 'TensorFlow-2.11.0-foss-2022a-CUDA-11.7.0.eb'}
* version: 4.21.9; versionsuffix:  as dep for {'TensorFlow-2.13.0-foss-2022a.eb'}


----------------------------------------------------------------------
Ran 17901 tests in 1044.860s

FAILED (failures=1)
ERROR: Not all tests were successful

bleep, bloop, I'm just a bot (boegelbot v20200716.01)
Please talk to my owner @boegel if you notice me acting stupid,
or submit a pull request to https://github.com/boegel/boegelbot to fix the problem.
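
The failing check above reports dependencies that appear in more than one version within a single toolchain generation. The following is a minimal sketch of that idea, not the actual test code: the `EXAMPLE_DEPS` mapping and the `find_multi_variant_deps` helper are hypothetical stand-ins, and only the regular expression is copied from the failure message.

```python
import re
from collections import defaultdict

# Hypothetical input: easyconfig filename -> list of (dep_name, dep_version).
# The real test parses easyconfig files; this small mapping is a stand-in.
EXAMPLE_DEPS = {
    "TensorFlow-2.13.0-foss-2022a.eb": [("protobuf", "21.9"), ("protobuf-python", "4.21.9")],
    "TensorFlow-2.11.0-foss-2022a.eb": [("protobuf", "3.19.4"), ("protobuf-python", "3.19.4")],
    "PyTorch-1.12.1-foss-2022a.eb": [("protobuf", "3.19.4"), ("protobuf-python", "3.19.4")],
}

# Toolchain-generation pattern copied from the failure message.
TC_GEN_REGEX = re.compile(r"^.*-(?P<tc_gen>20(1[89]|[2-9][0-9])[ab]).*\.eb$")


def find_multi_variant_deps(deps_per_easyconfig):
    """Return {(tc_gen, dep_name): {version: {easyconfigs}}} for every
    dependency that occurs in more than one version per generation."""
    seen = defaultdict(lambda: defaultdict(set))
    for ec_name, deps in deps_per_easyconfig.items():
        match = TC_GEN_REGEX.match(ec_name)
        if match is None:
            continue  # filename does not belong to a known toolchain generation
        tc_gen = match.group("tc_gen")
        for dep_name, dep_version in deps:
            seen[(tc_gen, dep_name)][dep_version].add(ec_name)
    # Keep only dependencies with two or more distinct versions.
    return {key: dict(versions) for key, versions in seen.items() if len(versions) > 1}


if __name__ == "__main__":
    for (tc_gen, dep), versions in find_multi_variant_deps(EXAMPLE_DEPS).items():
        print(f"found {len(versions)} variants of '{dep}' in '{tc_gen}' generation")
```

With the stand-in data, both `protobuf` and `protobuf-python` are flagged for the 2022a generation, which mirrors the shape of the report above.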

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 3 out of 3 (3 easyconfigs in total)
taurusml24 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/d5ff9fac2fd5fa303cfdc20a82201ace for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 4 out of 4 (3 easyconfigs in total)
taurusi8030 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/Flamefire/094b4b13865ac62c192f932ee2bb7fa2 for a full test report.

@Flamefire Flamefire marked this pull request as ready for review August 26, 2023 07:19
@Flamefire
Contributor Author

Flamefire commented Aug 26, 2023

TF 2.13 requires 'protobuf>=3.20.3,<5.0.0dev,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5', so the existing 3.19.4 isn't officially compatible.

I could double-check whether that still works anyway, or we could allow a second protobuf version, as we already have two versions for GCCcore-10.2: https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/p/protobuf/protobuf-2.5.0-GCCcore-10.2.0.eb and https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/p/protobuf/protobuf-3.14.0-GCCcore-10.2.0.eb. The switch from 3.19 to "v21" is a similarly major change.

Edit: I tested with the existing protobuf-python 3.19, and import tensorflow fails due to a missing import/symbol that was added in x.20.
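
To make the version constraint concrete, here is a simplified sketch that checks candidate versions against the requirement quoted above. It is an illustration only: `parse_version` and `satisfies_tf_2_13` are hypothetical helpers, the check handles only plain three-component releases such as 3.19.4, and "<5.0.0dev" is approximated as "below 5.0.0".

```python
def parse_version(version):
    """Turn a plain dotted release such as '4.21.9' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Bounds and exclusions transcribed from the quoted requirement.
LOWER_BOUND = (3, 20, 3)                    # protobuf>=3.20.3
UPPER_BOUND = (5, 0, 0)                     # <5.0.0dev, approximated as <5.0.0
EXCLUDED = {(4, 21, n) for n in range(6)}   # !=4.21.0 through !=4.21.5


def satisfies_tf_2_13(version):
    """Return True if the three-component version meets the quoted constraint."""
    parsed = parse_version(version)
    return LOWER_BOUND <= parsed < UPPER_BOUND and parsed not in EXCLUDED


if __name__ == "__main__":
    for candidate in ("3.19.4", "4.21.9"):
        print(candidate, satisfies_tf_2_13(candidate))
```

Under these assumptions, 3.19.4 falls below the lower bound while 4.21.9 is accepted, which matches the two protobuf-python variants discussed in this thread.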

@casparvl
Contributor

Test report by @casparvl
FAILED
Build succeeded for 5 out of 6 (3 easyconfigs in total)
gcn1.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 520.61.05, Python 3.6.8
See https://gist.github.com/casparvl/610d066777c557b8d25eb32e8065f984 for a full test report.

@boegel boegel added this to the 4.x milestone Aug 30, 2023
@boegel boegel added the update label Aug 30, 2023
@Flamefire
Contributor Author

@casparvl

Server terminated abruptly (error code: 14, error message: 'Socket closed', log file: '/gpfs/nvme1/1/casparl/ebbuildpath/TensorFlow/2.13.0/foss-2022a/TensorFlow/bazel-root/5a8230a74ad7cedfe33ddb67e7867fd2/server/jvm.out')

Looks like an environment issue to me, e.g. not enough space, job killed, ...
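
One of the possible causes mentioned here, insufficient disk space, can be ruled out quickly before rebuilding. The sketch below is a generic check, not part of EasyBuild or Bazel: `has_free_space` is a hypothetical helper, and both the build path and the threshold in the usage example are placeholders to adapt.

```python
import shutil


def has_free_space(path, required_gib):
    """Return True if the filesystem holding `path` has at least
    `required_gib` GiB available to the current user."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= required_gib * 1024**3


if __name__ == "__main__":
    build_path = "."   # placeholder: point this at the EasyBuild build path
    threshold = 50     # placeholder: assumed GiB needed, not a measured value
    print(f"enough space in {build_path!r}: {has_free_space(build_path, threshold)}")
```

This covers only the disk-space hypothesis; a killed job would need to be confirmed from the scheduler or system logs instead.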

@boegel boegel modified the milestones: 4.x, next release (4.8.2?) Sep 12, 2023
@easybuilders easybuilders deleted a comment from boegelbot Sep 12, 2023
('nsync', '1.25.0'),
('SQLite', '3.38.3'),
('patchelf', '0.15.0'),
('protobuf-python', '4.21.9'),
Member

This introduces a new variant of protobuf-python as dependency in the 2022a generation of easyconfigs.

We can add an exception in the test suite for this if needed, but is it worth pursuing this since we already have TensorFlow 2.13.0 with foss/2022b and foss/2023a?

Contributor Author

Not sure. Given that I found and worked around the Bazel issue in 2022b+, we could also skip this to avoid possible confusion or mix-ups between module versions.

@Flamefire
Contributor Author

Closing since we have this for 2022b & 2023a

@Flamefire Flamefire deleted the 20230825095718_new_pr_TensorFlow2130 branch April 16, 2024 10:14