Torch>=2.0 Unittest support #216

Merged
merged 11 commits into from
Aug 30, 2024

Collaborator

@ktsitsi ktsitsi commented Aug 27, 2024

This PR:

  • Extends the CI test suite with tests for torch>=2.0 across all Python versions.

This can unblock any pending work that requires the aforementioned dependency.

Contributor

@bkmartinjr bkmartinjr left a comment

While we are touching this file, we should update the action versions, as they are generating warnings.

The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/

The latest are:

  • actions/checkout@v4
  • actions/setup-python@v5
  • actions/cache@v4
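
One way to spot such pins before the warnings show up is to scan the workflow file for `uses:` lines. The following is a minimal sketch, not part of this PR: the function name `find_outdated_actions` and the `LATEST` table are hypothetical, the table holds only the three versions listed above, and the comparison is a plain string match, so a pin such as `v4.1.0` or a commit SHA would also be reported.

```python
import re

# Versions named in the review comment above as the latest at the time.
LATEST = {
    "actions/checkout": "v4",
    "actions/setup-python": "v5",
    "actions/cache": "v4",
}

# Matches e.g. "uses: actions/checkout@v3" anywhere on a line.
USES_PATTERN = re.compile(r"uses:\s*([\w.-]+/[\w.-]+)@(\S+)")


def find_outdated_actions(workflow_text):
    """Return (line_number, action, pinned, latest) for each known action
    whose pinned version differs from the one in LATEST."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_PATTERN.search(line)
        if match is None:
            continue
        action, pinned = match.groups()
        latest = LATEST.get(action)
        if latest is not None and pinned != latest:
            findings.append((number, action, pinned, latest))
    return findings
```

Passing the text of a workflow file to `find_outdated_actions` returns one tuple per pin that should be bumped.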

Contributor

@bkmartinjr bkmartinjr left a comment

Could we also add a version that runs on a more modern Python and PyTorch?

Suggest Python 3.12, Torch ~2.4 and torchdata 0.8 (latest)

@bkmartinjr
Contributor

Digging through the logs for the 2.0 run, there are a few error messages, but I'm not sure any of them matter.

2024-08-27T10:13:20.6013473Z   DEPRECATION: Legacy editable install of tiledb-ml[cloud]==0.1.dev1 from file:///home/runner/work/TileDB-ML/TileDB-ML (setup.py develop) is deprecated. pip 25.0 will enforce this behaviour change. A possible replacement is to add a pyproject.toml or enable --use-pep517, and use setuptools >= 64. If the resulting installation is not behaving as expected, try using --config-settings editable_mode=compat. Please consult the setuptools documentation for more information. Discussion can be found at https://github.com/pypa/pip/issues/11457
2024-08-27T10:13:20.6017632Z   Running setup.py develop for tiledb-ml
2024-08-27T10:13:21.7079650Z ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
2024-08-27T10:13:21.7081668Z tensorflow-cpu 2.13.0 requires numpy<=1.24.3,>=1.22, but you have numpy 2.0.2 which is incompatible.

@ktsitsi
Collaborator Author

ktsitsi commented Aug 28, 2024

For the PyTorch implementation it doesn't matter, but since we fully install the tiledb-ml package inside the UDF images, I guess this will create some issues with the TensorFlow distribution. The error makes sense since pytorch>=2.0 and especially torchvision seems to require numpy>=2.0, whereas the latest TF version we support, 2.13, has an upper bound on numpy that conflicts with this. We tried to increase the supported versions for TF, but that will require some time/capacity. Interestingly, the issue you pointed out appears only for Python 3.9. I will take a further look to make sure that nothing breaks.
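
The conflict pip reports can be reproduced by hand. The sketch below is a simplified, standard-library-only illustration: `parse_release` and `satisfies` are hypothetical helpers, they compare only the numeric release part of a version, and they do not handle wildcards such as `3.*` or pre-release tags. For real checks, the `packaging` library implements the full version specifier rules.

```python
import operator
import re

_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_release(text):
    """Turn '1.24.3' into (1, 24, 3), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def satisfies(version, specifier):
    """Check a version against comma-separated clauses such as '<=1.24.3,>=1.22'."""
    have = parse_release(version)
    for clause in specifier.split(","):
        match = re.match(r"\s*(<=|>=|==|<|>)\s*(.+)", clause)
        if match is None:
            raise ValueError(f"unsupported clause {clause!r}")
        compare = _OPS[match.group(1)]
        bound = parse_release(match.group(2))
        # Pad with zeros so that (1, 22) compares like (1, 22, 0).
        width = max(len(have), len(bound))
        left = have + (0,) * (width - len(have))
        right = bound + (0,) * (width - len(bound))
        if not compare(left, right):
            return False
    return True


# Constraint and versions taken from the CI log above.
print(satisfies("2.0.2", "<=1.24.3,>=1.22"))   # False: the numpy installed in the failing run
print(satisfies("1.24.3", "<=1.24.3,>=1.22"))  # True: the highest numpy that tensorflow-cpu 2.13.0 accepts
```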

@bkmartinjr
Contributor

The error makes sense since pytorch>=2.0 and especially torchvision seems to require numpy>=2.0

If you use one release back, I think it works:

pip install -f https://download.pytorch.org/whl/torch_stable.html protobuf==3.* torch==2.3.1+cpu torchvision==0.18.1+cpu torchdata torchaudio==2.3.1 tensorflow-cpu 'numpy<=2'

This gets me numpy 1.26.4 without complaint
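
To confirm what the resolver actually picked after an install like this, the installed versions can be read back with the standard library. This is a small sketch, assuming the distribution names match those in the pip command above; `installed_versions` is a hypothetical helper, and packages that are not installed are reported as `None`.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["numpy", "torch", "torchvision", "torchdata", "torchaudio", "tensorflow-cpu"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Running `python -m pip check` in the same environment reports any remaining dependency conflicts.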

@ktsitsi ktsitsi force-pushed the kt/experiment-with-torch-2 branch 3 times, most recently from 436c669 to 7354415 August 29, 2024 08:55
@bkmartinjr
Contributor

Looking at the latest failure, the tensorflow-cpu pin is too low. In my successful install, I believe I used an unpinned tensorflow-cpu, which ended up installing 2.17.0.

@ktsitsi ktsitsi merged commit 8a1bf8d into master Aug 30, 2024
10 checks passed
@ktsitsi ktsitsi deleted the kt/experiment-with-torch-2 branch August 30, 2024 10:04