Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[requirements] Split requirements into different tiers [build_base] #36808

Merged
merged 45 commits into from
Jun 30, 2023

Conversation

krfricke
Copy link
Contributor

@krfricke krfricke commented Jun 26, 2023

Why are these changes needed?

This PR splits the requirements files into different tiers:

  • air/core-requirements.txt - these are the most important requirements (tier 1)
  • air/[library]-requirements.txt - these are pinned dependencies for libraries (tier 2+3)
  • air/[library]-test-requirements.txt - these are pinned test-dependencies for libraries (tier 4)

(tier 2 and 3 may get merged into one tier, tbd).

This does not change any packages and should not affect installed packages in docker or CI.

The naming format was chosen to be compatible with dependabot.

In a follow-up PR, we will compile the tier 1-3 dependencies to ensure all CI runners and Docker images run on the same dependency versions.

This PR will split existing requirement files. In total, there will be 4 more such files (rllib-test-requirements.txt, tune-test-requirements.txt, train-test-requirements.txt, docker/ray-docker-requirements.txt

Background

We currently ship different subdependencies in different environments. The packages installed in CI differ between jobs, and the packages in the docker images differ between python versions and image type (ray vs ray-ml).

The reason for that is in those environments only a subset of dependencies are installed. Generally the more dependencies are installed in one go, the more restrictive the subdependencies are and the older the shipped versions. This can lead to a situation where we ship packages in our docker images that are not tested on CI and can actually fail.

The plan going forward is to pip-compile the full list of dependencies into a constraints file. This constraints file is then used in subsequent jobs to install the packages. By utilizing the constraints file we can ensure that the package versions match between CI and docker containers, but we don't have to install all packages in all environments.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Kai Fricke added 8 commits June 26, 2023 10:42
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
@krfricke krfricke marked this pull request as ready for review June 26, 2023 15:07
Signed-off-by: Kai Fricke <kai@anyscale.com>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably doesn't make sense to have all these in an air directory, I wonder if we can just put these directly in requirements instead.

@pcmoritz PTAL and leave any suggestions on how to organize this.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Copy link
Collaborator

@can-anyscale can-anyscale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aslonnie
Copy link
Collaborator

I feel that this setup is going to be quite hard to maintain over time.

like if I add a dependency, or require pinning a transient dependency, how do I know / keep track which file I should update?

I think in the long run:

  • ray needs to be properly split into sub packages / libraries. we can still release everything in one, but internally, we need to have clearer module boundaries and layers that is enforced by build and test procedures.
  • based on the package structure, we need to compile a coherently resolved dependency graph for all the dependencies, for each python+platform combination.
  • and then, based on this big dependency graph, we can extract the smaller sub graphs for each module / component, and use the sub graph to build the containers or run the tests.

if we use bazel to compile the dependencies, this is basically what bazel will do. sadly, bazel does not have ver great support for multi-python+multi-platform, or at least it will take a lot of efforts.

that said, the dependency graph compilation for third-party dependencies is basically done by pip, which means we can do it in our own way if we want to.

maybe the challenging part is actually "properly splitting ray into sub components".

Signed-off-by: Kai Fricke <kai@anyscale.com>
@krfricke
Copy link
Contributor Author

I feel that this setup is going to be quite hard to maintain over time.

It's what we're doing right now, but the renaming actually enables e.g. dependabot to make updates easier in the future.

like if I add a dependency, or require pinning a transient dependency, how do I know / keep track which file I should update?

Again no real change here, usually we pin these where the core dependency is defined.

I think in the long run:

* ray needs to be properly split into sub packages / libraries. we can still release everything in one, but internally, we need to have clearer module boundaries and layers that is enforced by build and test procedures.

* based on the package structure, we need to compile a coherently resolved dependency graph for all the dependencies, for each python+platform combination.

* and then, based on this big dependency graph, we can extract the smaller sub graphs for each module / component, and use the sub graph to build the containers or run the tests.

if we use bazel to compile the dependencies, this is basically what bazel will do. sadly, bazel does not have ver great support for multi-python+multi-platform, or at least it will take a lot of efforts.

that said, the dependency graph compilation for third-party dependencies is basically done by pip, which means we can do it in our own way if we want to.

maybe the challenging part is actually "properly splitting ray into sub components".

The goal of these restructurings is not to solve the dependency issues for different ray components. It is mostly about improving the state of our docker containers. We are shipping docker containers, and we ship installed dependencies in them. Users are asking for lists of these dependencies to be able to install them in their own setups, and they expect that these dependency versions are also tested - which is currently not the case.

This renaming and the follow-up (pip-compile) will ensure consistent versions between the docker images and CI. For actual test-only dependencies (in this PR the *-test-requirements.txt files), if we can improve the situation with bazel, that would be great!

Kai Fricke added 4 commits June 27, 2023 10:27
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Kai Fricke added 6 commits June 30, 2023 09:14
# Conflicts:
#	docker/ray-ml/Dockerfile
#	python/requirements/ml/dl-cpu-requirements.txt
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
@krfricke krfricke merged commit 5632819 into ray-project:master Jun 30, 2023
@krfricke krfricke deleted the requirements/tiers branch June 30, 2023 18:30
krfricke added a commit that referenced this pull request Jul 14, 2023
…ld_base] (#36983)

Following up from #36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
krfricke added a commit to krfricke/ray that referenced this pull request Jul 18, 2023
…ld_base] (ray-project#36983)

Following up from ray-project#36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Bhav00 pushed a commit to Bhav00/ray that referenced this pull request Jul 28, 2023
…ld_base] (ray-project#36983)

Following up from ray-project#36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…ld_base] (ray-project#36983)

Following up from ray-project#36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…ay-project#36808)

This PR splits the requirements files into different tiers:

- `air/core-requirements.txt` - these are the most important requirements (tier 1)
- `air/[library]-requirements.txt` - these are pinned dependencies for libraries (tier 2+3)
- `air/[library]-test-requirements.txt` - these are pinned test-dependencies for libraries (tier 4)

(tier 2 and 3 may get merged into one tier, tbd).

This does not change any packages and should not affect installed packages in docker or CI.

The naming format was chosen to be compatible with dependabot.

In a follow-up PR, we will compile the tier 1-3 dependencies to ensure all CI runners and Docker images run on the same dependency versions.

This PR will split existing requirement files. In total, there will be 4 more such files (`rllib-test-requirements.txt`, `tune-test-requirements.txt`, `train-test-requirements.txt`, `docker/ray-docker-requirements.txt`

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…ld_base] (ray-project#36983)

Following up from ray-project#36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…ld_base] (ray-project#36983)

Following up from ray-project#36808, this PR now uses the compiled constraints file in the installation of the dependencies in CI and in the ray-ml docker images. In result, both environments will share the same versions of (sub)dependencies.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants