[core] [1/N] Validate uv options #48479

Merged: 9 commits, Nov 5, 2024
17 changes: 17 additions & 0 deletions python/ray/_private/runtime_env/BUILD
@@ -0,0 +1,17 @@
load("@rules_python//python:defs.bzl", "py_library", "py_test")

package(default_visibility = ["//visibility:public"])

py_library(
    name = "validation",
    srcs = ["validation.py"],
)

py_test(
    name = "validation_test",
    srcs = ["validation_test.py"],
    tags = ["team:core"],
    deps = [
        ":validation",
    ],
)
61 changes: 61 additions & 0 deletions python/ray/_private/runtime_env/validation.py
@@ -107,6 +107,66 @@ def parse_and_validate_conda(conda: Union[str, dict]) -> Union[str, dict]:
    return result


# TODO(hjiang): More package installation options to implement:
# 1. Allow users to pass in a local requirements.txt file, which lists all
# packages to install;
# 2. Allow a specific version of `uv` to be used; as of now we only use the
# latest version.
Collaborator

By default, should we use whatever version is currently installed?

Contributor Author @dentiny, Nov 1, 2024

There are two possibilities:

  • If uv is already installed in the env, we should use it without installing anything; this hasn't been implemented in [core] [2/N] Implement uv processor #48486, but I left a comment.
  • If no uv is found in the env, we should install the default version, as you mentioned.

If the user specifies a particular version to use, that's another story.
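
The two cases above could be sketched roughly as follows. This is only an illustration of the decision, not code from this PR or from #48486: `find_existing_uv` and `needs_uv_install` are hypothetical helper names, and the check uses nothing beyond the standard library's `shutil.which`.

```python
import shutil
from typing import Optional


def find_existing_uv(executable: str = "uv") -> Optional[str]:
    """Return the path of `executable` if it is on PATH, else None.

    Hypothetical helper for illustration; not part of Ray.
    """
    return shutil.which(executable)


def needs_uv_install(executable: str = "uv") -> bool:
    """True when no executable was found, so a default version must be installed."""
    return find_existing_uv(executable) is None
```

A user-pinned version would need a third branch (compare the installed version against the requested one), which is deliberately left out here.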

def parse_and_validate_uv(uv: Union[str, List[str], Dict]) -> Optional[Dict]:
    """Parses and validates a user-provided 'uv' option.

    The value of the input 'uv' field can be one of two cases:
    1) A List[str] describing the requirements. This is passed through.
    Example usage: "packages":["tensorflow", "requests"]
Collaborator

Suggested change
- Example usage: "packages":["tensorflow", "requests"]
+ Example usage: "uv":["tensorflow", "requests"]

Contributor Author

There are three options supported for pip in runtime env now:

  • Example: ["requests==1.0.0", "aiohttp", "ray[serve]"]
  • Example: "./requirements.txt"
  • Example: {"packages":["tensorflow", "requests"], "pip_check": False, "pip_version": "==22.0.2;python_version=='3.8.11'"}

I feel removing the key is better, to reduce confusion between the list and dict forms. Updated.
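
The three shapes listed above can be told apart by type alone. A minimal sketch, assuming a hypothetical helper `describe_pip_option` that is not part of Ray and only classifies the value without validating its contents:

```python
from typing import Dict, List, Union


def describe_pip_option(pip: Union[str, List[str], Dict]) -> str:
    """Classify which of the three accepted shapes a `pip` value has.

    Hypothetical helper for illustration only.
    """
    if isinstance(pip, str):
        # A path to a local requirements file.
        return "requirements file"
    if isinstance(pip, list) and all(isinstance(p, str) for p in pip):
        # A plain list of requirement strings.
        return "package list"
    if isinstance(pip, dict) and "packages" in pip:
        # A dict carrying `packages` plus optional settings.
        return "options dict"
    raise TypeError(f"unsupported pip option: {type(pip)}")
```

In this PR the `uv` field accepts only the list form and the dict form with a single `packages` key; the string form is rejected.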

    2) A python dictionary that has three fields:
    a) packages (required, List[str]): a list of uv packages, the same as 1).
Collaborator

What are the other two fields?

Contributor Author

Sorry, I don't understand; I'm trying to mimic the documentation for pip:

"""Parses and validates a user-provided 'pip' option.
The value of the input 'pip' field can be one of two cases:
1) A List[str] describing the requirements. This is passed through.
2) A string pointing to a local requirements file. In this case, the
file contents will be read split into a list.
3) A python dictionary that has three fields:
a) packages (required, List[str]): a list of pip packages, it same as 1).
b) pip_check (optional, bool): whether to enable pip check at the end of pip
install, default to False.
c) pip_version (optional, str): the version of pip, ray will spell
the package name 'pip' in front of the `pip_version` to form the final
requirement string, the syntax of a requirement specifier is defined in
full in PEP 508.
The returned parsed value will be a list of pip packages. If a Ray library
(e.g. "ray[serve]") is specified, it will be deleted and replaced by its
dependencies (e.g. "uvicorn", "requests").

Could you please tell me which part I am missing here?
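
For reference, the `pip_version` behavior described in the quoted docstring (the package name 'pip' is placed in front of the specifier) amounts to a simple string concatenation. A minimal sketch, where `pip_requirement` is a hypothetical name and not Ray's actual function:

```python
def pip_requirement(pip_version: str) -> str:
    """Prefix a version specifier with the package name 'pip'.

    Hypothetical helper for illustration; the result is meant to be a
    PEP 508 requirement string.
    """
    return "pip" + pip_version


print(pip_requirement("==22.0.2;python_version=='3.8.11'"))
# prints: pip==22.0.2;python_version=='3.8.11'
```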

Collaborator

I mean we should support a pip_check field as well, and possibly a uv_version field.

Contributor Author @dentiny, Nov 1, 2024

About uv_version: since the functionality is not implemented yet, I left a TODO comment at the start of the function. IMO, only implemented features should be officially documented.
I will definitely document it once the feature is implemented. Let me know if you're fine with that.

About pip_check: my concern is that uv and pip are not 100% compatible here, so in some cases it would report a failure; see https://github.com/astral-sh/uv/pull/2544/files

One concrete example: uv's pip_check warns about multiple versions of a package, while pip's version doesn't.
So I'm hesitant about whether to implement it; in any case, I left a TODO comment at the beginning, so we don't forget to


    The returned parsed value will be a dict whose 'packages' field holds the
    deduplicated list of packages, or None if that list is empty.
    """
    assert uv is not None
    if sys.platform == "win32":
        logger.warning(
            "runtime environment support is experimental on Windows. "
            "If you run into issues please file a report at "
            "https://github.com/ray-project/ray/issues."
        )

    result: Optional[Dict] = None
    if isinstance(uv, list) and all(isinstance(dep, str) for dep in uv):
        result = dict(packages=uv)
    elif isinstance(uv, dict):
        if set(uv.keys()) - {"packages"}:
            raise ValueError(
                "runtime_env['uv'] can only have these fields: "
                "packages, but got: "
                f"{list(uv.keys())}"
            )
        if "packages" not in uv:
            raise ValueError(
                f"runtime_env['uv'] must include field 'packages', but got {uv}"
            )

        result = uv.copy()
        if not isinstance(uv["packages"], list):
            raise ValueError(
                "runtime_env['uv']['packages'] must be of type list, "
                f"got: {type(uv['packages'])}"
            )
    else:
        raise TypeError(
            "runtime_env['uv'] must be of type List[str] or Dict, " f"got {type(uv)}"
        )

    # Deduplicate packages for package lists.
    result["packages"] = list(OrderedDict.fromkeys(result["packages"]))

    if len(result["packages"]) == 0:
        result = None
    logger.debug(f"Rewrote runtime_env `uv` field from {uv} to {result}.")
    return result
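
To make the accepted and rejected inputs concrete, here is a standalone sketch that mirrors the validation logic in the diff without importing Ray. `validate_uv_sketch` is a hypothetical name, the error messages are simplified, and the Windows warning and debug logging are omitted.

```python
from collections import OrderedDict
from typing import Dict, List, Optional, Union


def validate_uv_sketch(uv: Union[List[str], Dict]) -> Optional[Dict]:
    """Standalone approximation of the `uv` validation shown in the diff."""
    if isinstance(uv, list) and all(isinstance(dep, str) for dep in uv):
        result = {"packages": uv}
    elif isinstance(uv, dict):
        unknown = set(uv) - {"packages"}
        if unknown:
            raise ValueError(f"unsupported fields: {sorted(unknown)}")
        if "packages" not in uv:
            raise ValueError("field 'packages' is required")
        if not isinstance(uv["packages"], list):
            raise ValueError("'packages' must be a list")
        result = dict(uv)
    else:
        raise TypeError(f"expected a list of str or a dict, got {type(uv)}")

    # Drop duplicates while keeping first-seen order.
    result["packages"] = list(OrderedDict.fromkeys(result["packages"]))
    # An empty package list means there is nothing to install.
    return result if result["packages"] else None


print(validate_uv_sketch(["requests", "requests", "aiohttp"]))
# prints: {'packages': ['requests', 'aiohttp']}
print(validate_uv_sketch([]))
# prints: None
```

`OrderedDict.fromkeys` is used for deduplication because it preserves the order in which packages were first listed, which a plain `set` would not.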


def parse_and_validate_pip(pip: Union[str, List[str], Dict]) -> Optional[Dict]:
"""Parses and validates a user-provided 'pip' option.

@@ -280,6 +340,7 @@ def parse_and_validate_env_vars(env_vars: Dict[str, str]) -> Optional[Dict[str,
    "excludes": parse_and_validate_excludes,
    "conda": parse_and_validate_conda,
    "pip": parse_and_validate_pip,
    "uv": parse_and_validate_uv,
    "env_vars": parse_and_validate_env_vars,
    "container": parse_and_validate_container,
}
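
The hunk above registers the new validator in a mapping from option name to validation function. The general pattern can be sketched as below; the names `VALIDATORS`, `_validate_packages`, and `validate_runtime_env` are hypothetical stand-ins, and how Ray actually consumes its mapping is not shown in this diff.

```python
from typing import Any, Callable, Dict, List


def _validate_packages(value: Any) -> Dict[str, List[str]]:
    # Stand-in validator: accept only a list of strings.
    if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
        raise TypeError("expected a list of str")
    return {"packages": value}


# Field name -> validator, in the spirit of the mapping in the diff.
VALIDATORS: Dict[str, Callable[[Any], Any]] = {
    "pip": _validate_packages,
    "uv": _validate_packages,
}


def validate_runtime_env(runtime_env: Dict[str, Any]) -> Dict[str, Any]:
    """Run the matching validator on each known field; pass others through."""
    validated = {}
    for field, value in runtime_env.items():
        validator = VALIDATORS.get(field)
        validated[field] = validator(value) if validator else value
    return validated
```

With a table like this, supporting a new option is a one-line registration, which is what the added `"uv"` entry does.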
36 changes: 36 additions & 0 deletions python/ray/_private/runtime_env/validation_test.py
@@ -0,0 +1,36 @@
from python.ray._private.runtime_env import validation

Collaborator

We already have test_runtime_env_validation.py, could you combine them?

Contributor Author @dentiny, Nov 2, 2024

No plan at the moment, but I could put them in the same folder if you want.

Let me explain my thinking:

  • They serve different purposes and shouldn't be placed in the same test suite. At the moment we only have integration tests for Python features, which rely on the latest version of Ray and in principle test Ray as a whole; what I'm writing is a unit test, whose target is a single function or class.
  • They have different runtimes and configs. Integration tests generally require external access, while unit tests must be hermetic and self-contained (bazel test runs tests in a sandbox to prevent external access). Their runtimes usually differ as well: unit tests are small and quick (bazel by default limits a test to 300 sec).

Collaborator

If you check TestValidatePip, it's a unit test, not an integration test.

Contributor Author @dentiny, Nov 2, 2024

Yes, but the test suite and test file as a whole are an integration test, in which we import the latest version of the whole of Ray.

If you're fine with it, may I move all unit tests into a separate unit test file? I guess what you care about is that related tests live in one place?

Collaborator

Yes. Having all unit tests in one place.

Contributor Author @dentiny, Nov 2, 2024

Sounds good! I moved a few test cases into my unit test file, and also moved it under the tests folder for aggregation.
A TODO is left for the conda-related tests due to their dependency issue.

import unittest


class TestValidation(unittest.TestCase):
    def test_parse_and_validate_uv(self):
        # Valid case w/o duplication.
        result = validation.parse_and_validate_uv({"packages": ["tensorflow"]})
        self.assertEqual(result, {"packages": ["tensorflow"]})

        # Valid case w/ duplication.
        result = validation.parse_and_validate_uv(
            {"packages": ["tensorflow", "tensorflow"]}
        )
        self.assertEqual(result, {"packages": ["tensorflow"]})

        # Valid case, use `list` to represent necessary packages.
        result = validation.parse_and_validate_uv(
            ["requests==1.0.0", "aiohttp", "ray[serve]"]
        )
        self.assertEqual(
            result, {"packages": ["requests==1.0.0", "aiohttp", "ray[serve]"]}
        )

        # Invalid case, `str` is not supported for now.
        with self.assertRaises(TypeError):
            validation.parse_and_validate_uv("./requirements.txt")

        # Invalid case, unsupported keys.
        with self.assertRaises(ValueError):
            validation.parse_and_validate_uv({"random_key": "random_value"})


if __name__ == "__main__":
    unittest.main()