[core] [2/N] Implement uv processor #48486

Merged: 27 commits, Nov 6, 2024
Changes from 4 commits
44 changes: 22 additions & 22 deletions .buildkite/data.rayci.yml
@@ -29,28 +29,28 @@ steps:

# tests
- label: ":database: data: arrow v9 tests"
tags:
Contributor Author:
The linter makes all these changes; otherwise I cannot push.

Collaborator:

Could you split the unrelated lint changes into another PR?

Contributor Author:

That's a good suggestion, and I thought about it too. My confusion is that when I tried to push to the master branch, the pre-push hook didn't seem to work :(

Collaborator:

You can just check out the changes to these files.

I think the hook is configured to fix changed files only.

The people who added the pre-commit hooks should have fixed all the lint errors across the entire repo first, and then run the hooks on CI with the -a flag.
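(For reference: assuming the repo uses the standard pre-commit CLI, `-a` is the short form of `--all-files`, as in `pre-commit run --all-files`, which applies every configured hook to the whole repository rather than only to changed files.)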

- python
- data
instance_type: medium
parallelism: 2
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
--workers "$${BUILDKITE_PARALLEL_JOB_COUNT}"
--worker-id "$${BUILDKITE_PARALLEL_JOB}" --parallelism-per-worker 3
--build-name data9build
--except-tags data_integration,doctest
depends_on: data9build
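# Note (an assumption about rayci/Buildkite semantics): the doubled dollar in
# "$${BUILDKITE_PARALLEL_JOB_COUNT}" escapes interpolation at pipeline-upload
# time, so the variable is resolved by the agent when the job actually runs.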

- label: ":database: data: arrow v17 tests"
tags:
- python
- data
instance_type: medium
parallelism: 2
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
--workers "$${BUILDKITE_PARALLEL_JOB_COUNT}"
--worker-id "$${BUILDKITE_PARALLEL_JOB}" --parallelism-per-worker 3
--build-name datalbuild
--except-tags data_integration,doctest
@@ -59,7 +59,7 @@ steps:
- label: ":database: data: arrow v17 {{matrix.python}} tests ({{matrix.worker_id}})"
key: datal_python_tests
if: build.pull_request.labels includes "continuous-build" || pipeline.id == "0189e759-8c96-4302-b6b5-b4274406bf89" || pipeline.id == "018f4f1e-1b73-4906-9802-92422e3badaa"
tags:
- python
- data
instance_type: medium
@@ -76,21 +76,21 @@

- label: ":database: data: arrow nightly tests"
if: pipeline.id == "0189e759-8c96-4302-b6b5-b4274406bf89" || pipeline.id == "018f4f1e-1b73-4906-9802-92422e3badaa"
tags:
- python
- data
instance_type: medium
parallelism: 2
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //python/ray/air/... data
--workers "$${BUILDKITE_PARALLEL_JOB_COUNT}"
--worker-id "$${BUILDKITE_PARALLEL_JOB}" --parallelism-per-worker 3
--build-name datanbuild
--except-tags data_integration,doctest
depends_on: datanbuild

- label: ":database: data: TFRecords (tfx-bsl) tests"
tags:
- python
- data
instance_type: medium
@@ -102,39 +102,39 @@ steps:
depends_on: datatfxbslbuild

- label: ":database: data: doc tests"
tags:
- data
- doc
instance_type: medium
commands:
# doc tests
- bazel run //ci/ray_ci:test_in_docker -- python/ray/... //doc/... data
--build-name datalbuild
--except-tags gpu
--only-tags doctest
--parallelism-per-worker 2
# doc examples
- bazel run //ci/ray_ci:test_in_docker -- //doc/... data
--build-name datalbuild
--except-tags gpu,post_wheel_build,doctest
--parallelism-per-worker 2
--skip-ray-installation
depends_on: datalbuild

- label: ":database: data: doc gpu tests"
tags:
- data
- doc
- gpu
instance_type: gpu-large
commands:
# doc tests
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... //doc/... data
--build-name docgpubuild
--only-tags doctest
--except-tags cpu
# doc examples
- bazel run //ci/ray_ci:test_in_docker -- //doc/... data
--build-name docgpubuild
--except-tags doctest
--only-tags gpu
@@ -147,7 +147,7 @@
- data
instance_type: medium
commands:
- bazel run //ci/ray_ci:test_in_docker -- //python/ray/data/... data
--build-name datamongobuild
--build-type java
--only-tags data_integration
@@ -168,29 +168,29 @@ steps:

- label: ":database: data: flaky tests"
key: data_flaky_tests
tags:
- python
- data
- skip-on-premerge
instance_type: medium
soft_fail: true
commands:
- bazel run //ci/ray_ci:test_in_docker -- //... data --run-flaky-tests
--parallelism-per-worker 3
--build-name datalbuild
--except-tags gpu_only,gpu
depends_on: datalbuild

- label: ":database: data: flaky gpu tests"
key: data_flaky_gpu_tests
tags:
- python
- data
- skip-on-premerge
instance_type: gpu-large
soft_fail: true
commands:
- bazel run //ci/ray_ci:test_in_docker -- //... data --run-flaky-tests
--build-name docgpubuild
--only-tags gpu,gpu_only
depends_on: docgpubuild
6 changes: 3 additions & 3 deletions doc/source/ray-observability/ray-distributed-debugger.rst
@@ -55,7 +55,7 @@ Create a file `job.py` with the following snippet. Add `breakpoint()` in the Ray
.. literalinclude:: ./doc_code/ray-distributed-debugger.py
:language: python

Run your Ray app
~~~~~~~~~~~~~~~~

Start running your Ray app.
@@ -98,7 +98,7 @@ Run a Ray task raised exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run the same `job.py` file with an additional argument to raise an exception.

.. code-block:: bash

python job.py raise-exception
@@ -112,7 +112,7 @@ When the app throws an exception:
- The debugger freezes the task.
- The terminal clearly indicates when the debugger pauses a task and waits for the debugger to attach.
- The paused task is listed in the Ray Debugger extension.
- Click the play icon next to the name of the paused task to attach the debugger and start debugging.

.. image:: ./images/post-moretem.gif
:align: center
42 changes: 41 additions & 1 deletion python/ray/_private/runtime_env/BUILD
@@ -1,8 +1,48 @@
load("@rules_python//python:defs.bzl", "py_library", "py_test")

package(default_visibility = ["//visibility:public"])

py_library(
    name = "validation",
    srcs = ["validation.py"],
)

py_library(
    name = "utils",
    srcs = ["utils.py"],
)

py_library(
    name = "path_utils",
    srcs = ["path_utils.py"],
    deps = [
        ":env_utils",
    ],
)

py_library(
    name = "env_utils",
    srcs = ["env_utils.py"],
    deps = [
        ":utils",
    ],
)

py_library(
    name = "dependency_utils",
    srcs = ["dependency_utils.py"],
    deps = [
        ":utils",
    ],
)

py_library(
    name = "uv",
    srcs = ["uv.py"],
    deps = [
        ":utils",
        ":env_utils",
        ":path_utils",
        ":dependency_utils",
    ],
)
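# The updated load line also imports py_test; a hypothetical test target
# consuming these libraries might look like this (an illustration, not part
# of this diff):
#
# py_test(
#     name = "uv_test",
#     size = "small",
#     srcs = ["uv_test.py"],
#     deps = [":uv"],
# )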
73 changes: 73 additions & 0 deletions python/ray/_private/runtime_env/dependency_utils.py
@@ -0,0 +1,73 @@
"""Util functions to manage dependency requirements."""

from typing import List, Tuple
import os
import tempfile
import logging
from contextlib import asynccontextmanager
from ray._private.runtime_env import env_utils
from ray._private.runtime_env.utils import check_output_cmd


def dump_requirements_txt(requirements_file: str, pip_packages: List[str]):
    """Dump [pip_packages] to the given [requirements_file] for later env setup."""
    with open(requirements_file, "w") as file:
        for line in pip_packages:
            file.write(line + "\n")
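# A minimal usage sketch (the path and package pins are hypothetical):
#
#   dump_requirements_txt(
#       "/tmp/runtime_env/requirements.txt",
#       ["requests==2.32.3", "numpy"],
#   )
#
# writes one requirement specifier per line, ready for a later
# `pip install -r` (or `uv pip install -r`) step.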


@asynccontextmanager
async def check_ray(python: str, cwd: str, logger: logging.Logger):
"""A context manager to check ray is not overwritten.

Currently, we only check ray version and path. It works for virtualenv,
- ray is in Python's site-packages.
- ray is overwritten during yield.
- ray is in virtualenv's site-packages.
"""

async def _get_ray_version_and_path() -> Tuple[str, str]:
with tempfile.TemporaryDirectory(
prefix="check_ray_version_tempfile"
) as tmp_dir:
ray_version_path = os.path.join(tmp_dir, "ray_version.txt")
check_ray_cmd = [
python,
"-c",
"""
import ray
with open(r"{ray_version_path}", "wt") as f:
f.write(ray.__version__)
f.write(" ")
f.write(ray.__path__[0])
""".format(
ray_version_path=ray_version_path
),
]
if env_utils._WIN32:
env = os.environ.copy()
else:
env = {}
output = await check_output_cmd(
check_ray_cmd, logger=logger, cwd=cwd, env=env
)
logger.info(f"try to write ray version information in: {ray_version_path}")
with open(ray_version_path, "rt") as f:
output = f.read()
# print after import ray may have  endings, so we strip them by *_
ray_version, ray_path, *_ = [s.strip() for s in output.split()]
return ray_version, ray_path

version, path = await _get_ray_version_and_path()
yield
actual_version, actual_path = await _get_ray_version_and_path()
if actual_version != version or actual_path != path:
raise RuntimeError(
"Changing the ray version is not allowed: \n"
f" current version: {actual_version}, "
f"current path: {actual_path}\n"
f" expect version: {version}, "
f"expect path: {path}\n"
"Please ensure the dependencies in the runtime_env pip field "
"do not install a different version of Ray."
)
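# Expected usage, as a sketch (the surrounding installer coroutine and the
# `requirements_file` variable are assumptions for illustration):
#
#   async with check_ray(python, cwd, logger):
#       await check_output_cmd(
#           [python, "-m", "pip", "install", "-r", requirements_file],
#           logger=logger, cwd=cwd, env={},
#       )
#   # If the install replaced Ray (different version or path), check_ray
#   # raises RuntimeError when the block exits.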
85 changes: 85 additions & 0 deletions python/ray/_private/runtime_env/env_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
"""Utils to detect runtime environment."""

import sys
from ray._private.runtime_env.utils import check_output_cmd
import logging
import os

_WIN32 = os.name == "nt"


def is_in_virtualenv() -> bool:
    # virtualenv <= 16.7.9 sets the real_prefix,
    # virtualenv > 16.7.9 & venv set the base_prefix.
    # So, we check both of them here.
    # https://github.com/pypa/virtualenv/issues/1622#issuecomment-586186094
    return hasattr(sys, "real_prefix") or (
        hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix
    )
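# Illustrative values (paths are hypothetical):
#   inside an env created by `python -m venv .venv`:
#       sys.prefix == "/repo/.venv", sys.base_prefix == "/usr"  -> True
#   under the system interpreter:
#       sys.prefix == sys.base_prefix                           -> False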


async def create_or_get_virtualenv(path: str, cwd: str, logger: logging.Logger):
"""Create or get a virtualenv from path."""
python = sys.executable
virtualenv_path = os.path.join(path, "virtualenv")
virtualenv_app_data_path = os.path.join(path, "virtualenv_app_data")

if _WIN32:
current_python_dir = sys.prefix
env = os.environ.copy()
else:
current_python_dir = os.path.abspath(
os.path.join(os.path.dirname(python), "..")
)
env = {}

if is_in_virtualenv():
# virtualenv-clone homepage:
# https://github.com/edwardgeorge/virtualenv-clone
# virtualenv-clone Usage:
# virtualenv-clone /path/to/existing/venv /path/to/cloned/ven
# or
# python -m clonevirtualenv /path/to/existing/venv /path/to/cloned/ven
clonevirtualenv = os.path.join(os.path.dirname(__file__), "_clonevirtualenv.py")
create_venv_cmd = [
python,
clonevirtualenv,
current_python_dir,
virtualenv_path,
]
logger.info("Cloning virtualenv %s to %s", current_python_dir, virtualenv_path)
else:
# virtualenv options:
# https://virtualenv.pypa.io/en/latest/cli_interface.html
#
# --app-data
# --reset-app-data
# Set an empty seperated app data folder for current virtualenv.
#
# --no-periodic-update
# Disable the periodic (once every 14 days) update of the embedded
# wheels.
#
# --system-site-packages
# Inherit site packages.
#
# --no-download
# Never download the latest pip/setuptools/wheel from PyPI.
create_venv_cmd = [
python,
"-m",
"virtualenv",
"--app-data",
virtualenv_app_data_path,
"--reset-app-data",
"--no-periodic-update",
"--system-site-packages",
"--no-download",
virtualenv_path,
]
logger.info(
"Creating virtualenv at %s, current python dir %s",
virtualenv_path,
virtualenv_path,
)
await check_output_cmd(create_venv_cmd, logger=logger, cwd=cwd, env=env)
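# A minimal usage sketch (the target path and logger are assumptions):
#
#   logger = logging.getLogger(__name__)
#   await create_or_get_virtualenv(
#       "/tmp/ray/runtime_env/abc123", cwd="/tmp", logger=logger
#   )
#
# This creates an isolated env at /tmp/ray/runtime_env/abc123/virtualenv that
# inherits system site-packages, or clones the current virtualenv when the
# worker process itself already runs inside one.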