-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[build_base] [Tune] Add more comprehensive support for remote upload_dir
w/ endpoint and params
#32479
Merged
krfricke
merged 18 commits into
ray-project:master
from
justinvyu:tune/remote_checkpoint_dir_with_endpoint
Feb 15, 2023
Merged
[build_base] [Tune] Add more comprehensive support for remote upload_dir
w/ endpoint and params
#32479
krfricke
merged 18 commits into
ray-project:master
from
justinvyu:tune/remote_checkpoint_dir_with_endpoint
Feb 15, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
…Server Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
…/remote_checkpoint_dir_with_endpoint
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
…/remote_checkpoint_dir_with_endpoint
Yard1
approved these changes
Feb 13, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice!
justinvyu
changed the title
[Tune] Add more comprehensive support for remote
[build_base] [Tune] Add more comprehensive support for remote Feb 13, 2023
upload_dir
w/ endpoint and paramsupload_dir
w/ endpoint and params
…/remote_checkpoint_dir_with_endpoint
Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
…/remote_checkpoint_dir_with_endpoint
krfricke
approved these changes
Feb 14, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Let's run cloud tests
justinvyu
added a commit
to justinvyu/ray
that referenced
this pull request
Feb 15, 2023
… `upload_dir` w/ endpoint and params (ray-project#32479)" This reverts commit 2f2080f.
justinvyu
added a commit
to justinvyu/ray
that referenced
this pull request
Feb 15, 2023
…r remote `upload_dir` w/ endpoint and params (ray-project#32479)" (ray-project#32571)" This reverts commit 350fb13.
krfricke
pushed a commit
that referenced
this pull request
Feb 16, 2023
…_dir w/ url params (#32479)" (#32576) This reverts the revert made here #32571. This PR fixes two suites of tests that were failing: 1. workflows tests were failing due to the pinned `moto` version being too old. 2. `test_syncer` errors were caused by reusing the same bucket name between tests. This PR also fixes the `restore` path for URIs with params, which was missing from the previous PR. The e2e sync test has been updated to also test restoration. Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
edoakes
pushed a commit
to edoakes/ray
that referenced
this pull request
Mar 22, 2023
…_dir` w/ endpoint and params (ray-project#32479) Currently, URI handling with parameters is done in multiple places in different ways (using `urllib.parse` or splitting by `'?'` directly). In some places, it's not done at all, which **causes errors when performing cloud checkpointing.** In particular, `Trial.remote_checkpoint_dir` and `Trainable._storage_path` do not handle URI path appends correctly when URL params are present. Signed-off-by: Justin Yu <justinvyu@berkeley.edu> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
edoakes
pushed a commit
to edoakes/ray
that referenced
this pull request
Mar 22, 2023
… `upload_dir` w/ endpoint and params (ray-project#32479)" (ray-project#32571) This reverts commit 2f2080f. Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
edoakes
pushed a commit
to edoakes/ray
that referenced
this pull request
Mar 22, 2023
…_dir w/ url params (ray-project#32479)" (ray-project#32576) This reverts the revert made here ray-project#32571. This PR fixes two suites of tests that were failing: 1. workflows tests were failing due to the pinned `moto` version being too old. 2. `test_syncer` errors were caused by reusing the same bucket name between tests. This PR also fixes the `restore` path for URIs with params, which was missing from the previous PR. The e2e sync test has been updated to also test restoration. Signed-off-by: Justin Yu <justinvyu@berkeley.edu> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
peytondmurray
pushed a commit
to peytondmurray/ray
that referenced
this pull request
Mar 22, 2023
… `upload_dir` w/ endpoint and params (ray-project#32479)" (ray-project#32571) This reverts commit 2f2080f.
peytondmurray
pushed a commit
to peytondmurray/ray
that referenced
this pull request
Mar 22, 2023
…_dir w/ url params (ray-project#32479)" (ray-project#32576) This reverts the revert made here ray-project#32571. This PR fixes two suites of tests that were failing: 1. workflows tests were failing due to the pinned `moto` version being too old. 2. `test_syncer` errors were caused by reusing the same bucket name between tests. This PR also fixes the `restore` path for URIs with params, which was missing from the previous PR. The e2e sync test has been updated to also test restoration. Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
elliottower
pushed a commit
to elliottower/ray
that referenced
this pull request
Apr 22, 2023
…_dir` w/ endpoint and params (ray-project#32479) Currently, URI handling with parameters is done in multiple places in different ways (using `urllib.parse` or splitting by `'?'` directly). In some places, it's not done at all, which **causes errors when performing cloud checkpointing.** In particular, `Trial.remote_checkpoint_dir` and `Trainable._storage_path` do not handle URI path appends correctly when URL params are present. Signed-off-by: Justin Yu <justinvyu@berkeley.edu> Signed-off-by: elliottower <elliot@elliottower.com>
elliottower
pushed a commit
to elliottower/ray
that referenced
this pull request
Apr 22, 2023
… `upload_dir` w/ endpoint and params (ray-project#32479)" (ray-project#32571) This reverts commit 2f2080f. Signed-off-by: elliottower <elliot@elliottower.com>
elliottower
pushed a commit
to elliottower/ray
that referenced
this pull request
Apr 22, 2023
…_dir w/ url params (ray-project#32479)" (ray-project#32576) This reverts the revert made here ray-project#32571. This PR fixes two suites of tests that were failing: 1. workflows tests were failing due to the pinned `moto` version being too old. 2. `test_syncer` errors were caused by reusing the same bucket name between tests. This PR also fixes the `restore` path for URIs with params, which was missing from the previous PR. The e2e sync test has been updated to also test restoration. Signed-off-by: Justin Yu <justinvyu@berkeley.edu> Signed-off-by: elliottower <elliot@elliottower.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Why are these changes needed?
Problem
Currently, URI handling with parameters is done in multiple places in different ways (using
urllib.parse
or splitting by'?'
directly). In some places, it's not done at all, which causes errors when performing cloud checkpointing. In particular,Trial.remote_checkpoint_dir
andTrainable._storage_path
do not handle URI path appends correctly when URL params are present.Example:
Solution
This PR introduces a
URI
utility class that appends/gets the parent/name of a URI. All of the URI handling now goes through this utility, rather than doingos.path.join
/manually parsing the string.This PR also improves the moto mock s3
simulate_storage
helper, and introduces a fixture intest_syncer
and example usage in an end to end Tune run that saves to the cloud.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.