In memory blob object #78
…ronments). ObjectStore now holds a Ray actor which wraps a kv dict, which allows any ray process to access the kv dict. 2) Fix bug in cluster factory 3) Allow HTTP server to be started inside a conda env (needs more testing) 4) Remove deprecated pkg_resources usage
…bclass. .write() is no longer needed to save down, and we now handle serialization during file blob's .fetch and .write. 2) Use in-memory blob in object store tests instead of pinning. Obj store tests pass. 3) Start refactoring blob_tests. First few pass. 4) Only start ray in obj store if it's installed successfully (need to do this elsewhere too) 5)
2) Move default name generation into new util _generate_default_name function. Need to migrate folder and table to use this. 3) Organize utils a bit more.
…e tests pass. Next is making obj_store a dict again and unfucking run_module_utils.
runhouse/rns/blobs/blob.py
Outdated
if self.system.on_this_cluster():
    obj_store.delete(self.name)
else:
    self.system.delete(self.name)
Is this supposed to delete the blob from the cluster's local file system? We don't have a delete
method for the cluster which does this (I think previously this went through the blob's folder).
How can we do this now that a blob doesn't live in a folder object?
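To make the question concrete, here is a minimal sketch of the dispatch in the snippet above. `FakeObjStore`, `FakeCluster`, and `delete_blob` are hypothetical stand-ins written for illustration, not Runhouse classes; only the `on_this_cluster()` / `delete(name)` call shape is taken from the diff. As written, both branches only drop a key from a key-value store, so nothing here touches the file system, which is the gap the comment is asking about.

```python
class FakeObjStore:
    """Hypothetical stand-in for the cluster-side object store (a plain dict)."""

    def __init__(self):
        self._kv = {}

    def put(self, key, value):
        self._kv[key] = value

    def delete(self, key):
        # Removing a missing key is treated as a no-op.
        self._kv.pop(key, None)

    def keys(self):
        return list(self._kv)


class FakeCluster:
    """Hypothetical stand-in for a cluster handle; `is_local` fakes on_this_cluster()."""

    def __init__(self, store, is_local):
        self.store = store
        self._is_local = is_local
        self.remote_deletes = []  # records keys deleted through the "remote" path

    def on_this_cluster(self):
        return self._is_local

    def delete(self, key):
        # A real cluster would issue an RPC here; this sketch deletes directly.
        self.remote_deletes.append(key)
        self.store.delete(key)


def delete_blob(name, system, obj_store):
    """Same branching as the diff: local store if on the cluster, else via the cluster."""
    if system.on_this_cluster():
        obj_store.delete(name)
    else:
        system.delete(name)
```

A file-backed blob would presumably need an extra step (for example, removing its path) in addition to this key removal; that part is left out because the PR does not show how it would work.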
…out in the Run context manager
# Conflicts: # README.md
# Conflicts: # runhouse/rns/run_module_utils.py # runhouse/servers/http/http_server.py
…ethod on a blob in the blob store. Obj_store tests all pass.
…tall" method. All non-conda env tests pass and basic function test passes.
# Conflicts: # runhouse/rns/obj_store.py
…ts all pass, but streaming logs doesn't work because logs are written to stdout after function completes.
…y object store, as well as a ray-based dict actor for x-env key lookup. - Modify function `.to` to put function on the cluster, and modify __call__ path to call function on cluster via call_module_method. - Fix dryrun bugs with putting function on cluster. - Add simple numpy-based test to evaluate whether pinning keeps object in python memory. Test passes. - Add simple env install caching - Update Runs to use new blob behavior. - Add test for stateful_generator to emulate LLM, not yet working.
# Conflicts: # runhouse/rns/envs/env.py # runhouse/rns/function.py # runhouse/rns/hardware/cluster_factory.py
- Add streaming results for generator function. - Introduce Queue and KVStore resources - Add `load` option for blob not to check RNS for key Test stream_logs, test_pinning_in_memory, test_put_resource, test_stateful_generator and test_function.test_generator all pass, but streaming logs doesn't work properly for generators.
# Conflicts: # runhouse/rns/defaults.py # runhouse/rns/top_level_rns_fns.py
…t rh.Modules. KVStore and Queue need more testing. Also, right now we're not sending state over when we send the resources to the cluster. - Move logic for local vs. remote execution mostly within Module and Cluster. - Changed function not to rely on run_module utils for __call__, remote, and run, and so far so good. Most obj_store tests work (ones that rely on .run do not). Logging is broken though. - remote looks like it works but needs more testing. - Added `provenance` field to Resource for holding Run info. - Allow obj_store to support `put` across servlets
…ass) and MyClass(rh.Module). All module tests pass with streaming, both local=True and local=False, and property fetching. - Support passing state when putting resources on a cluster. - Support fetching properties (private and public) and complete Module through .fetch method. Fix support for private methods in __getattribute__. - Make sure working_dir is synced for locally defined Modules. - Introduce `remote_init` for easier specification of remote setup and saving a hop.
…arios, and add tests to ensure streaming logs works. All module tests pass, and most function and obj_store tests pass (other than .remote and .map related tests).
- Change .run behavior to be async but return a run string. - Update Blob to be a Module. - Allow cluster.get to return a remote. - Stop calling mkdir within Folder constructor, call within put instead. - Introduce `rh.here` as a way to get current cluster. All obj_store tests pass except cancelling. All module tests pass. Most function tests pass except other function types (map, queue, etc.), cancelling, and http url. All cluster tests pass.
# Conflicts: # runhouse/rns/envs/env.py # tests/conftest.py # tests/test_blob.py # tests/test_env.py # tests/test_function.py # tests/test_obj_store.py
- Get rid of `install` dedicated rpc, and make cluster.install_packages flow through `env.to` - Add support for returning a queue when calling a generator with .remote to stream back results. Module and cluster tests pass.
# Conflicts: # runhouse/__init__.py
Merging this into main for now, but still more cleanup to do.
Servlets / Object Store RPCs
- `rh.here` interface
Modules
- `obj.fetch.property`
- `method` if we detect that `method` is a coroutine (via @isaacrob)
- `fn.save(new_name)` needs to rename the resource in the cluster's kvstore.
- `.remote` return an actual remote Module, and `.run` return the runkey async
- `.remote` for stream to return a Queue while results are still being generated so user can `.get` each new result
- `rh.Queue` to support streaming across Ray boundary
- `.get` without popping value (subsequent PR)
- `rh.KVStore` to support Actor or non-actor KVs
- `funhouse`
Provenance
- `provenance` property inside Resources
Cluster
- `system.run` and RPCs
- `rh.here` to return Cluster as primary way for users to interact with Object Store
- `rh.here` is available in a python interpreter on the cluster
- `cluster.call` to call module methods in a python interpreter (e.g. for debugging)
- `cluster.contents` to list keys and obj types
- `.remote` (which will contain a new cluster object)
Cleanup
- `funhouse` with new APIs
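
The item about `.remote` returning a Queue while results are still being generated can be illustrated with a small standard-library sketch. It is an analogy only: it uses `queue.Queue` and a thread in place of `rh.Queue` and a remote call, and the names `stream_to_queue`, `drain`, and `count_up` are hypothetical, not part of Runhouse.

```python
import queue
import threading

# Sentinel marking the end of the stream (hypothetical convention for this sketch).
_DONE = object()


def stream_to_queue(gen_fn, *args, maxsize=0):
    """Run a generator in a background thread and return a queue of its results.

    The queue is returned immediately, so the caller can start consuming
    items while the generator is still producing them.
    """
    q = queue.Queue(maxsize=maxsize)

    def _produce():
        try:
            for item in gen_fn(*args):
                q.put(item)
        finally:
            # Always signal completion, even if the generator raised.
            q.put(_DONE)

    threading.Thread(target=_produce, daemon=True).start()
    return q


def drain(q):
    """Yield items from the queue until the end-of-stream sentinel arrives."""
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item


def count_up(n):
    """Example generator standing in for a streaming function."""
    for i in range(n):
        yield i * i
```

Calling `stream_to_queue(count_up, 4)` hands back the queue right away, and `drain` then pulls each result as it becomes available. Note that `get` here pops the value, which is the behavior the `.get` without popping item proposes to change in a later PR.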