Releases: ray-project/ray

Ray-2.35.0

28 Aug 00:11
c5d536d

Notice: Starting from this release, pip install ray[all] no longer includes ray[cpp] and does not install the ray-cpp package. To install everything including ray-cpp, use pip install ray[cpp-all] instead.
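
Because the extras changed, it can be useful to check which distributions actually ended up in an environment after an upgrade. The following is a minimal sketch using only the standard library; the helper name is made up for illustration, and the only names taken from the notice are ray and ray-cpp.

```python
from importlib import metadata


def is_distribution_installed(name: str) -> bool:
    """Return True if a pip-installed distribution with this name is present."""
    try:
        metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return False
    return True


def report(names=("ray", "ray-cpp")) -> dict:
    # "ray-cpp" is the package name mentioned in the notice; after
    # `pip install ray[all]` on this release it is expected to be missing
    # unless `ray[cpp-all]` (or `ray[cpp]`) was installed explicitly.
    return {name: is_distribution_installed(name) for name in names}
```

Calling report() returns a mapping such as {"ray": ..., "ray-cpp": ...}; the values depend on the environment it runs in.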

Ray Libraries

Ray Data

🎉 New Features:

  • Upgrade supported Arrow version from 16 to 17 (#47034)
  • Add support for reading from Iceberg (#46889)

💫 Enhancements:

  • Various Progress Bar UX improvements (#46816, #46801, #46826, #46692, #46699, #46974, #46928, #47029, #46924, #47120, #47095, #47106)
  • Try to get size_bytes from metadata and consolidate metadata methods (#46862)
  • Improve warning message when read task is large (#46942)
  • Extend API to enable passing sample weights via ray.data.Dataset.to_tf (#45701)
  • Add a parameter to allow overriding LanceDB scanner options (#46975)
  • Add failure retry logic for read_lance (#46976)
  • Clarify warning for reading old Parquet data (#47049)
  • Move datasource implementations to _internal subpackage (#46825)
  • Handle logs from tensor extensions (#46943)

🔨 Fixes:

  • Change type of DataContext.retried_io_errors from tuple to list (#46884)
  • Make Parquet tests more robust and expose Parquet logic (#46944)
  • Change pickling log level from warning to debug (#47032)
  • Add validation for shuffle arg (#47055)
  • Fix validation bug when size=0 in ActorPoolStrategy (#47072)
  • Fix exception in async map (#47110)
  • Fix wrong metrics group for Object Store Memory metrics on Ray Data Dashboard (#47170)
  • Handle errors in SplitCoordinator when generating a new epoch (#47176)

📖 Documentation:

  • Auto-gen GroupedData api (#46925)
  • Fix signature of Rule.plan (#47094)

Ray Train

💫 Enhancements:

  • [train] Updates to support xgboost==2.1.0 (#46667)
  • [train] Add hardware stats (#46719)

Ray Tune

🔨 Fixes:

  • [RLlib; Tune] Fix WandB metric overlap after restore from checkpoint. (#46897)

Ray Serve

💫 Enhancements:

  • Improved handling of replica death and replica unavailability in deployment handle routers before controller restarts replica (#47008)
  • Eagerly create routers in proxy for better GCS fault tolerance (#47031)
  • Immediately send ping in router when receiving new replica set (#47053)

🏗 Architecture refactoring:

  • Deprecate passing arguments that contain DeploymentResponses in nested objects to downstream deployment handle calls (#46806)

RLlib

💫 Enhancements:

  • Add ObservationPreprocessor (ConnectorV2). (#47077)

📖 Documentation:

  • Example scripts for new API stack:
    • Curiosity (inverse dynamics model-based) RLModule example. (#46841)
    • Add example script for Env with protobuf observation space. (#47071)
  • New API stack documentation:
    • Cleanup old API stack docs (rllib-dev.rst). (#47172)
    • Episodes (SingleAgentEpisode). (#46985)
    • Redo rllib-algorithms.rst page. (#46916)

🏗 Architecture refactoring:

  • Rename MultiAgent...RLModule... into MultiRL...Module for more generality. (#46840)
  • Add learner_only flag to RLModuleConfig/Spec and simplify creation of RLModule specs from algo-config. (#46900)

Ray Core

💫 Enhancements:

  • Emit total lineage bytes metrics (#46725)
  • Adding accelerator type H100 (#46823)
  • More structured logging in core worker (#46906)
  • Move callbacks instead of copying them, to save copies. (#46971)
  • Add ray[adag] option to pip install (#47009)

🔨 Fixes:

  • Fix dashboard process reporting on Windows (#45578)
  • Fix Ray-on-Spark cluster crashing bug when user cancels cell execution (#46899)
  • Fix PinExistingReturnObject segfault by passing owner_address (#46973)
  • Fix raylet CHECK failure from runtime env creation failure. (#46991)
  • Fix typo in memray command (#47006)
  • [ADAG] Fix for asyncio outputs (#46845)

📖 Documentation:

  • Clarify behavior of placement_group_capture_child_tasks in docs (#46885)
  • Update ray.available_resources() docstring (#47018)

🏗 Architecture refactoring:

  • Async APIs for the New GcsClient. (#46788)
  • Replace GCS stubs in the dashboard to use NewGcsAioClient. (#46846)

Dashboard

💫 Enhancements:

  • Polish and minor improvements to the Serve page (#46811)

🔨 Fixes:

  • Fix CPU/GPU/RAM not being reported correctly on Windows (#44578)

Docs

💫 Enhancements:

  • Add more information about developer tooling for docs contributions (#46636), including esbonio section

🔨 Fixes:

  • Use PyData Sphinx theme version switcher (#46936)

Thanks

Many thanks to all those who contributed to this release!
@simonsays1980, @bveeramani, @tungh2, @zcin, @xingyu-long, @WeichenXu123, @aslonnie, @MaxVanDijck, @can-anyscale, @galenhwang, @omatthew98, @matthewdeng, @raulchen, @sven1977, @shrekris-anyscale, @deepyaman, @alexeykudinkin, @stephanie-wang, @kevin85421, @ruisearch42, @hongchaodeng, @khluu, @alanwguo, @hongpeng-guo, @saihaj, @Superskyyy, @tespent, @slfan1989, @justinvyu, @rynewang, @nikitavemuri, @amogkam, @mattip, @dev-goyal, @ryanaoleary, @peytondmurray, @edoakes, @venkatajagannath, @jjyao, @cristianjd, @scottjlee, @Bye-legumes

Release 2.34.0 Notes

31 Jul 18:02
fc87217

Ray Libraries

Ray Data

💫 Enhancements:

  • Add better support for UDF returns from list of datetime objects (#46762)

🔨 Fixes:

  • Remove read task warning if size bytes not set in metadata (#46765)

📖 Documentation:

  • Fix read_tfrecords() docstring to display tfx-bsl tip (#46717)
  • Update Dataset.zip() docs (#46757)

Ray Train

🔨 Fixes:

  • Sort workers by node ID rather than by node IP (#46163)

🏗 Architecture refactoring:

  • Remove dead RayDatasetSpec (#46764)

RLlib

🎉 New Features:

  • Offline RL support on new API stack:
    • Initial design for Ray-Data based offline RL Algos (on new API stack). (#44969)
    • Add user-defined schemas for data loading. (#46738)
    • Make data pipeline better configurable and tuneable for users. (#46777)

💫 Enhancements:

  • Move DQN into the TargetNetworkAPI (and deprecate RLModuleWithTargetNetworksInterface). (#46752)

🔨 Fixes:

  • Numpy version fix: Rename all np.product usage to np.prod (#46317)

📖 Documentation:

  • Examples for new API stack: Add 2 (count-based) curiosity examples. (#46737)
  • Remove RLlib CLI from docs (soon to be deprecated and replaced by python API). (#46724)

🏗 Architecture refactoring:

  • Cleanup, rename, clarify: Algorithm.workers/evaluation_workers, local_worker(), etc. (#46726)

Ray Core

🏗 Architecture refactoring:

  • New python GcsClient binding (#46186)

Many thanks to all those who contributed to this release! @KyleKoon, @ruisearch42, @rynewang, @sven1977, @saihaj, @aslonnie, @bveeramani, @akshay-anyscale, @kevin85421, @omatthew98, @anyscalesam, @MaxVanDijck, @justinvyu, @simonsays1980, @can-anyscale, @peytondmurray, @scottjlee

Ray-2.33.0

25 Jul 20:28
914af09

Ray Libraries

Ray Core

💫 Enhancements:

  • Add "last exception" to error message when GCS connection fails in ray.init() (#46516)

🔨 Fixes:

  • Add object back to memory store when object recovery is skipped (#46460)
  • Task status should start with PENDING_ARGS_AVAIL when retry (#46494)
  • Fix ObjectFetchTimedOutError (#46562)
  • Make working_dir support files created before 1980 (#46634)
  • Allow full path in conda runtime env. (#45550)
  • Fix worker launch time formatting in state api (#43516)
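
The pre-1980 working_dir fix relates to a limitation of the ZIP format, which cannot store timestamps earlier than 1980. The sketch below shows the general technique with Python's standard zipfile module; it assumes the packaging step is zip-based and is not a copy of the change in #46634. The function name and file names are illustrative.

```python
import os
import tempfile
import zipfile


def zip_with_old_file() -> tuple:
    """Zip a file whose mtime predates 1980 and return the stored date_time."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "old.txt")
        with open(src, "w") as f:
            f.write("hello")
        # Pretend the file was last modified around the start of 1975
        # (value is seconds since the Unix epoch).
        os.utime(src, (157_766_400, 157_766_400))

        archive = os.path.join(tmp, "pkg.zip")
        # With the default strict_timestamps=True, write() raises ValueError
        # for this file; False clamps the timestamp to 1980-01-01 instead.
        with zipfile.ZipFile(archive, "w", strict_timestamps=False) as zf:
            zf.write(src, arcname="old.txt")
        with zipfile.ZipFile(archive) as zf:
            return zf.getinfo("old.txt").date_time
```

The trade-off is that the original modification time is lost inside the archive, which is usually acceptable when the archive only exists to ship files to workers.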

Ray Data

🎉 New Features:

  • Deprecate Dataset.get_internal_block_refs() (#46455)
  • Add read API for reading Databricks table with Delta Sharing (#46072)
  • Add support for objects to Arrow blocks (#45272)

💫 Enhancements:

  • Change offsets to int64 and change to LargeList for ArrowTensorArray (#45352)
  • Prevent from_pandas from combining input blocks (#46363)
  • Update Dataset.count() to avoid unnecessarily keeping BlockRefs in-memory (#46369)
  • Use Set to fix inefficient iteration over Arrow table columns (#46541)
  • Add AWS Error UNKNOWN to list of retried write errors (#46646)
  • Always print traceback for internal exceptions (#46647)
  • Allow unknown estimate of operator output bundles and ProgressBar totals (#46601)
  • Improve filesystem retry coverage (#46685)

🔨 Fixes:

  • Replace lambda mutable default arguments (#46493)
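
For readers unfamiliar with the pitfall behind this kind of fix: a mutable default argument is created once, when the function or lambda is defined, and is then shared across calls. The sketch below is a generic illustration of the bug and the usual fix; it is not the code changed in #46493, and the function names are made up.

```python
def append_shared(item, bucket=[]):
    # Bug: the default list is created once, at definition time, and is
    # shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Fix: use None as a sentinel and build a new list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Two consecutive calls to append_shared(1) and append_shared(2) return the same growing list, while append_fresh returns an independent list each time. The same applies to defaults declared in a lambda's parameter list.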

📖 Documentation:

  • Auto-generate Dataset API documentation (#46557)
  • Update outdated ExecutionPlan docstring (#46638)

Ray Train

💫 Enhancements:

  • Update run status and actor status for train runs. (#46395)

🔨 Fixes:

  • Replace lambda default arguments (#46576)

📖 Documentation:

  • Add MNIST training using KubeRay doc page (#46123)
  • Add example of pre-training Llama model on Intel Gaudi (#45459)
  • Fix tensorflow example by using ScalingConfig (#46565)

Ray Tune

🔨 Fixes:

  • Replace lambda default arguments (#46596)

Ray Serve

🎉 New Features:

  • Fully deprecate target_num_ongoing_requests_per_replica and max_concurrent_queries, respectively replaced by target_ongoing_requests and max_ongoing_requests (#46392 and #46427)
  • Configure the task launched by the controller to build an application with Serve’s logging config (#46347)
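
When migrating existing Serve configs, the two renames are max_concurrent_queries to max_ongoing_requests and target_num_ongoing_requests_per_replica to target_ongoing_requests. A minimal sketch of a key-renaming helper follows; the helper and the example config layout are hypothetical and only operate on plain dicts and lists, not on Ray objects.

```python
# Old option name -> replacement name.
RENAMED_OPTIONS = {
    "max_concurrent_queries": "max_ongoing_requests",
    "target_num_ongoing_requests_per_replica": "target_ongoing_requests",
}


def migrate_options(config):
    """Return a copy of a (possibly nested) config with old keys renamed."""
    if isinstance(config, dict):
        return {
            RENAMED_OPTIONS.get(key, key): migrate_options(value)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [migrate_options(value) for value in config]
    # Scalars (numbers, strings, None) are returned unchanged.
    return config
```

The helper builds a new structure rather than mutating its input, so the original config stays available for comparison.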

RLlib

💫 Enhancements:

  • Moving sampling coordination for batch_mode=complete_episodes to synchronous_parallel_sample. (#46321)
  • Enable complex action spaces with stateful modules. (#46468)

🏗 Architecture refactoring:

  • Enable multi-learner setup for hybrid stack BC. (#46436)
  • Introduce Checkpointable API for RLlib components and subcomponents. (#46376)

🔨 Fixes:

  • Replace Mapping typehint with Dict (#46474)

📖 Documentation:

  • More example scripts for new API stack: two separate optimizers (w/ different learning rates) (#46540) and custom loss function (#46445)

Dashboard

🔨 Fixes:

  • Task end time showing the incorrect time (#46439)
  • Events Table rows having really bad spacing (#46701)
  • UI bugs in the serve dashboard page (#46599)

Thanks

Many thanks to all those who contributed to this release!

@alanwguo, @hongchaodeng, @anyscalesam, @brucebismarck, @bt2513, @woshiyyya, @terraflops1048576, @lorenzoritter, @omrishiv, @davidxia, @cchen777, @nono-Sang, @jackhumphries, @aslonnie, @JoshKarpel, @zjregee, @bveeramani, @khluu, @Superskyyy, @liuxsh9, @jjyao, @ruisearch42, @sven1977, @harborn, @saihaj, @zcin, @can-anyscale, @veekaybee, @chungen04, @WeichenXu123, @GeneDer, @sergey-serebryakov, @Bye-legumes, @scottjlee, @rynewang, @kevin85421, @cristianjd, @peytondmurray, @MortalHappiness, @MaxVanDijck, @simonsays1980, @mjovanovic9999

Ray-2.32.0

10 Jul 16:40
607f2f3

Highlight: aDAG Developer Preview

This is a new Ray Core specific feature called Ray accelerated DAGs (aDAGs).

  • aDAGs give you a Ray Core-like API with the added ability to pre-compile execution paths across pre-allocated resources on a Ray cluster, which can improve throughput and latency. Some practical examples include:
    • Up to 10x lower task execution time on single-node.
    • Native support for GPU-GPU communication, via NCCL.
  • This is still very early, but please reach out on #ray-core on Ray Slack to learn more!

Ray Libraries

Ray Data

💫 Enhancements:

  • Support async callable classes in map_batches() (#46129)

🔨 Fixes:

  • Ensure InputDataBuffer doesn't free block references (#46191)
  • MapOperator.num_active_tasks should exclude pending actors (#46364)
  • Fix progress bars being displayed as partially completed in Jupyter notebooks (#46289)

📖 Documentation:

  • Fix docs: read_api.py docstring (#45690)
  • Correct API annotation for tfrecords_datasource (#46171)
  • Fix broken links in README and in ray.data.Dataset (#45345)

Ray Train

📖 Documentation:

  • Update PyTorch Data Ingestion User Guide (#45421)

Ray Serve

💫 Enhancements:

  • Optimize ServeController.get_app_config() (#45878)
  • Change default for max and target ongoing requests (#45943)
  • Integrate with Ray structured logging (#46215)
  • Allow configuring handle cache size and controller max concurrency (#46278)
  • Optimize DeploymentDetails.deployment_route_prefix_not_set() (#46305)

RLlib

🎉 New Features:

  • APPO on new API stack (w/ EnvRunners). (#46216)

💫 Enhancements:

  • Stability: APPO, SAC, and DQN activate multi-agent learning tests (#45542, #46299)
  • Make Tune trial ID available in EnvRunners (and callbacks). (#46294)
  • Add env- and agent_steps to custom evaluation function. (#45652)
  • Remove default-metrics from Algorithm (tune does NOT error anymore if any stop-metric is missing). (#46200)

📖 Documentation:

  • Example for new API stack: Offline RL (BC) training on single-agent, while evaluating w/ multi-agent setup. (#46251)
  • Example for new API stack: Custom RLModule with an LSTM. (#46276)

Ray Core

🎉 New Features:

  • aDAG Developer Preview.

💫 Enhancements:

  • Allow env setup logger encoding (#46242)
  • ray list tasks filter state and name on GCS side (#46270)
  • Log ray version and ray commit during GCS start (#46341)

🔨 Fixes:

  • Decrement lineage ref count of an actor when the actor task return object reference is deleted (#46230)
  • Fix negative ALIVE actors metric and introduce IDLE state (#45718)
  • psutil process attr num_fds is not available on Windows (#46329)

Dashboard

🎉 New Features:

  • Added customizable refresh frequency for metrics on Ray Dashboard (#44037)

💫 Enhancements:

  • Upgraded to MUIv5 and React 18 (#45789)

🔨 Fixes:

  • Fix for multi-line log items breaking log viewer rendering (#46391)
  • Fix for UI inconsistency when a job submission creates more than one Ray job. (#46267)
  • Fix filtering by job id for tasks API not filtering correctly. (#45017)

Docs

🔨 Fixes:

  • Re-enabled automatic cross-reference link checking for Ray documentation, with Sphinx nitpicky mode (#46279)
  • Enforced naming conventions for public and private APIs to maintain accuracy, starting with Ray Data API documentation (#46261)

📖 Documentation:

  • Upgrade Python 3.12 support to alpha. This marks the release of the Ray wheel to PyPI and a sanity check of the most critical tests.

Thanks

Many thanks to all those who contributed to this release!

@stephanie-wang, @MortalHappiness, @aslonnie, @ryanaoleary, @jjyao, @jackhumphries, @nikitavemuri, @woshiyyya, @JoshKarpel, @ruisearch42, @sven1977, @alanwguo, @GeneDer, @saihaj, @raulchen, @liuxsh9, @khluu, @cristianjd, @scottjlee, @bveeramani, @zcin, @simonsays1980, @SumanthRH, @davidxia, @can-anyscale, @peytondmurray, @kevin85421

Ray-2.31.0

26 Jun 22:06
1240d3f

Ray Libraries

Ray Data

🔨 Fixes:

  • Fixed bug where preserve_order doesn’t work with file reads (#46135)

📖 Documentation:

  • Added documentation for dataset.Schema (#46170)

Ray Train

💫 Enhancements:

  • Add API for Ray Train run stats (#45711)

Ray Tune

💫 Enhancements:

  • Missing stopping criterion should not error (just warn). (#45613)

📖 Documentation:

  • Fix broken references in Ray Tune documentation (#45233)

Ray Serve

WARNING: the following default values will change in Ray 2.32:

  • Default for max_ongoing_requests will change from 100 to 5.
  • Default for target_ongoing_requests will change from 1 to 2.
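
To see what the announced change means for a deployment that relies on defaults, the sketch below models it in plain Python. The four default values come from the warning above; the function, the flat option layout, and the version parsing are illustrative assumptions and not Ray Serve code.

```python
# Default values as stated in the warning; everything else is hypothetical.
DEFAULTS_BEFORE_2_32 = {"max_ongoing_requests": 100, "target_ongoing_requests": 1}
DEFAULTS_FROM_2_32 = {"max_ongoing_requests": 5, "target_ongoing_requests": 2}


def effective_options(user_options: dict, ray_version: str) -> dict:
    """Merge explicitly set options over the defaults for a given Ray version."""
    major, minor = (int(part) for part in ray_version.split(".")[:2])
    if (major, minor) >= (2, 32):
        defaults = DEFAULTS_FROM_2_32
    else:
        defaults = DEFAULTS_BEFORE_2_32
    # Explicit user settings always win over version-dependent defaults.
    return {**defaults, **user_options}
```

In this model, a deployment that sets max_ongoing_requests explicitly keeps its value across the upgrade, which is the usual way to keep behavior stable when defaults move.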

💫 Enhancements:

  • Optimize DeploymentStateManager.get_deployment_statuses (#45872)

🔨 Fixes:

  • Fix logging error on passing traceback object into exc_info (#46105)
  • Run __del__ even if constructor is still in-progress (#45882)
  • Spread replicas with custom resources in torch tune serve release test (#46093)
  • [1k release test] don't run replicas on head node (#46130)

📖 Documentation:

  • Remove todo since issue is fixed (#45941)

RLlib

🎉 New Features:

  • IMPALA runs on the new API stack (with EnvRunners and ConnectorV2s). (#42085)
  • SAC/DQN: Prioritized multi-agent episode replay buffer. (#45576)

💫 Enhancements:

  • New API stack stability: Add systematic CI learning tests for all possible combinations of: [PPO|IMPALA] + [1CPU|2CPU|1GPU|2GPU] + [single-agent|multi-agent]. (#46162, #46161)

📖 Documentation:

  • New API stack: Example script for action masking (#46146)
  • New API stack: PyFlyt example script cleanup (#45956)
  • Old API stack: Enhanced ONNX example (+LSTM). (#43592)

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

  • [runtime-env] automatically infer worker path when starting worker in container (#42304)

🔨 Fixes:

  • On GCS restart, destroy unused workers instead of forgetting them, fixing PG leaks. (#45854)
  • Cancel lease requests before returning a PG bundle (#45919)
  • Fix boost fiber stack overflow (#46133)

Thanks

Many thanks to all those who contributed to this release!

@jjyao, @kevin85421, @vincent-pli, @khluu, @simonsays1980, @sven1977, @rynewang, @can-anyscale, @richardsliu, @jackhumphries, @alexeykudinkin, @bveeramani, @ruisearch42, @shrekris-anyscale, @stephanie-wang, @matthewdeng, @zcin, @hongchaodeng, @ryanaoleary, @liuxsh9, @GeneDer, @aslonnie, @peytondmurray, @Bye-legumes, @woshiyyya, @scottjlee, @JoshKarpel

Ray-2.30.0

20 Jun 23:08
97c3729

Ray Libraries

Ray Data

💫 Enhancements:

  • Improve fractional CPU/GPU formatting (#45673)
  • Use sampled fragments to estimate Parquet reader batch size (#45749)
  • Refactoring ParquetDatasource and metadata fetching logic (#45728, #45727, #45733, #45734, #45767)
  • Refactor planner.py (#45706)

Ray Tune

💫 Enhancements:

  • Change the behavior of a missing stopping criterion metric to warn instead of raising an error. This enables the use case of reporting different sets of metrics on different iterations (ex: a separate set of training and validation metrics). (#45613)
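
The behavior described above can be sketched in plain Python as follows. This is an illustration of the warn-instead-of-raise idea, not Ray Tune's implementation; the function name and metric names are made up, and the comparison assumes a criterion is met once the reported value reaches the threshold.

```python
import warnings


def should_stop(result: dict, stop: dict) -> bool:
    """Return True once any stopping criterion present in `result` is met."""
    for metric, threshold in stop.items():
        if metric not in result:
            # Only warn when the metric is absent, so iterations that report
            # a different set of metrics do not abort the run.
            warnings.warn(f"Stopping criterion {metric!r} not found in result")
            continue
        if result[metric] >= threshold:
            return True
    return False
```

With this shape, an iteration that reports only training metrics is skipped with a warning, and a later iteration that reports the validation metric can still trigger the stop.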

Ray Serve

💫 Enhancements:

  • Create internal request id to track request objects (#45761)

RLlib

📖 Documentation:

  • Re-do examples overview page (new API stack): #45382
    • PyFlyt QuadX WayPoints example #44758, #45956
    • RLModule inference on new API stack (#45831, #45845)
    • How to resume a tune.Tuner.fit() experiment from checkpoint. (#45681)
    • Custom RLModule (tiny CNN): #45774
    • Connector examples docstrings (#45864)
  • Old API stack examples: #43592, #45829

Ray Core

🎉 New Features:

  • Alpha release of job-level logging configuration: users can now configure user logging to use the logfmt format with logging context attached. (#45344)
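
For readers who have not seen logfmt, it is a line format made of space-separated key=value pairs. The sketch below shows what such a formatter can look like with the standard logging module. It is not Ray's formatter: the class name, the field names, and the context keys (job_id, task_id) are hypothetical.

```python
import logging


class LogfmtFormatter(logging.Formatter):
    """Render records as space-separated key=value pairs (logfmt style)."""

    # Hypothetical context keys; a real runtime would define its own.
    context_keys = ("job_id", "task_id")

    def format(self, record: logging.LogRecord) -> str:
        fields = {
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Attach context attributes only when they are set on the record.
        for key in self.context_keys:
            if hasattr(record, key):
                fields[key] = getattr(record, key)
        return " ".join(f"{k}={self._quote(v)}" for k, v in fields.items())

    @staticmethod
    def _quote(value) -> str:
        text = str(value)
        # Quote values containing spaces, quotes, or '=' so they stay one token.
        if any(c in text for c in ' "='):
            return '"' + text.replace('"', '\\"') + '"'
        return text
```

Context values can be attached per call through the standard extra argument, for example logger.info("step done", extra={"job_id": "01000000"}).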

💫 Enhancements:

  • Integrate amdsmi in AMDAcceleratorManager (#44572)

🔨 Fixes:

  • Fix the C++ GcsClient Del not respecting del_by_prefix (#45604)
  • Fix exit handling of FiberState threads (#45834)

Dashboard

💫 Enhancements:

  • Parse out json logs (#45853)

Many thanks to all those who contributed to this release: @liuxsh9, @peytondmurray, @pcmoritz, @GeneDer, @saihaj, @khluu, @aslonnie, @yucai, @vickytsang, @can-anyscale, @bthananjeyan, @raulchen, @hongchaodeng, @x13n, @simonsays1980, @peterghaddad, @kevin85421, @rynewang, @angelinalg, @jjyao, @BenWilson2, @jackhumphries, @zcin, @chris-ray-zhang, @c21, @shrekris-anyscale, @alanwguo, @stephanie-wang, @Bye-legumes, @sven1977, @WeichenXu123, @bveeramani, @nikitavemuri

Ray-2.24.0

06 Jun 18:16

Ray Libraries

Ray Data

🎉 New Features:

  • Allow user to configure timeout for actor pool (#45508)
  • Add override_num_blocks to from_pandas and perform auto-partition (#44937)
  • Upgrade Arrow version to 16 in CI (#45565)

💫 Enhancements:

  • Clarify that num_rows_per_file isn't strict (#45529)
  • Record more telemetry for newly added datasources (#45647)
  • Avoid pickling LanceFragment when creating read tasks for Lance (#45392)

Ray Train

📖 Documentation:

  • [HPU] Add example of Stable Diffusion fine-tuning and serving on Intel Gaudi (#45217)
  • [HPU] Add example of Llama-2 fine-tuning on Intel Gaudi (#44667)

Ray Tune

🏗 Architecture refactoring:

  • Improve excessive syncing warning and deprecate TUNE_RESULT_DIR, RAY_AIR_LOCAL_CACHE_DIR, local_dir (#45210)

Ray Serve

💫 Enhancements:

  • Clean up Serve proxy files (#45486)

📖 Documentation:

  • vllm example to serve llm models (#45430)

RLlib

💫 Enhancements:

  • DreamerV3 on tf: Bug fix, so it can run again with tf==2.11.1 (2.11.0 is not available anymore) (#45419); Added weekly release test for DreamerV3.
  • Added support for multi-agent off-policy algorithms (DQN and SAC) in the new API stack (#45182)
  • Config option for APPO/IMPALA to change number of GPU-loader threads (#45467)

📖 Documentation:

  • Example script for new API stack: How-to restore 1 of n agents from a checkpoint. (#45462)
  • Example script for new API stack: Autoregressive action module. (#45525)

Ray Core

🔨 Fixes:

  • Fix worker crash when getting actor name from runtime context (#45194)
  • Log dedup should not dedup number-only lines (#45385)
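
To illustrate why number-only lines need special handling, the sketch below assumes a deduplicator that masks digits before comparing lines, so that messages differing only in a counter collapse into one. Under that assumption every number-only line would look identical, so such lines are passed through untouched. This is a simplified model with made-up names, not Ray's log deduplication code.

```python
import re


def dedup_lines(lines):
    """Drop lines already seen once digits are masked, but keep number-only lines."""
    seen = set()
    kept = []
    for line in lines:
        stripped = line.strip()
        if stripped.isdigit():
            # Number-only lines carry their whole meaning in the digits,
            # so they are never treated as duplicates.
            kept.append(line)
            continue
        # Mask digit runs so "epoch 1" and "epoch 2" share one signature.
        signature = re.sub(r"\d+", "<num>", stripped)
        if signature not in seen:
            seen.add(signature)
            kept.append(line)
    return kept
```

In this model, "epoch 1" and "epoch 2" collapse to a single entry, while the bare lines "1", "2", and "1" are all kept.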

📖 Documentation:

  • Improve doc for --object-store-memory to describe how the default value is set (#45301)

Dashboard

🔨 Fixes:

  • Move Job package uploading to another thread to unblock the event loop. (#45282)

Many thanks to all those who contributed to this release: @maxliuofficial, @simonsays1980, @GeneDer, @dudeperf3ct, @khluu, @justinvyu, @andrewsykim, @Catch-Bull, @zcin, @bveeramani, @rynewang, @angelinalg, @matthewdeng, @jjyao, @kira-lin, @harborn, @hongchaodeng, @peytondmurray, @aslonnie, @timkpaine, @982945902, @maxpumperla, @stephanie-wang, @ruisearch42, @alanwguo, @can-anyscale, @c21, @Atry, @KamenShah, @sven1977, @raulchen

Ray-2.23.0

22 May 23:37
a0947ea

Ray Libraries

Ray Data

🎉 New Features:

  • Add support for using GPUs with map_groups (#45305)
  • Add support for using actors with map_groups (#45310)

💫 Enhancements:

  • Refine exception handling from arrow data conversion (#45294)

🔨 Fixes:

  • Fix Ray Databricks UC reader with dynamic Databricks notebook scope token (#45153)
  • Fix bug where you can't return objects and arrays from UDF (#45287)
  • Fix bug where map_groups triggers execution during input validation (#45314)

Ray Tune

🔨 Fixes:

  • [tune] Fix PB2 scheduler error resulting from trying to sort by Trial objects (#45161)

Ray Serve

🔨 Fixes:

  • Log application unhealthy errors at error level instead of warning level (#45211)

RLlib

💫 Enhancements:

  • Examples and tuned_examples learning test for new API stack are now “self-executable” (don’t require a third-party script anymore to run them). + WandB support. (#45023)

🔨 Fixes:

  • Fix result dict “spam” (duplicate, deprecated keys, e.g. “sampler_results” dumped into top level). (#45330)

📖 Documentation:

  • Add example for training with fractional GPUs on new API stack. (#45379)
  • Cleanup examples folder and remove deprecated sub directories. (#45327)

Ray Core

💫 Enhancements:

  • [Logs] Add runtime env started logs to job driver (#45255)
  • ray.util.collective support torch.bfloat16 (#39845)
  • [Core] Better propagate node death information (#45128)

🔨 Fixes:

  • [Core] Fix worker process leaks after job finishes (#44214)

Many thanks to all those who contributed to this release: @hongchaodeng, @khluu, @antoni-jamiolkowski, @ameroyer, @bveeramani, @can-anyscale, @WeichenXu123, @peytondmurray, @jackhumphries, @kevin85421, @jjyao, @robcaulk, @rynewang, @scottsun94, @swang, @GeneDer, @zcin, @ruisearch42, @aslonnie, @angelinalg, @raulchen, @ArthurBook, @sven1977, @wuxibin89

Ray-2.22.0

14 May 23:39
a8ab7b8

Ray Libraries

Ray Data

🎉 New Features:

  • Add function to dynamically generate ray_remote_args for Map APIs (#45143)
  • Allow manually setting resource limits for training jobs (#45188)

💫 Enhancements:

  • Introduce abstract interface for data autoscaling (#45002)
  • Add debugging info for SplitCoordinator (#45226)

🔨 Fixes:

  • Don’t show AllToAllOperator progress bar if the disable flag is set (#45136)
  • Don't load Arrow PyExtensionType by default (#45084)
  • Don't raise batch size error if num_gpus=0 (#45202)

Ray Train

💫 Enhancements:

  • [XGBoost][LightGBM] Update RayTrainReportCallback to only save checkpoints on rank 0 (#45083)

Ray Core

🔨 Fixes:

  • Fix the cpu percentage metrics for dashboard process (#45124)

Dashboard

💫 Enhancements:

  • Improvements to log viewer so line numbers do not get selected when copying text.
  • Improvements to the log viewer to avoid unnecessary re-rendering which causes text selection to clear.

Many thanks to all those who contributed to this release: @justinvyu, @simonsays1980, @chris-ray-zhang, @kevin85421, @angelinalg, @rynewang, @brycehuang30, @alanwguo, @jjyao, @shaikhismail, @khluu, @can-anyscale, @bveeramani, @jrosti, @WeichenXu123, @MortalHappiness, @raulchen, @scottjlee, @ruisearch42, @aslonnie, @alexeykudinkin

Ray-2.21.0

08 May 20:34
a912be8

Ray Libraries

Ray Data

🎉 New Features:

  • Add read_lance API to read Lance Dataset (#45106)

🔨 Fixes:

  • Retry RaySystemError application errors (#45079)

📖 Documentation:

  • Fix broken references in data documentation (#44956)

Ray Train

📖 Documentation:

  • Fix broken links in Train documentation (#44953)

Ray Tune

📖 Documentation:

  • Update Hugging Face example to add reference (#42771)

🏗 Architecture refactoring:

  • Remove deprecated ray.air.callbacks modules (#45104)

Ray Serve

💫 Enhancements:

  • Allow methods to pass type @serve.batch type hint (#45004)
  • Allow configuring Serve control loop interval (#45063)

🔨 Fixes:

  • Fix bug with controller failing to recover for autoscaling deployments (#45118)
  • Fix Ctrl+C after serve run not shutting down Serve components (#45087)
  • Fix lightweight update max ongoing requests (#45006)

RLlib

🎉 New Features:

  • New MetricsLogger API now fully functional on the new API stack (working now also inside Learner classes, i.e. loss functions). (#44995, #45109)

💫 Enhancements:

  • Renamings and cleanups (toward new API stack and more consistent naming schemata): WorkerSet -> EnvRunnerGroup, DEFAULT_POLICY_ID -> DEFAULT_MODULE_ID, config.rollouts() -> config.env_runners(), etc. (#45022, #44920)
  • Changed behavior of EnvRunnerGroup.foreach_worker… methods to new defaults: mark_healthy=True (used to be False) and healthy_only=True (used to be False). (#44993)
  • Fix get_state()/from_state() methods in SingleAgent- and MultiAgentEpisodes. (#45012)

📖 Documentation:

  • Example scripts using the MetricsLogger for env rendering and recording w/ WandB: #45073, #45107

Ray Core

🔨 Fixes:

  • Fix ray.init(logging_format) argument being ignored (#45037)
  • Handle unserializable user exception (#44878)
  • Fix dashboard process event loop blocking issues (#45048, #45047)

Dashboard

🔨 Fixes:

  • Fix Nodes page sorting not working correctly.
  • Add back “actors per page” UI control in the actors page.

Many thanks to all those who contributed to this release: @rynewang, @can-anyscale, @scottsun94, @bveeramani, @ceddy4395, @GeneDer, @zcin, @JoshKarpel, @nikitavemuri, @stephanie-wang, @jackhumphries, @matthewdeng, @yash97, @simonsays1980, @peytondmurray, @evalaiyc98, @c21, @alanwguo, @shrekris-anyscale, @kevin85421, @hongchaodeng, @sven1977, @st--, @khluu