Skip to content

Commit

Permalink
[BUG] Fix ray wait in RayPartitionSet (#3251)
Browse files Browse the repository at this point in the history
Closes: #3249 

Ray's `ray.wait` is supposed to:

1. Defaults to `fetch_local=True` which will supposedly fetch data to
wherever the wait is called before returning
2. Defaults to `num_returns=1` which will wait until only the first item
is ready before returning

This seems to not be the intended behavior here, where `RayPartitionSet`
is trying to wait on ALL the partitions to be ready, and does not want
to pull any data down to the calling site.

Co-authored-by: Jay Chia <jaychia94@gmail.com@users.noreply.github.com>
  • Loading branch information
jaychia and Jay Chia authored Nov 15, 2024
1 parent cf25ad4 commit 84db665
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion daft/runners/ray_runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -341,7 +341,7 @@ def num_partitions(self) -> int:

def wait(self) -> None:
deduped_object_refs = {r.partition() for r in self._results.values()}
ray.wait(list(deduped_object_refs))
ray.wait(list(deduped_object_refs), fetch_local=False, num_returns=len(deduped_object_refs))


def _from_arrow_type_with_ray_data_extensions(arrow_type: pa.lib.DataType) -> DataType:
Expand Down

0 comments on commit 84db665

Please sign in to comment.