[data] [streaming] Support a streaming_repartition() operator #36724
Labels: data (Ray Data-related issues), enhancement (Request for new feature and/or capability), P1 (Issue that should be fixed within a few weeks)
In several use cases, it is useful to change the block size of datasets in a streaming way. The current `repartition()` operator is an all-to-all operator and is incompatible with streaming.

We could implement a general-purpose `streaming_repartition()` operator that supports repartitioning in a few streaming-compatible ways:

This could be implemented as a new PhysicalOperator that implements the online repartitioning. This could also replace the current SplitBlocks mechanism from #36352.
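To make the idea of online repartitioning concrete, here is a minimal sketch in plain Python. It is not Ray Data code and does not use the PhysicalOperator interface: blocks are modeled as Python lists, and the function name `rechunk_stream` and the `target_rows` parameter are hypothetical names chosen for illustration. It only shows that block sizes can be changed while consuming the input incrementally, without an all-to-all step.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def rechunk_stream(blocks: Iterable[List[T]], target_rows: int) -> Iterator[List[T]]:
    """Re-chunk a stream of blocks into blocks of `target_rows` rows.

    Hypothetical illustration only: a "block" here is a plain list of rows,
    not a Ray Data block. Output is produced as soon as enough rows have
    been buffered, so the whole input is never materialized at once.
    """
    if target_rows <= 0:
        raise ValueError("target_rows must be positive")

    buffer: List[T] = []
    for block in blocks:
        buffer.extend(block)
        # Emit as many full-size blocks as the buffer currently holds.
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]

    # Flush the remaining rows as a final, possibly smaller, block.
    if buffer:
        yield buffer


if __name__ == "__main__":
    incoming = [[1, 2, 3], [4], [5, 6, 7, 8, 9]]
    for out_block in rechunk_stream(incoming, target_rows=4):
        print(out_block)
```

Because the function is a generator, each output block is available as soon as its rows have arrived, which is the property that the all-to-all `repartition()` lacks. A real operator would also need to handle block metadata, memory limits, and backpressure, none of which this sketch attempts.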