Add API to assemble CPU shards to a sharded tensor #5681

Merged: 6 commits into master on Oct 9, 2023

Conversation

jonb377 (Collaborator) commented on Oct 5, 2023

This PR reintroduces #5630, which was reverted in #5680 due to failing CI on master.

The following patch shows the difference between this and the original PR:

diff --git a/torch_xla/csrc/init_python_bindings.cpp b/torch_xla/csrc/init_python_bindings.cpp
index fe18f9508..421066ba7 100644
--- a/torch_xla/csrc/init_python_bindings.cpp
+++ b/torch_xla/csrc/init_python_bindings.cpp
@@ -1720,8 +1720,8 @@ void InitXlaModuleBindings(py::module m) {
               << " vs " << expected_shard_shape;
         }
 
-        auto data_handle = WrapXlaData(ShardingUtil::CreateShardedData(
-            shards, local_devices, sharding_spec));
+        auto data_handle = ShardingUtil::CreateShardedData(
+            shards, local_devices, sharding_spec);
         XLATensorPtr xla_tensor = XLATensor::Create(std::move(data_handle));
         xla_tensor->SetShardingSpec(*sharding_spec);
         auto tensor = bridge::AtenFromXlaTensor(std::move(xla_tensor));
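
For context, a minimal sketch of how the new assemble-shards API might be driven from Python. The binding name `_global_tensor_from_cpu_shards` and its signature are assumptions inferred from the PR title and the C++ binding context above, not verified documentation; `Mesh.get_op_sharding` is the validation point named in the squashed commit list further down.

```python
# Hedged sketch: assumes the Oct 2023 module layout and an assumed
# binding name for the API this PR adds.
import numpy as np
import torch
import torch_xla
import torch_xla.runtime as xr
import torch_xla.experimental.xla_sharding as xs

# Assumed binding; the diff above edits the C++ side of this function.
from_cpu_shards = torch_xla._XLAC._global_tensor_from_cpu_shards

num_devices = xr.global_runtime_device_count()
# A 1D mesh over all addressable devices.
mesh = xs.Mesh(np.arange(num_devices), (num_devices,))

# One CPU shard per local device: shard i holds row i of the global tensor.
shards = [torch.tensor([[float(i)]]) for i in range(num_devices)]

# Tile dim 0 of the global tensor across the mesh (dim 1 replicated) and
# assemble the CPU shards into one sharded tensor on the XLA devices.
op_sharding = mesh.get_op_sharding((0, None))
global_tensor = from_cpu_shards(shards, op_sharding)
```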

alanwaketan (Collaborator) left a comment

LGTM.

jonb377 (Collaborator, Author) commented on Oct 6, 2023

The Kokoro failure is due to a dependency issue:

ERROR: Could not find a version that satisfies the requirement tf-nightly (from versions: none)
ERROR: No matching distribution found for tf-nightly

I'll merge after TPU CI passes. Thanks, Jiewen!

jonb377 (Collaborator, Author) commented on Oct 9, 2023

Looking into the TPU CI failure; it's new since the rebase. The test passes locally on v4, so it may be that it breaks with 8 devices.
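
A common way to keep such a test portable across TPU topologies is to derive expectations from the runtime device count rather than hard-coding one. A minimal illustrative sketch; `xr.global_runtime_device_count()` is a real torch_xla.runtime API, but the test body here is hypothetical, not the actual failing test:

```python
# Illustrative only: derive shard-count expectations from the runtime
# device count so the same test holds on v4 (4 local devices) and on
# an 8-device v3 host.
import unittest

import torch
import torch_xla.runtime as xr


class AssembleShardsTest(unittest.TestCase):

  def test_one_shard_per_device(self):
    num_devices = xr.global_runtime_device_count()
    shards = [torch.tensor([i]) for i in range(num_devices)]
    # Hard-coding 4 here would pass on v4 but fail on an 8-device v3.
    self.assertEqual(len(shards), num_devices)


if __name__ == '__main__':
  unittest.main()
```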

jonb377 (Collaborator, Author) commented on Oct 9, 2023

Surprisingly, the test actually failed on the original PR, but TPU CI still passed: https://github.com/pytorch/xla/runs/17442958115

jonb377 merged commit d9a9049 into master on Oct 9, 2023 (18 of 19 checks passed).
jonb377 deleted the jonbolin-assemble-shards branch on Oct 9, 2023 at 21:15.
qihqi pushed a commit that referenced this pull request on Oct 10, 2023, with the squashed commit messages:

* Add API to assemble CPU shards to a sharded tensor
* Handle replicated sharding
* Move validations into get_op_sharding
* Improve tests and error handling
* Don't WrapXlaData
* Fix test for v3

The same commit (with identical messages) was later picked up by zpcore (Oct 19, 2023), ghpvnist (ghpvnist/xla, Oct 31, 2023), mbzomowski (mbzomowski-test-org/xla, Nov 16, 2023), chunnienc (chunnienc/xla, Dec 14, 2023), golechwierowicz (Jan 12, 2024), and bhavya01 (Apr 22, 2024).
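
Among the squashed commits above, "Handle replicated sharding" and "Move validations into get_op_sharding" both concern how the OpSharding is built before assembly. A minimal sketch of the tiled vs. replicated distinction, assuming the Oct 2023 module layout; partition-spec semantics (int = mesh axis, None = replicated) follow the xla_sharding API of this era, so treat the details as assumptions:

```python
# Mesh.get_op_sharding is the method the validations were moved into.
import numpy as np
import torch_xla.runtime as xr
import torch_xla.experimental.xla_sharding as xs

num_devices = xr.global_runtime_device_count()
mesh = xs.Mesh(np.arange(num_devices), (num_devices,))

tiled = mesh.get_op_sharding((0,))          # dim 0 split across the mesh
replicated = mesh.get_op_sharding((None,))  # every device holds a full copy
```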