
Forward XLATensorImpl::is_contiguous_custom to TensorImpl. #8032

Open · wants to merge 12 commits into base: master
Conversation

ysiraichi (Collaborator)

This PR fixes #7998. Instead of always returning true, we forward this call to the base class TensorImpl::is_contiguous_custom().

The reason is that after pytorch/pytorch#135498 is merged, XLA tensors' metadata might stop reflecting the actual XLA storage, which means the tensors' strides might not always be contiguous. Whenever that happens, the tensor.is_contiguous() call should be consistent with the tensors' strides.
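
For illustration, a minimal sketch of the kind of forwarding described above (not the exact PR diff; the include, namespace, and override signature are assumptions based on the upstream c10::TensorImpl virtual):

#include <c10/core/TensorImpl.h>  // c10::TensorImpl and c10::MemoryFormat

namespace torch_xla {

// Sketch: instead of unconditionally returning true, defer to the base class,
// which answers from the sizes/strides recorded on the TensorImpl.
bool XLATensorImpl::is_contiguous_custom(c10::MemoryFormat memory_format) const {
  // Previous behavior: return true;
  return c10::TensorImpl::is_contiguous_custom(memory_format);
}

}  // namespace torch_xla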

cc @miladm @JackCaoG @alanwaketan

@JackCaoG (Collaborator)

test failure seems real?

@JackCaoG (Collaborator)

Yeah, test_memory_format_preserved_after_permute_xla is still failing.

@@ -1068,7 +1068,7 @@ def test_backward_optimization_barrier(self):
     hlo = torch_xla._XLAC._get_xla_tensors_hlo([model.fc2.weight.grad])
     self.assertIn(
-        '%opt-barrier.37 = (f32[1,64]{0,1}, f32[1]{0}, f32[2,64]{1,0}) opt-barrier((f32[1,64]{0,1}, f32[1]{0}, f32[2,64]{1,0}) %tuple.36)',
+        '%opt-barrier.38 = (f32[1,64]{1,0}, f32[1]{0}, f32[2,64]{1,0}) opt-barrier((f32[1,64]{1,0}, f32[1]{0}, f32[2,64]{1,0}) %tuple.37)',
Collaborator Author

This was needed for fixing a CI failure in pytorch/pytorch#135237. @JackCaoG let me know if you think this should not be happening.

Collaborator Author

The changes in this file are needed because PyTorch seems to run these tests with Python 3.8, where the subscript operator is not allowed for types.

Collaborator

Hopefully pytorch/pytorch#135278 can be merged so we don't have this issue anymore in the future...

@miladm (Collaborator) commented Sep 23, 2024

@JackCaoG do we want to add this PR to the 2.5 release? (Knowing it has some upstream dependencies to consider - @ysiraichi please reference the dependencies for clarity.)

@JackCaoG (Collaborator)

No, I don't want to add features to the 2.5 release at this point.

// If functionalization is disabled, the tensors' metadata isn't being
// updated w.r.t. the output of meta functions. Therefore, we fall back to the
// old behavior of always returning true.
if (runtime::sys_util::GetEnvBool("XLA_DISABLE_FUNCTIONALIZATION", false)) {
  return true;
}
Collaborator

Maybe just add an IsFunctionalizationDisabled() helper (using a static bool to store the value of this env var) in https://github.com/pytorch/xla/blob/ea8c47a345a29c6a1b1bdf4ee38a9159b07a980f/torch_xla/csrc/xla_graph_executor.h or in tensor.h, instead of reading the env var every time.
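
A minimal sketch of what such a cached helper could look like (the helper name comes from the suggestion above; the header path and exact placement are assumptions):

#include "torch_xla/csrc/runtime/sys_util.h"  // assumed location of GetEnvBool

namespace torch_xla {

// Read XLA_DISABLE_FUNCTIONALIZATION once and cache it in a function-local
// static, so repeated callers (e.g. is_contiguous_custom) avoid re-reading
// the environment on every call.
inline bool IsFunctionalizationDisabled() {
  static const bool disabled =
      runtime::sys_util::GetEnvBool("XLA_DISABLE_FUNCTIONALIZATION", false);
  return disabled;
}

}  // namespace torch_xla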

@ysiraichi (Collaborator Author)

I don't think the approach in this PR will be enough for propagating the metadata. I think I will have to modify the functionalizeFallback function instead, so that all operations propagate the metadata correctly. This will probably add more overhead to the execution, though. @JackCaoG what do you think?

@JackCaoG (Collaborator)

It would be good to measure those overheads. My experience is that most of the C++ ops are pretty fast, except for creating large objects and calculating hashes. It might not be too bad in this case either.

@ysiraichi (Collaborator Author)

The reason I'm asking is #7923. Basically, we will have to call the meta functions of every operation that goes through the functionalization fallback function (e.g. add(), clone(), etc.), and that might end up calling a Python meta function.

@JackCaoG what do you think?

@JackCaoG (Collaborator)

I am already concerned about the Python meta function overhead we have today, so ideally I don't want to introduce more. Do you mind giving it a try and seeing how much tracing-time overhead it introduces?
