aot_autograd: avoid using intermediate_base logic unnecessarily #97786
Conversation
Dr. CI: ✅ No failures as of commit 164fb34. Artifacts and rendered test results at hud.pytorch.org/pr/97786. Note: links to docs will display an error until the docs builds have completed.
ghstack-source-id: c8cddadc4ec99084a53c7312433feb9eebfc727b Pull Request resolved: #97786
test/functorch/test_aotdispatch.py (Outdated)
# In cases where we know that an output's view-ness is safe to hide from autograd
# (the output is a view of an intermediate that doesn't escape the graph),
# we hide the view-ness from autograd.
# self.assertEqual(ref_o._is_view(), test_o._is_view())
We should still be able to test this, no?
In particular, can we check that if ref_o is a view of an input or another output, we preserve that?
We can no longer test for it unconditionally - one thing I can do is plumb a bool into this test helper so we know when to test for it (I'll probably do that).
You have access to ref_o and ref_inp here, right?
So you should be able to check; the pseudocode I have in mind is:
def get_base(t):
    return t._base if t._is_view() else t

def is_in_base(t, tensors):
    t_base = get_base(t)
    for tensor in tensors:
        if t_base is get_base(tensor):
            return True
    return False

ref_is_view_of_non_interm = is_in_base(ref_o, ref_inps) or is_in_base(ref_o, ref_outs)
test_is_view_of_non_interm = is_in_base(test_o, test_inps) or is_in_base(test_o, test_outs)
assert ref_is_view_of_non_interm == test_is_view_of_non_interm
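For reference, the base-matching idea works directly on eager tensors via `._base` and `._is_view()`; a minimal runnable sketch (the tensor names `x`, `v`, `c` are illustrative, not from the test):

```python
import torch

def get_base(t):
    # A view's ._base is the tensor it aliases; a non-view is its own base.
    return t._base if t._is_view() else t

def is_in_base(t, tensors):
    t_base = get_base(t)
    return any(t_base is get_base(other) for other in tensors)

x = torch.ones(4)
v = x.view(2, 2)        # a view of the input x
c = x.clone().view(-1)  # a view of an intermediate (the clone), not of x

assert is_in_base(v, [x])      # v aliases an input
assert not is_in_base(c, [x])  # c's base is the intermediate clone
```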
that test is definitely better, thanks - I'll add it.
torch/_functorch/aot_autograd.py (Outdated)
if info.output_type == OutputType.alias_of_intermediate_save_as_output:
    intermediate_bases.append(o._base)
elif info.output_type == OutputType.unsafe_view_alias:
    # See Note [Intermediate Bases Optimization]
    outs[i] = torch.ops.aten._unsafe_view(o, o.shape)
nit: always call the specific overload. The overload resolution from JIT used here is dead slow.
I don't think this matters too much, because this is run at trace time (and we'll still end up baking the overload into the graph). But good to know - I'll update.
Compilation time is still something we want to keep down no? :D
But yes most likely not a big issue
agreed 😛 (to be fair, this op will usually be called a handful of times per compiled subgraph, so pretty infrequently: at most # graph outputs, if every output is an alias of a graph intermediate)
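For context, "calling the specific overload" means going through the `OpOverload` (e.g. `.default`) rather than the `OpOverloadPacket`; a small sketch, where both forms should produce the same result:

```python
import torch

x = torch.randn(2, 3)

# Packet call: resolves the overload on every invocation.
y_packet = torch.ops.aten._unsafe_view(x, (6,))
# Specific overload: skips the overload-resolution step.
y_default = torch.ops.aten._unsafe_view.default(x, (6,))

assert torch.equal(y_packet, y_default)
assert tuple(y_default.shape) == (6,)
```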
…arily" fixes #97691, see the issue for the proposed design. Now that we are employing AOTAutograd's "intermediate base" logic a lot less frequently, we might see some speedups in the benchmark suite. [ghstack-poisoned]
LGTM, just had a few minor questions
out_test[0].mul_(3)
# Assert that the aliasing relationship was preserved
self.assertEqual(out_ref[0], out_test[0])
self.assertEqual(out_ref[1], out_test[1])
Do we test anywhere that the aliasing relationship is preserved in the case of:

def f(a):
    b = a.clone()
    return b.view(-1), b.view(-1)
I think some other tests test for it more indirectly - but I can add one
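A direct eager-mode check of that case could look like the following (wiring it through aot_function/the compiled path is omitted here):

```python
import torch

def f(a):
    b = a.clone()
    return b.view(-1), b.view(-1)

out1, out2 = f(torch.ones(2, 2))
out1.mul_(3)

# Both outputs are views of the same intermediate, so mutating one
# must be visible through the other, and they must share a base.
assert torch.equal(out1, out2)
assert out1._base is out2._base
```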
@@ -815,7 +840,7 @@ def inner(*flat_args):
     f_output_tangents = [
         o
         for o, info in zip(flat_f_outs, output_info)
-        if info.output_type == OutputType.non_alias and issubclass(info.raw_type, torch.Tensor)
+        if info.output_type in [OutputType.non_alias, OutputType.unsafe_view_alias] and issubclass(info.raw_type, torch.Tensor)
Is the comment above still up to date?
@@ -953,7 +983,7 @@ def inner_fn(*args):
     # For outputs that are aliases of intermediates, we will have returned the output's _base as an output in the graph instead,
     # which we *should* send to grad()
     output_grad_mask = [
-        meta.output_info[i].output_type == OutputType.non_alias
+        meta.output_info[i].output_type in [OutputType.non_alias, OutputType.unsafe_view_alias]
Also the comment here
@@ -481,7 +483,7 @@ def __post_init__(self):
     self.aliased_out_indices = aliased_out_indices
     self.num_outputs = len(self.output_info)
     self.num_outputs_non_aliased = len(
-        [x for x in self.output_info if x.output_type == OutputType.non_alias]
+        [x for x in self.output_info if x.output_type in [OutputType.non_alias, OutputType.unsafe_view_alias]]
Is it worth updating the name of the "num_outputs_non_aliased" field, now that it also counts outputs that may be aliases?
My thought here is that, for all intents and purposes w.r.t. AOTAutograd, these tensors are non-aliased. I would have liked to avoid a new OutputType and just use non_alias for these tensors - but I needed some way of telling the joint code in AOTAutograd to insert an unsafe_view() later (I could have added separate metadata, but an extra OutputType felt easier).
Let me know if that doesn't seem reasonable to you, though!
FWIW, Alban also pointed out another option: don't insert unsafe_view here in AOTAutograd, and instead require the backend compiler to run the ADInplaceOrView keys inside of its compiled kernel (that's what happens by default today, since inductor just generates as_strided()). I figured I would do the simpler thing first that doesn't involve changing inductor code, and leave that for later work.
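To illustrate the bookkeeping being discussed: outputs tagged unsafe_view_alias are treated like plain outputs when building the grad mask. A small self-contained sketch, where the enum is a hypothetical stand-in for AOTAutograd's OutputType with only the members mentioned in this diff:

```python
from enum import Enum, auto

# Hypothetical mirror of AOTAutograd's OutputType (member names from the diff above).
class OutputType(Enum):
    non_alias = auto()
    alias_of_intermediate_save_as_output = auto()
    unsafe_view_alias = auto()

def output_grad_mask(output_types):
    # unsafe_view_alias outputs behave like non-aliased outputs w.r.t. grad():
    # their view-ness is hidden from autograd via _unsafe_view.
    return [
        t in (OutputType.non_alias, OutputType.unsafe_view_alias)
        for t in output_types
    ]

mask = output_grad_mask([
    OutputType.non_alias,
    OutputType.alias_of_intermediate_save_as_output,
    OutputType.unsafe_view_alias,
])
assert mask == [True, False, True]
```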
#97786 might be a better fix for this one, hopefully we can land that then discard this PR cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 desertfire [ghstack-poisoned]
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled. If you believe this is a mistake, then you can re-trigger it through pytorch-bot.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
Fixes #97691 (see the issue for the proposed design). Now that we employ AOTAutograd's "intermediate base" logic a lot less frequently, we might see some speedups in the benchmark suite.
Stack from ghstack (oldest at bottom):
cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire