Add fast traceback utilities #107358

ezyang · 2023-08-17T03:57:20Z

Stack from ghstack (oldest at bottom):

This adds some utilities for conveniently working with fast combined CapturedTraceback from Python. The main goal of these utilities is to make it easier for people to use CapturedTraceback as a drop-in replacement for traceback.extract_stack, which is 20x slower than CapturedTraceback.

I port symbolic shapes to use the new CapturedTraceback code, to validate that the APIs work and are useful.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @aakhundov

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]

pytorch-bot · 2023-08-17T03:57:23Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107358

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Long ROCm queue (2023-08-18)

✅ No Failures

As of commit 7432576 with merge base aa04b05 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang · 2023-08-17T15:54:36Z

torch/_dynamo/utils.py

-        self.kwargs = kwargs
-
-    def __str__(self):
-        return self.func(*self.args, **self.kwargs)


Just a small cleanup; this got moved to torch._logging. I actually didn't need it in this PR but I kept the cleanup. If you REALLY care I can split it out.
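For readers without the surrounding file: the removed `__str__` belongs to a small lazy-string wrapper, which defers an expensive formatting call until a log record is actually rendered. A minimal sketch of that pattern, assuming a wrapper shaped like the lines above (the constructor, the `render_stack` helper, and the logger name are illustrative and not copied from `torch._logging`):

```python
import logging


class LazyString:
    """Defer func(*args, **kwargs) until the object is converted to str."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.func(*self.args, **self.kwargs)


calls = []


def render_stack(label):
    # Hypothetical stand-in for an expensive formatting step.
    calls.append(label)
    return f"stack for {label}"


log = logging.getLogger("lazy_string_demo")
log.setLevel(logging.WARNING)

# DEBUG is disabled, so the message is never formatted and render_stack never runs.
log.debug("%s", LazyString(render_stack, "guard"))
skipped = len(calls) == 0

# Converting to str triggers the deferred call exactly once.
text = str(LazyString(render_stack, "guard"))
```

The point of the wrapper is that the cost is only paid when the logger is enabled at that level.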

ezyang · 2023-08-17T15:54:51Z

torch/_guards.py

+    stack = None
+
+    def __post_init__(self):
+        self.stack = CapturedTraceback.extract(skip=2)


Not used yet but I plan to use it later.
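To illustrate the idea of recording a creation stack in `__post_init__`, here is a sketch that uses only the standard library. It assumes that `skip=2` is meant to drop the `__post_init__` frame and the dataclass-generated `__init__` frame; `TrackedGuard` and `make_guard` are hypothetical names, and `traceback.extract_stack` stands in for `CapturedTraceback.extract`:

```python
import traceback
from dataclasses import dataclass, field


@dataclass
class TrackedGuard:
    name: str
    # Filled in by __post_init__; excluded from __init__, repr, and comparisons.
    stack: list = field(default=None, init=False, repr=False, compare=False)

    def __post_init__(self):
        # Drop the two innermost frames: __post_init__ itself and the
        # dataclass-generated __init__ that called it.
        self.stack = traceback.extract_stack()[:-2]


def make_guard():
    return TrackedGuard("x")


guard = make_guard()
# The innermost remaining frame is the function that constructed the object.
creator = guard.stack[-1].name
```

With the two frames removed, the last entry of the stored stack points at the code that created the object, which is usually what a reader of the log wants to see.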

ezyang · 2023-08-17T15:54:59Z

torch/fx/experimental/symbolic_shapes.py

@@ -1059,11 +1058,6 @@ def error():
    raise AssertionError("shouldn't be hit")


-def get_debugging_stack(num_frames_to_cut=2):
-    # cut this frame and the caller's frame by default
-    return ''.join(traceback.format_list(traceback.extract_stack()[:-num_frames_to_cut]))


ezyang · 2023-08-17T15:55:44Z

torch/fx/experimental/symbolic_shapes.py

+                        break
+                    frame = frame.f_back
+            finally:
+                del frame


I don't have the formatted tb handy anymore, so I reimplement the old code by manually traversing over the frame. This should be faster!
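The hunk above only shows the tail of the loop. A rough sketch of this kind of manual frame walk, assuming the goal is to find the first frame that is not internal machinery (`find_caller`, `should_skip`, `helper`, and `user_code` are hypothetical names, not taken from the PR):

```python
import inspect


def find_caller(should_skip):
    """Return (filename, lineno, name) of the first frame not skipped, or None."""
    frame = inspect.currentframe()
    try:
        while frame is not None:
            if not should_skip(frame.f_code):
                return (frame.f_code.co_filename, frame.f_lineno, frame.f_code.co_name)
            frame = frame.f_back
        return None
    finally:
        # Frames reference their locals; dropping our reference promptly
        # avoids keeping a reference cycle alive.
        del frame


def helper():
    # Treat the walker and this helper as "internal" frames.
    return find_caller(lambda code: code.co_name in {"find_caller", "helper"})


def user_code():
    return helper()


found = user_code()
```

Walking `f_back` touches only the frames that are actually inspected, whereas formatting a full traceback first has to build a summary for every frame on the stack.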

ezyang · 2023-08-17T15:56:18Z

torch/utils/_traceback.py

@@ -135,8 +136,8 @@ def shorten_filename(fn):
    directory is "obvious" and doesn't need to be shown to user.
    """
    # Truncate torch/foo.py to foo.py
-    prefix = os.path.commonprefix([fn, os.path.join(os.path.dirname(os.path.dirname(__file__)), "")])
-    return fn[len(prefix):]
+    prefix = os.path.commonpath([fn, os.path.join(os.path.dirname(os.path.dirname(__file__)), "")])


Small bug fix, this fixes "test/test_blah.py" rendering as "est/test_blah.py" because test and torch share 't'.
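The difference can be reproduced with the standard library alone. The sketch below uses `posixpath` so it behaves the same on every platform; the directory `/data/pytorch/torch/` and both function names are made up for illustration, and the way the second function strips the separator is an assumption, since the diff above does not show the new return statement:

```python
import posixpath

TORCH_DIR = "/data/pytorch/torch/"  # hypothetical install location


def shorten_with_commonprefix(fn, base=TORCH_DIR):
    # commonprefix compares character by character, so it can stop mid-component.
    prefix = posixpath.commonprefix([fn, base])
    return fn[len(prefix):]


def shorten_with_commonpath(fn, base=TORCH_DIR):
    # commonpath compares whole path components.
    prefix = posixpath.commonpath([fn, base])
    return fn[len(prefix):].lstrip("/")


test_file = "/data/pytorch/test/test_blah.py"
torch_file = "/data/pytorch/torch/foo.py"

old_result = shorten_with_commonprefix(test_file)   # "est/test_blah.py"
new_result = shorten_with_commonpath(test_file)     # "test/test_blah.py"
unchanged = shorten_with_commonpath(torch_file)     # "foo.py"
```

Files inside the `torch` directory shorten the same way under both approaches; only paths that merely share leading characters with `torch` are affected.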

ezyang · 2023-08-17T15:56:59Z

torch/utils/_traceback.py

+            torch._C._profiler.gather_traceback(python=True, script=script, cpp=cpp),
+            # Elide extract() frame if we don't have script/cpp frames.  If
+            # we do have those frames, it doesn't work so force zero.
+            0 if script or cpp else skip + 1


I want to make this work better but delaying until I actually use the cpp traces somewhere...

zdevito

Use of the CapturedTraceback APIs looks good to me. Having an API that mimics Python's traceback API will be useful, especially when we want TorchScript/inductor stacks to be readable. I didn't review the use of the new API in symbolic shapes closely because I am not familiar with that code.

ezyang · 2023-08-18T04:24:06Z

The symbolic shapes stuff is now in its own PR #107439 because apparently it causes a memory leak in cudagraph trees.

albanD

SGTM

albanD · 2023-08-18T18:03:08Z

torch/utils/_traceback.py

+        self._summary = None
+
+    def summary(self):
+        import torch._C._profiler


Isn't that done by default? :o

It looks like this:

```
[DEBUG] GUARD: ___check_type_id(L['z'][L["MyEnum"].BAR], 7640416) and L['z'][L["MyEnum"].BAR] == 10
[DEBUG] Stack:
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 6657, in <module>
[DEBUG]     run_tests()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/test_case.py", line 38, in run_tests
[DEBUG]     run_tests()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 985, in run_tests
[DEBUG]     unittest.main(argv=argv)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 101, in __init__
[DEBUG]     self.runTests()
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 271, in runTests
[DEBUG]     self.result = testRunner.run(self.test)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/runner.py", line 184, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 650, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2521, in run
[DEBUG]     self._run_with_retry(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2450, in _run_with_retry
[DEBUG]     super_run(result=result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 591, in run
[DEBUG]     self._callTestMethod(testMethod)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
[DEBUG]     method()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2377, in wrapper
[DEBUG]     method(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2529, in test_enum_as_dict_key_with_overloaded_str
[DEBUG]     res = opt_fn(x)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 333, in _fn
[DEBUG]     return fn(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2519, in fn
[DEBUG]     torch._dynamo.graph_break()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 493, in catch_errors
[DEBUG]     return callback(frame, cache_size, hooks, frame_state)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 637, in _convert_frame
[DEBUG]     result = inner_convert(frame, cache_size, hooks, frame_state)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 133, in _fn
[DEBUG]     return fn(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 371, in _convert_frame_assert
[DEBUG]     return _compile(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 567, in _compile
[DEBUG]     guarded_code = compile_inner(code, one_graph, hooks, transform)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/utils.py", line 181, in time_wrapper
[DEBUG]     r = func(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 466, in compile_inner
[DEBUG]     out_code = transform_code_object(code, transform)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
[DEBUG]     transformations(instructions, code_options)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 416, in transform
[DEBUG]     tracer = InstructionTranslator(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2018, in __init__
[DEBUG]     self.symbolic_locals = collections.OrderedDict(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2021, in <genexpr>
[DEBUG]     VariableBuilder(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG]     vt = self._wrap(value).clone(**self.options())
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 404, in _wrap
[DEBUG]     result = {
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 405, in <dictcomp>
[DEBUG]     k: VariableBuilder(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG]     vt = self._wrap(value).clone(**self.options())
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 354, in _wrap
[DEBUG]     return type_dispatch(self, value)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 837, in wrap_literal
[DEBUG]     return self.wrap_unspecialized_primitive(value)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 1073, in wrap_unspecialized_primitive
[DEBUG]     guards=self.make_guards(GuardBuilder.CONSTANT_MATCH),
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in make_guards
[DEBUG]     return {source.make_guard(guard) for guard in guards}
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in <setcomp>
[DEBUG]     return {source.make_guard(guard) for guard in guards}
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_guards.py", line 641, in make_guard
[DEBUG]     return Guard(self.name(), self.guard_sou
```

One downside is I can't report *why* the guard was added. I'm not entirely sure how to do this; the problem is guards will propagate to a bunch of variables before finally getting included as part of the final set. Maybe a very very verbose version could report stack traces at every handoff point.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #107388
Approved by: https://github.com/mlazos
ghstack dependencies: #107438, #107358

Add fast traceback utilities

b76f8c8

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]

ezyang added a commit that referenced this pull request Aug 17, 2023

Add fast traceback utilities

758a3cc

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
ghstack-source-id: 7dceac88e9465bb87ba36b0fb4fcca235ca10499
Pull Request resolved: #107358

github-actions bot requested review from albanD, antoniojkim, bdhirsh, jbschlosser, miladm, SherlockNoMad, voznesenskym and wconstab August 17, 2023 03:57

ezyang changed the title ~~Add fast traceback utilities~~ [EASY] Add fast traceback utilities Aug 17, 2023

Update on "[EASY] Add fast traceback utilities"

d5c3656

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]

github-actions bot added the ciflow/inductor label Aug 17, 2023

Update on "[EASY] Add fast traceback utilities"

357e457

Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]

ezyang requested a review from zdevito August 17, 2023 15:42

pytorch-bot bot added the release notes: fx release notes category label Aug 17, 2023

ezyang requested review from BowenBao, abock, thiagocrepaldi and wschin as code owners August 17, 2023 15:53

github-actions bot added the module: dynamo label Aug 17, 2023

ezyang changed the title ~~[EASY] Add fast traceback utilities~~ Add fast traceback utilities Aug 17, 2023

ezyang requested a review from mlazos August 17, 2023 15:54

ezyang mentioned this pull request Aug 17, 2023

Add verbose_guards logging artifact #107388

Closed

ezyang added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 17, 2023

zdevito approved these changes Aug 17, 2023

This was referenced Aug 18, 2023

[dynamo][eval_frame] Unify cache entry and frame_state on the same co_extra index #106917

Closed

[dynamo][eval_frame] Set destroy_extra_state deleter as part of co_extra #107117

Closed

This was referenced Aug 18, 2023

Hand bind CapturedTraceback #107438

Closed

Use fast traceback for symbolic shapes #107439

Closed

This was referenced Aug 18, 2023

Correctly handle Python cycles that go through SymInt/Bool fields on TensorImpl #107457

Closed

Make SymNode hold a weak reference to ShapeEnv #107466

Closed

albanD approved these changes Aug 18, 2023

ezyang mentioned this pull request Aug 18, 2023

Change how caching/cleanup for CapturedTraceback works #107471

Closed

pytorchmergebot added the Merged label Aug 18, 2023

pytorchmergebot closed this in 36bb7a1 Aug 18, 2023

facebook-github-bot deleted the gh/ezyang/2305/head branch August 22, 2023 14:16

angelayi mentioned this pull request Apr 1, 2024

[export] Add stack_trace for non-strict export #121034

Closed
