Make int unspecialization actually work #95621
Conversation
Fixes #95469 Signed-off-by: Edward Z. Yang <ezyang@meta.com> [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95621
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 1142881. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixes #95469 Signed-off-by: Edward Z. Yang <ezyang@meta.com> cc soumith voznesenskym yanboliang penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx desertfire [ghstack-poisoned]
Specifically: https://github.com/pytorch/pytorch/pull/95621/files/063e44147152f4dd7e51852cf8c679692bd9fd53#r1120306196 #95621 (comment) Signed-off-by: Edward Z. Yang <ezyang@meta.com> ghstack-source-id: cd9a340cae68ee0ecf02f5df5556d413e89e9c7b Pull Request resolved: #96043
Specifically: https://github.com/pytorch/pytorch/pull/95621/files/063e44147152f4dd7e51852cf8c679692bd9fd53#r1120306196 #95621 (comment) Signed-off-by: Edward Z. Yang <ezyang@meta.com> cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx desertfire [ghstack-poisoned]
Following CR at #95621 (comment) In fact, the deleted block is dead, because `a` MUST be a TensorVariable, and so it can never be a SymNodeVariable. Signed-off-by: Edward Z. Yang <ezyang@meta.com> ghstack-source-id: 63697911ba0eb2f38b019f161b750eae855500dc Pull Request resolved: #96044
…eVariable" Following CR at #95621 (comment) In fact, the block deleted is dead, because a MUST be a TensorVariable, and so it can never be a SymNodeVariable. Signed-off-by: Edward Z. Yang <ezyangmeta.com> cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx desertfire [ghstack-poisoned]
Following CR at #95621 (comment) In fact, the block deleted is dead, because a MUST be a TensorVariable, and so it can never be a SymNodeVariable. Signed-off-by: Edward Z. Yang <ezyangmeta.com> cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx desertfire [ghstack-poisoned]
Specifically: https://github.com/pytorch/pytorch/pull/95621/files/063e44147152f4dd7e51852cf8c679692bd9fd53#r1120306196 #95621 (comment) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: #96043 Approved by: https://github.com/Chillee, https://github.com/albanD
OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was ~essentially never used (because we still constant-specialized way too aggressively) and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I have to turn off unspecialization by default, as there are still latent bugs in inductor.

The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed a SymInt, but not when passed a Tensor. However, I continue to represent NumPy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors.

* I folded in the changes from pytorch#95099, as I cannot represent unspecialized ints as SymInts without also turning on dynamic shapes. This also eliminates the need for test_unspec.py, as toggling specialization without dynamic shapes doesn't do anything. As dynamic shapes defaults to unspecializing, I just deleted it entirely; for the specialization case, I rely on the regular static-shape tests to catch it. (Hypothetically, we could also rerun all the tests with dynamic shapes but WITH int/float specialization, but this seems... not that useful? I mean, I guess export wants it, but I'd kind of like our Source heuristic to improve enough that export doesn't have to toggle this either.)
* Only 0/1 integers get specialized by default now.
* A hodgepodge of fixes. I'll comment on the PR about them.

Fixes pytorch#95469

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: pytorch#95621
Approved by: https://github.com/jansel, https://github.com/Chillee
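A minimal sketch (not code from this PR) of the behavior the bullets above describe: with dynamic shapes enabled, a Python int argument is carried as a SymInt so one graph can serve many values, while 0 and 1 remain specialized by default. The toy function, the `backend="eager"`/`dynamic=True` choices, and the exact recompilation behavior are assumptions for illustration and vary across PyTorch versions.

```python
import torch

# Toy function invented for this illustration; `n` is a plain Python int.
@torch.compile(backend="eager", dynamic=True)
def scale(x, n: int):
    return x * n

x = torch.randn(4)
scale(x, 2)  # compiled once, with n carried symbolically as a SymInt
scale(x, 7)  # expected to reuse the same graph rather than recompile
scale(x, 1)  # 0 and 1 are still specialized by default, so this may compile a separate graph
```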
…tional graph inputs (#96786)

Summary: Verified that the changes catch unspecialized ints/floats being added as additional graph inputs in D44037548, prior to PR #95621. However, with #95621 the issue originally being solved can no longer occur, because an int or float in `forward` will always be specialized in export. This PR adds the assertion anyway *(though it should not be hit unless there is a regression)* to immediately catch any attempt to add an unspecialized int/float to the additional graphargs.

Test Plan: An example of the error message looks like:
```
Dynamo attempts to add additional input: value=9.999999747378752e-06, source=NNModuleSource(inner=AttrSource(base=NNModuleSource(inner=AttrSource(base=LocalInputSource(local_name='self', pos=0), member='torch_module')), member='eps'))
```
Passed all export tests:
```
Buck UI: https://www.internalfb.com/buck2/fea72653-5549-47e7-a9bf-740eb86a8e26
Test UI: https://www.internalfb.com/intern/testinfra/testrun/8725724422167257
RE: reSessionID-7b3470b1-c293-4c4a-9671-dd0b7a2839b8 Up: 6.0 KiB Down: 0 B
Jobs completed: 101. Time elapsed: 115.7s.
Tests finished: Pass 98. Fail 0. Fatal 0. Skip 0. 0 builds failed
```

Differential Revision: D44075910

Pull Request resolved: pytorch/pytorch#96786
Approved by: https://github.com/tugsbayasgalan, https://github.com/ezyang
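A hedged sketch of the export-side claim above (a Python int or float reaching `forward` is specialized rather than becoming an extra graph input). The module, its `eps` attribute, and the use of the present-day `torch.export.export` entry point are assumptions made for illustration; the PR itself predates that API.

```python
import torch

class Norm(torch.nn.Module):
    def __init__(self, eps: float = 1e-5):
        super().__init__()
        self.eps = eps  # plain Python float attribute, not a Parameter or buffer

    def forward(self, x):
        return x / (x.norm() + self.eps)

# eps should be inlined as a constant in the exported graph rather than
# surfacing as an additional graph input; the assertion added in #96786
# is meant to fire if Dynamo ever tries to add it as one.
exported = torch.export.export(Norm(), (torch.randn(4),))
print(exported)
```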
This issue is in the milestone https://github.com/pytorch/pytorch/milestone/36?closed=1. If you want to see your fix included in this minor release, please post it as a cherry-pick into the [v2.0.1] Release Tracker. The deadline is April 14, 5PM PST. Only issues that have cherry-picks will be considered for the release.

Common FAQs:
Q1: Where can I find more information on the release process and terminology? A: pytorch/RELEASE.md at master · pytorch/pytorch · GitHub
Q2: Am I guaranteed to be included in the cherry-pick if I do the above? A: No, it is not guaranteed; the Release Team will review all submissions against the listed criteria before making the final decision on what to include on 4/17.
Q3: When is 2.1 going to be released? A: We do not have a formal date at this time, but will update the community when we do. Our immediate focus is 2.0.1. Note that 1.12 was released on 6/28/22, 1.13 on 10/28/22, and 2.0 on 3/15/23.
Q4: I missed the 4/14 5PM PST deadline; is there any option to have an extension? A: No. In order to meet our 4/28 goal, we must hold 4/14 as our deadline and will not accept any requests after the fact. We are over-communicating the timelines and process with the community to avoid such issues.
Q5: Where should I double-check to see if my issue is in the cherry-pick tracker? A: [v2.0.1] Release Tracker · Issue #97272 · pytorch/pytorch · GitHub
Q6: Where can I find the Release Compatibility Matrix for PyTorch? A: pytorch/RELEASE.md at master · pytorch/pytorch · GitHub

Please contact OSS Releng team members if you have any questions or comments. Again, we appreciate everyone's time and commitment to the community, PyTorch, and the 2.0 and 2.0.1 releases! Please refer to this post for more details: https://dev-discuss.pytorch.org/t/pytorch-release-2-0-1-important-information/1176
Stack from ghstack (oldest at bottom):
OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was ~essentially never used (because we still constant-specialized way too aggressively) and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I have to turn off unspecialization by default, as there are still latent bugs in inductor.
The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed a SymInt, but not when passed a Tensor. However, I continue to represent NumPy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors.
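A hedged sketch of point (1) above, not taken from the PR: carried as a SymInt, the unspecialized int can flow directly into a size argument such as padding, which a 0d tensor representation could not do. The function name and the `backend="eager"`/`dynamic=True` settings are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def pad_right(x, k: int):
    # k participates in a size computation; as a SymInt it composes with
    # F.pad's size arguments, whereas a 0d tensor would not be accepted here.
    return F.pad(x, (0, k))

compiled = torch.compile(pad_right, backend="eager", dynamic=True)
x = torch.randn(3)
for k in (2, 3, 5):
    print(compiled(x, k).shape)  # ideally a single graph covers all of these values
```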
Fixes #95469
Signed-off-by: Edward Z. Yang ezyang@meta.com
cc @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire