
Make int unspecialization actually work #95621

Closed
wants to merge 20 commits

Conversation

@ezyang (Contributor) commented Feb 27, 2023

Stack from ghstack (oldest at bottom):

OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was essentially never used (because we still constant-specialized way too aggressively), and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I had to turn off unspecialization by default, as there are still latent bugs in inductor.

The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed a SymInt, but not when passed a Tensor. However, I continue to represent NumPy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors.
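As a minimal sketch of the difference (not code from this PR; it assumes a PyTorch 2.x build where `torch.compile(dynamic=True)` turns on dynamic shapes, and that int unspecialization is enabled via the config this PR touches):

```python
import torch

def f(x, n: int):
    # If n is unspecialized, Dynamo traces it as a SymInt, so it can flow
    # straight into a size computation like this reshape. A 0d tensor
    # representation could not: reshape accepts ints/SymInts, not Tensors.
    return x.reshape(n, -1) * 2

compiled = torch.compile(f, dynamic=True)
print(compiled(torch.randn(2, 6), 3).shape)  # torch.Size([3, 4])
print(compiled(torch.randn(2, 6), 2).shape)  # torch.Size([2, 6])
```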

  • I folded in the changes from rationalize specialize_int_float #95099, as I cannot represent unspecialized ints as SymInts without also turning on dynamic shapes. This also eliminates the need for test_unspec.py, as toggling specialization without dynamic shapes doesn't do anything, so I just deleted it entirely (dynamic shapes defaults to unspecializing); for the specialization case, I rely on the regular static shape tests to catch it. (Hypothetically, we could also rerun all the tests with dynamic shapes but WITH int/float specialization, though this seems... not that useful? I mean, I guess export wants it, but I'd kind of like our Source heuristic to improve enough that export doesn't have to toggle this either.)
  • Only 0/1 integers get specialized by default now (see the sketch after this list)
  • A hodgepodge of fixes. I'll comment on the PR about them.
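To make the 0/1 rule concrete, here is a hypothetical illustration (assuming unspecialization is active, e.g. `torch._dynamo.config.specialize_int = False` on a post-PR build; the exact guard/recompile behavior depends on your config):

```python
import torch

def g(x, n: int):
    return x * n

compiled = torch.compile(g, dynamic=True)
compiled(torch.randn(3), 5)  # n is lifted to a SymInt graph input...
compiled(torch.randn(3), 7)  # ...so this call can reuse the same graph
compiled(torch.randn(3), 0)  # 0 (like 1) is still specialized: Dynamo guards
                             # on n == 0 and compiles a separate graph
```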

Fixes #95469

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

cc @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

@pytorch-bot pytorch-bot bot commented Feb 27, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95621

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1142881:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Feb 27, 2023
Fixes #95469

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: c685677f8b98161f4e224bbbe475bd932685f956
Pull Request resolved: #95621
@ezyang ezyang changed the title Reduce number of common integer constants Make int unspecialization actually work Feb 28, 2023
@ezyang ezyang requested a review from jansel February 28, 2023 15:45
ezyang added a commit that referenced this pull request Mar 8, 2023
Specifically:
https://github.com/pytorch/pytorch/pull/95621/files/063e44147152f4dd7e51852cf8c679692bd9fd53#r1120306196
#95621 (comment)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: cd9a340cae68ee0ecf02f5df5556d413e89e9c7b
Pull Request resolved: #96043
ezyang added a commit that referenced this pull request Mar 8, 2023
Following CR at
#95621 (comment)

In fact, the deleted block is dead code, because `a` MUST be a TensorVariable,
and so it can never be a SymNodeVariable.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 63697911ba0eb2f38b019f161b750eae855500dc
Pull Request resolved: #96044
pytorchmergebot pushed a commit that referenced this pull request Mar 9, 2023
Specifically:
https://github.com/pytorch/pytorch/pull/95621/files/063e44147152f4dd7e51852cf8c679692bd9fd53#r1120306196
#95621 (comment)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: bd2220672d636f26eea571dc8d7a2b0af7898435
Pull Request resolved: #96043
ydwu4 added a commit to ydwu4/pytorch that referenced this pull request Mar 13, 2023
Fixes pytorch#95469

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: pytorch#95621
Approved by: https://github.com/jansel, https://github.com/Chillee
guangy10 added a commit to guangy10/pytorch that referenced this pull request Mar 15, 2023
…tional graph inputs (pytorch#96786)

Summary:
Pull Request resolved: pytorch#96786

Verified the changes to catch unspecialized ints/floats being added as additional graph inputs in D44037548, prior to this PR (pytorch#95621).

However, with pytorch#95621 the issue originally being solved is no longer valid, because int & float in `forward` will always be specialized in export. This PR adds the assertion anyway *(though it should not be hit unless there is a regression)* to immediately catch any attempt to add an unspecialized int/float to the additional graphargs

Test Plan:
Example of the error message would look like:
```
Dynamo attempts to add additional input: value=9.999999747378752e-06, source=NNModuleSource(inner=AttrSource(base=NNModuleSource(inner=AttrSource(base=LocalInputSource(local_name='self', pos=0), member='torch_module')), member='eps'))
```
Passed all export tests
```
Buck UI: https://www.internalfb.com/buck2/fea72653-5549-47e7-a9bf-740eb86a8e26
Test UI: https://www.internalfb.com/intern/testinfra/testrun/8725724422167257
RE: reSessionID-7b3470b1-c293-4c4a-9671-dd0b7a2839b8  Up: 6.0 KiB  Down: 0 B
Jobs completed: 101. Time elapsed: 115.7s.
Tests finished: Pass 98. Fail 0. Fatal 0. Skip 0. 0 builds failed
```

Differential Revision: D44075910

fbshipit-source-id: 1e79caf59c74c6559e524e5a49bba057b2e2eb8b
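As a hedged sketch of the export behavior this commit message describes (assuming the 2.0-era `torch._dynamo.export` API; `forward_fn` and its arguments are illustrative, not from the PR):

```python
import torch
import torch._dynamo as dynamo

def forward_fn(x, eps: float):
    return x / (x.norm() + eps)

# In export, scalar arguments like eps are specialized (baked into the
# graph as constants) rather than lifted to additional graph inputs,
# which is what the assertion above guards against.
gm, guards = dynamo.export(forward_fn, torch.randn(4), 1e-5)
print(gm.code)  # the printed graph references 1e-05 as a constant
```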
guangy10 added a commit to guangy10/pytorch that referenced this pull request Mar 16, 2023
…tional graph inputs (pytorch#96786)

guangy10 pushed a commit that referenced this pull request Mar 16, 2023
…tional graph inputs

guangy10 added a commit to guangy10/pytorch that referenced this pull request Mar 17, 2023
…tional graph inputs (pytorch#96786)

pytorchmergebot pushed a commit that referenced this pull request Mar 17, 2023
…tional graph inputs (#96786)

Pull Request resolved: #96786
Approved by: https://github.com/tugsbayasgalan, https://github.com/ezyang
@ezyang ezyang added this to the 2.0.1 milestone Mar 20, 2023
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 23, 2023
…tional graph inputs (#96786)

cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 27, 2023
…tional graph inputs (#96786)

@atalman (Contributor) commented Apr 4, 2023

cc @guangy10 @ezyang

This PR is in the milestone https://github.com/pytorch/pytorch/milestone/36?closed=1. If you want to see your fix included in this minor release, please post it as a cherry-pick into the [v2.0.1] Release Tracker.

The deadline is April 14, 5PM PST.

Only issues that have ‘cherry-picks’ will be considered for the release.

Common FAQs:

Q1: Where can I find more information on the release process and terminology?

A: pytorch/RELEASE.md at master · pytorch/pytorch · GitHub

Q2: Am I guaranteed to be included in the cherry-pick if I do above?

A: No, it is not guaranteed, the Release Team will review all submissions against the listed criteria before making the final decision on what to include on 4/17.

Q3: When is 2.1 going to be released?

A: We do not have a formal date at this time but will update the community when we do. Our immediate focus is 2.0.1. Note that 1.12 was released on 6/28/22, 1.13 on 10/28/22 and 2.0 on 3/15/23.

Q4: I missed the 4/14 5PM PST deadline, is there any option to have an extension?

A: No, in order to meet our 4/28 goal, we must hold 4/14 as our deadline and will not accept any requests after the fact. We are over communicating the timelines and process with the community to avoid such issues.

Q5: Where should I double check to see if my issue is in the cherry pick tracker?

A: [v2.0.1] Release Tracker · Issue #97272 · pytorch/pytorch · GitHub

Q6: Where can I find the Release Compatibility Matrix for PyTorch?

A: pytorch/RELEASE.md at master · pytorch/pytorch · GitHub

Please contact OSS Releng team members if you have any questions/comments. Again, we appreciate everyone’s time and commitment to the community, PyTorch, and the 2.0 and 2.0.1 releases!

Please refer to this post for more details: https://dev-discuss.pytorch.org/t/pytorch-release-2-0-1-important-information/1176

@ezyang ezyang removed this from the 2.0.1 milestone Apr 21, 2023
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1855/head branch June 8, 2023 16:52