[optim] Exclude timm_vision_transformer pt2, fix runtime errors in 1888 #1890
We replaced timm_vision_transformer with its large variant weeks ago but kept the data to preserve continuous trends. I had planned to remove it in a few weeks anyway. Moreover, in the last few nightlies, timm_vision_transformer results have fluctuated randomly, obscuring our trends, so removing it now is worthwhile for denoising.
This also excludes simple_gpt from the optim benchmark suite, as it can only be run in dynamo bench, and excludes nadam for single tensor compilation with timm_vision_transformer_large.
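
For readers unfamiliar with how such exclusions tend to be expressed, here is a minimal sketch of the filtering logic described above. It is not the benchmark's actual code: the function names `is_excluded` and `selected_runs`, the `impl` labels, and the `pt2` boolean are all hypothetical, and the assumption that timm_vision_transformer is skipped only under pt2 is taken from the PR title.

```python
from itertools import product


def is_excluded(model: str, optim: str, impl: str, pt2: bool) -> bool:
    """Return True if this (model, optimizer, implementation, pt2) run is skipped.

    All names and labels here are hypothetical, chosen only to mirror the
    three exclusions listed in the PR description.
    """
    # simple_gpt can only be run in dynamo bench, so skip it everywhere.
    if model == "simple_gpt":
        return True
    # Assumption from the PR title: the small ViT is skipped under pt2 only.
    if model == "timm_vision_transformer" and pt2:
        return True
    # nadam + single tensor + compilation on the large ViT is skipped.
    if (
        model == "timm_vision_transformer_large"
        and optim == "nadam"
        and impl == "single_tensor"
        and pt2
    ):
        return True
    return False


def selected_runs(models, optims, impls):
    """Enumerate every combination (eager and pt2) that is not excluded."""
    return [
        (m, o, i, p)
        for m, o, i, p in product(models, optims, impls, (False, True))
        if not is_excluded(m, o, i, p)
    ]
```

Keeping the rules in a single predicate means each exclusion can carry its own comment explaining why it exists, which makes it easier to revisit them later.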