-
Notifications
You must be signed in to change notification settings - Fork 530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export ClipImageTransform #1269
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1269
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit c57cf35 with merge base f1db074 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
752535c
to
29725b9
Compare
b59fbe0
to
00c9681
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I will leave the final word to @pbontrager. Maybe it would be nice to add a small docstring at the top of "torchtune/models/clip/inference/_transforms.py" explaining that this is for executorch only. But I am fine with leaving it as it is.
@@ -11,7 +11,10 @@ | |||
import torch | |||
from tests.test_utils import assert_expected | |||
|
|||
from torchtune.models.clip._transforms import CLIPImageTransform | |||
from torchtune.models.clip._transforms import CLIPImageTransform as CLIPImageTransform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the last part "as CLIPImageTransform"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. Updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also updated - @tarun292 found we can swap out functionals as long as we 'name' them inside init(), so leaving torch.nn.functional.pad, and tile_crop as is.
6cdc95a
to
51db024
Compare
Thanks for reviewing! Added a note that being able to export is useful for ExecuTorch/AOTI. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1269 +/- ##
===========================================
- Coverage 70.55% 27.09% -43.46%
===========================================
Files 255 258 +3
Lines 11771 11872 +101
===========================================
- Hits 8305 3217 -5088
- Misses 3466 8655 +5189 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking nice and very clean. I really like the docstring for the new transform as well. I left some comments around naming and descriptions but mostly nits. I think the get_resize_dimensions should be added to the docs, but I'm fine with leaving the inference transform out of the docs for now as we figure out our story there.
logger = logging.getLogger(__name__) | ||
|
||
|
||
def get_resize_dimensions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe get_inscribed_size
would be more descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good - updated
>>> max_upscaling_size = None | ||
>>> image_size = (300, 800) | ||
>>> target_size = (448, 1344) | ||
>>> output = get_resize_dimensions(image_size, target_size, resample, max_upscaling_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These examples have 4 inputs instead of 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch - updated
Args: | ||
image_size (Tuple[int]): The size of the image, in the form [height, width]. | ||
target_size (Tuple[int]): The desired resolution to fit the image into, in the format [height, width]. | ||
max_upscaling_size (Optional[int]): The maximum size to upscale the image to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just max_size
. And in the description mention it's the max for the largest dim.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good - I'll change this in the other transforms as well so it's consistent.
image_size: Tuple[int], target_size: Tuple[int], max_upscaling_size: Optional[int] | ||
) -> Tuple[int]: | ||
""" | ||
Calculates the resize dimensions for an image, given the target size and max_upscaling_size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be fleshed out a bit more. The key thing this function does is computes what size an image would be if it was resized to be inscribed within a target size. So it's upscaled or downscaled such that one size equals the target size and the second side is less than or equal to the target size. I think some description describing this would help make the function easier to discover and use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, updated
logger = logging.getLogger(__name__) | ||
|
||
|
||
def get_resize_dimensions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be added to the docs RST
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
>>> target_size = (448, 1344) | ||
>>> output = get_resize_dimensions(image_size, target_size, resample, max_upscaling_size) | ||
|
||
Example 2: The image will stay as is, since 800 > 600. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felipemello1 I find it counterintuitive that this doesn't cause a downsize to (225, 600). max_upscaling_size is a bit weird to have here at all because you could just not provide a target that's too big, but I understand it's convenient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking nice and very clean. I really like the docstring for the new transform as well. I left some comments around naming and descriptions but mostly nits. I think the get_resize_dimensions should be added to the docs, but I'm fine with leaving the inference transform out of the docs for now as we figure out our story there.
51db024
to
47545de
Compare
Summary: Pull Request resolved: pytorch#4548 For preprocess, we have two custom ops, pad and reshape. These are used to hide the implementation and any data dependent issues from export, allowing us to export the model. The model definition is in D60051533, though we're trying to upstream it to torchtune: pytorch/torchtune#1269 This diff adds: - Python impl + tests - C++ function signatures + buck setup for reshape using [EXECUTORCH_LIBRARY](https://pytorch.org/executorch/stable/kernel-library-custom-aten-kernel.html#custom-ops-c-api) Differential Revision: D60491675
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hard work! I'll approve this now, but the test_get_resize_dimensions test should be renamed before you land it.
# This source code is licensed under the BSD-style license found in the | ||
# LICENSE file in the root directory of this source tree. | ||
|
||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test filename needs to be updated
a388b27
to
b694b22
Compare
b694b22
to
c57cf35
Compare
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
Changelog
get_resize_dimensions
, add tests.Note: we cannot add a test for the exported model, as source transforms are required on the nn.Modules to successfully export it.
Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.)
pre-commit install
)pytest tests
pytest tests -m integration_test