Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export ClipImageTransform #1269

Merged
merged 1 commit into from
Aug 9, 2024
Merged

Conversation

lucylq
Copy link
Contributor

@lucylq lucylq commented Aug 6, 2024

Context

What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • other (please add here)

Please link to any issues this PR addresses.

Changelog

  • Add torchtune/models/clip/inference/_transforms.py, a version of CLIPImageTransform that can be exported. Update transforms tests with a comparison between the two CLIPImageTransforms.
  • Refactor target_size calculations into function get_resize_dimensions, add tests.
  • Change tile_crop from a function to an nn.Module for module swapping, update callsites.

Note: we cannot add a test for the exported model, as source transforms are required on the nn.Modules to successfully export it.

Test plan

Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.)

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

Copy link

pytorch-bot bot commented Aug 6, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1269

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c57cf35 with merge base f1db074 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 6, 2024
@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch 2 times, most recently from 752535c to 29725b9 Compare August 6, 2024 16:14
@lucylq lucylq marked this pull request as ready for review August 6, 2024 16:16
@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch 4 times, most recently from b59fbe0 to 00c9681 Compare August 6, 2024 16:29
@lucylq lucylq mentioned this pull request Aug 6, 2024
9 tasks
Copy link
Contributor

@felipemello1 felipemello1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I will leave the final word to @pbontrager. Maybe it would be nice to add a small docstring at the top of "torchtune/models/clip/inference/_transforms.py" explaining that this is for executorch only. But I am fine with leaving it as it is.

@@ -11,7 +11,10 @@
import torch
from tests.test_utils import assert_expected

from torchtune.models.clip._transforms import CLIPImageTransform
from torchtune.models.clip._transforms import CLIPImageTransform as CLIPImageTransform
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the last part "as CLIPImageTransform"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. Updated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also updated - @tarun292 found we can swap out functionals as long as we 'name' them inside init(), so leaving torch.nn.functional.pad, and tile_crop as is.

@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch 5 times, most recently from 6cdc95a to 51db024 Compare August 6, 2024 21:56
@lucylq
Copy link
Contributor Author

lucylq commented Aug 6, 2024

LGTM, I will leave the final word to @pbontrager. Maybe it would be nice to add a small docstring at the top of "torchtune/models/clip/inference/_transforms.py" explaining that this is for executorch only. But I am fine with leaving it as it is.

Thanks for reviewing! Added a note that being able to export is useful for ExecuTorch/AOTI.

@codecov-commenter
Copy link

codecov-commenter commented Aug 6, 2024

Codecov Report

Attention: Patch coverage is 25.71429% with 78 lines in your changes missing coverage. Please review.

Project coverage is 27.09%. Comparing base (18962f3) to head (c57cf35).
Report is 2 commits behind head on main.

Files Patch % Lines
torchtune/models/clip/inference/_transforms.py 24.59% 46 Missing ⚠️
...ules/transforms/vision_utils/get_inscribed_size.py 29.41% 12 Missing ⚠️
...torchtune/models/clip/test_clip_image_transform.py 12.50% 7 Missing ⚠️
...tune/modules/transforms/test_get_inscribed_size.py 45.45% 6 Missing ⚠️
torchtune/models/clip/_transforms.py 0.00% 3 Missing ⚠️
...modules/transforms/vision_utils/resize_with_pad.py 0.00% 3 Missing ⚠️
...rchtune/modules/transforms/test_resize_with_pad.py 0.00% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (18962f3) and HEAD (c57cf35). Click for more details.

HEAD has 2 uploads less than BASE
Flag BASE (18962f3) HEAD (c57cf35)
4 2
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1269       +/-   ##
===========================================
- Coverage   70.55%   27.09%   -43.46%     
===========================================
  Files         255      258        +3     
  Lines       11771    11872      +101     
===========================================
- Hits         8305     3217     -5088     
- Misses       3466     8655     +5189     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@pbontrager pbontrager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking nice and very clean. I really like the docstring for the new transform as well. I left some comments around naming and descriptions but mostly nits. I think the get_resize_dimensions should be added to the docs, but I'm fine with leaving the inference transform out of the docs for now as we figure out our story there.

logger = logging.getLogger(__name__)


def get_resize_dimensions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe get_inscribed_size would be more descriptive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good - updated

>>> max_upscaling_size = None
>>> image_size = (300, 800)
>>> target_size = (448, 1344)
>>> output = get_resize_dimensions(image_size, target_size, resample, max_upscaling_size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples have 4 inputs instead of 3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch - updated

Args:
image_size (Tuple[int]): The size of the image, in the form [height, width].
target_size (Tuple[int]): The desired resolution to fit the image into, in the format [height, width].
max_upscaling_size (Optional[int]): The maximum size to upscale the image to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just max_size. And in the description mention it's the max for the largest dim.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - I'll change this in the other transforms as well so it's consistent.

image_size: Tuple[int], target_size: Tuple[int], max_upscaling_size: Optional[int]
) -> Tuple[int]:
"""
Calculates the resize dimensions for an image, given the target size and max_upscaling_size.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be fleshed out a bit more. The key thing this function does is computes what size an image would be if it was resized to be inscribed within a target size. So it's upscaled or downscaled such that one size equals the target size and the second side is less than or equal to the target size. I think some description describing this would help make the function easier to discover and use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, updated

logger = logging.getLogger(__name__)


def get_resize_dimensions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be added to the docs RST

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

>>> target_size = (448, 1344)
>>> output = get_resize_dimensions(image_size, target_size, resample, max_upscaling_size)

Example 2: The image will stay as is, since 800 > 600.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipemello1 I find it counterintuitive that this doesn't cause a downsize to (225, 600). max_upscaling_size is a bit weird to have here at all because you could just not provide a target that's too big, but I understand it's convenient.

Copy link
Contributor

@pbontrager pbontrager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking nice and very clean. I really like the docstring for the new transform as well. I left some comments around naming and descriptions but mostly nits. I think the get_resize_dimensions should be added to the docs, but I'm fine with leaving the inference transform out of the docs for now as we figure out our story there.

@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch from 51db024 to 47545de Compare August 7, 2024 20:45
lucylq added a commit to lucylq/executorch-1 that referenced this pull request Aug 8, 2024
Summary:
Pull Request resolved: pytorch#4548

For preprocess, we have two custom ops, pad and reshape.
These are used to hide the implementation and any data dependent issues from export, allowing us to export the model.

The model definition is in D60051533, though we're trying to upstream it to torchtune: pytorch/torchtune#1269

This diff adds:
- Python impl + tests
- C++ function signatures + buck setup for reshape using [EXECUTORCH_LIBRARY](https://pytorch.org/executorch/stable/kernel-library-custom-aten-kernel.html#custom-ops-c-api)

Differential Revision: D60491675
Copy link
Contributor

@pbontrager pbontrager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hard work! I'll approve this now, but the test_get_resize_dimensions test should be renamed before you land it.

# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

import pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test filename needs to be updated

@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch 6 times, most recently from a388b27 to b694b22 Compare August 9, 2024 17:16
@lucylq lucylq force-pushed the lfq.export-preprocess-2 branch from b694b22 to c57cf35 Compare August 9, 2024 17:17
@felipemello1 felipemello1 merged commit c3851fd into pytorch:main Aug 9, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants