Commit 601565b

Merge branch 'main' into center-expand

pmeier committed Aug 31, 2023
2 parents eb451e3 + b828671
Showing 19 changed files with 521 additions and 460 deletions.
45 changes: 28 additions & 17 deletions docs/source/transforms.rst
@@ -33,20 +33,33 @@ tasks (image classification, detection, segmentation, video classification).
     from torchvision import tv_tensors

     img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)
-    bboxes = torch.randint(0, H // 2, size=(3, 4))
-    bboxes[:, 2:] += bboxes[:, :2]
-    bboxes = tv_tensors.BoundingBoxes(bboxes, format="XYXY", canvas_size=(H, W))
+    boxes = torch.randint(0, H // 2, size=(3, 4))
+    boxes[:, 2:] += boxes[:, :2]
+    boxes = tv_tensors.BoundingBoxes(boxes, format="XYXY", canvas_size=(H, W))

     # The same transforms can be used!
-    img, bboxes = transforms(img, bboxes)
+    img, boxes = transforms(img, boxes)
     # And you can pass arbitrary input structures
-    output_dict = transforms({"image": img, "bboxes": bboxes})
+    output_dict = transforms({"image": img, "boxes": boxes})

 Transforms are typically passed as the ``transform`` or ``transforms`` argument
 to the :ref:`Datasets <datasets>`.

-.. TODO: Reader guide, i.e. what to read depending on what you're looking for
-.. TODO: add link to getting started guide here.
+Start here
+----------
+
+Whether you're new to Torchvision transforms or already experienced with them,
+we encourage you to start with
+:ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py` to
+learn more about what can be done with the new v2 transforms.
+
+Then, browse the sections below on this page for general information and
+performance tips. The available transforms and functionals are listed in the
+:ref:`API reference <v2_api_ref>`.
+
+More information and tutorials can also be found in our :ref:`example gallery
+<gallery>`, e.g. :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`
+or :ref:`sphx_glr_auto_examples_transforms_plot_custom_transforms.py`.

 .. _conventions:
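
For context (an illustration, not part of the diff): a minimal, self-contained
sketch of the usage the snippet above documents. The pipeline, the ``H``/``W``
values, and the transform choices are assumptions made for this example.

    import torch
    from torchvision import tv_tensors
    from torchvision.transforms import v2

    H, W = 256, 256  # arbitrary example size
    img = torch.randint(0, 256, size=(3, H, W), dtype=torch.uint8)
    boxes = torch.randint(0, H // 2, size=(3, 4))
    boxes[:, 2:] += boxes[:, :2]  # turn (x1, y1, w, h) offsets into XYXY corners
    boxes = tv_tensors.BoundingBoxes(boxes, format="XYXY", canvas_size=(H, W))

    transforms = v2.Compose([
        v2.RandomResizedCrop(size=(224, 224), antialias=True),
        v2.RandomHorizontalFlip(p=0.5),
    ])

    # The image and the boxes are transformed consistently...
    img, boxes = transforms(img, boxes)
    # ...and arbitrary input structures (dicts, lists, tuples) work too.
    output_dict = transforms({"image": img, "boxes": boxes})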

@@ -98,25 +111,21 @@ advantages compared to the v1 ones (in ``torchvision.transforms``):

 - They can transform images **but also** bounding boxes, masks, or videos. This
   provides support for tasks beyond image classification: detection, segmentation,
-  video classification, etc.
+  video classification, etc. See
+  :ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py`
+  and :ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`.
 - They support more transforms like :class:`~torchvision.transforms.v2.CutMix`
-  and :class:`~torchvision.transforms.v2.MixUp`.
+  and :class:`~torchvision.transforms.v2.MixUp`. See
+  :ref:`sphx_glr_auto_examples_transforms_plot_cutmix_mixup.py`.
 - They're :ref:`faster <transforms_perf>`.
 - They support arbitrary input structures (dicts, lists, tuples, etc.).
 - Future improvements and features will be added to the v2 transforms only.
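
The CutMix/MixUp bullet above can be made concrete with a short sketch (the
batch shape and ``NUM_CLASSES`` are assumptions; both transforms expect batched
images together with integer class labels):

    import torch
    from torchvision.transforms import v2

    NUM_CLASSES = 10  # assumption for illustration
    cutmix_or_mixup = v2.RandomChoice([
        v2.CutMix(num_classes=NUM_CLASSES),
        v2.MixUp(num_classes=NUM_CLASSES),
    ])

    images = torch.rand(4, 3, 224, 224)           # a batch of images
    labels = torch.randint(0, NUM_CLASSES, (4,))  # integer class labels
    images, labels = cutmix_or_mixup(images, labels)
    # labels are now soft targets of shape (4, NUM_CLASSES)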

-.. TODO: Add link to e2e example for first bullet point.
 These transforms are **fully backward compatible** with the v1 ones, so if
 you're already using transforms from ``torchvision.transforms``, all you need
 to do is update the import to ``torchvision.transforms.v2``. In terms of
 output, there might be negligible differences due to implementation differences.
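
Concretely, the migration is just the import swap; a sketch, where the pipeline
itself is an illustrative assumption:

    # before: from torchvision import transforms
    from torchvision.transforms import v2 as transforms  # drop-in replacement

    pipeline = transforms.Compose([
        transforms.RandomResizedCrop(224, antialias=True),
        transforms.RandomHorizontalFlip(),
    ])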

-To learn more about the v2 transforms, check out
-:ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py`.
-
-.. TODO: make sure link is still good!!
 .. note::

     The v2 transforms are still BETA, but at this point we do not expect
@@ -184,7 +193,7 @@ This is very much like the :mod:`torch.nn` package which defines both classes
 and functional equivalents in :mod:`torch.nn.functional`.

 The functionals support PIL images, pure tensors, or :ref:`TVTensors
-<tv_tensors>`, e.g. both ``resize(image_tensor)`` and ``resize(bboxes)`` are
+<tv_tensors>`, e.g. both ``resize(image_tensor)`` and ``resize(boxes)`` are
 valid.

 .. note::
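
To make the functional dispatch concrete, a sketch (the sizes and box values
are assumptions):

    import torch
    from torchvision import tv_tensors
    from torchvision.transforms.v2 import functional as F

    img = torch.randint(0, 256, (3, 64, 64), dtype=torch.uint8)
    boxes = tv_tensors.BoundingBoxes(
        [[0, 0, 10, 10]], format="XYXY", canvas_size=(64, 64)
    )

    out_img = F.resize(img, size=[32, 32], antialias=True)
    out_boxes = F.resize(boxes, size=[32, 32])  # box coordinates are rescaled too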
@@ -248,6 +257,8 @@ be derived from ``torch.nn.Module``.

 See also: :ref:`sphx_glr_auto_examples_others_plot_scripted_tensor_transforms.py`.

+.. _v2_api_ref:
+
 V2 API reference - Recommended
 ------------------------------

10 changes: 7 additions & 3 deletions docs/source/tv_tensors.rst
@@ -7,9 +7,13 @@ TVTensors

 TVTensors are :class:`torch.Tensor` subclasses which the v2 :ref:`transforms
 <transforms>` use under the hood to dispatch their inputs to the appropriate
-lower-level kernels. Most users do not need to manipulate TVTensors directly and
-can simply rely on dataset wrapping - see e.g.
-:ref:`sphx_glr_auto_examples_transforms_plot_transforms_e2e.py`.
+lower-level kernels. Most users do not need to manipulate TVTensors directly.
+
+Refer to
+:ref:`sphx_glr_auto_examples_transforms_plot_transforms_getting_started.py` for
+an introduction to TVTensors, or
+:ref:`sphx_glr_auto_examples_transforms_plot_tv_tensors.py` for more advanced
+info.
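
For a feel of what TVTensors look like in practice, a sketch (the shapes are
assumptions):

    import torch
    from torchvision import tv_tensors

    img = tv_tensors.Image(torch.rand(3, 32, 32))
    mask = tv_tensors.Mask(torch.zeros(32, 32, dtype=torch.uint8))
    boxes = tv_tensors.BoundingBoxes(
        [[0, 0, 16, 16]], format="XYXY", canvas_size=(32, 32)
    )

    # TVTensors behave like plain tensors, plus metadata used for dispatch:
    print(isinstance(img, torch.Tensor))  # True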

 .. autosummary::
     :toctree: generated/
2 changes: 2 additions & 0 deletions gallery/README.rst
@@ -1,2 +1,4 @@
+.. _gallery:
+
 Examples and tutorials
 ======================
13 changes: 13 additions & 0 deletions gallery/transforms/plot_transforms_e2e.py
@@ -166,3 +166,16 @@
 print(f"{[type(target) for target in targets] = }")
 for name, loss_val in loss_dict.items():
     print(f"{name:<20}{loss_val:.3f}")
+
+# %%
+# Training References
+# -------------------
+#
+# From there, you can check out the `torchvision references
+# <https://github.com/pytorch/vision/tree/main/references>`_ where you'll find
+# the actual training scripts we use to train our models.
+#
+# **Disclaimer** The code in our references is more complex than what you'll
+# need for your own use-cases: this is because we're supporting different
+# backends (PIL, tensors, TVTensors) and different transforms namespaces (v1 and
+# v2). So don't be afraid to simplify and only keep what you need.
2 changes: 2 additions & 0 deletions gallery/transforms/plot_transforms_getting_started.py
@@ -217,6 +217,8 @@
 # can still be transformed by some transforms like
 # :class:`~torchvision.transforms.v2.SanitizeBoundingBoxes`!).
 #
+# .. _transforms_datasets_intercompatibility:
+#
 # Transforms and Datasets intercompatibility
 # ------------------------------------------
 #
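
Since :class:`~torchvision.transforms.v2.SanitizeBoundingBoxes` comes up in the
context above, a sketch of what it does (the box values and the ``labels`` key
are assumptions; by default the transform looks for a "labels" key in dict
inputs):

    import torch
    from torchvision import tv_tensors
    from torchvision.transforms import v2

    boxes = tv_tensors.BoundingBoxes(
        [[0, 0, 20, 20], [5, 5, 5, 5]],  # the second box is degenerate (zero area)
        format="XYXY",
        canvas_size=(64, 64),
    )
    sample = {
        "image": torch.randint(0, 256, (3, 64, 64), dtype=torch.uint8),
        "boxes": boxes,
        "labels": torch.tensor([1, 2]),
    }
    out = v2.SanitizeBoundingBoxes()(sample)
    # The degenerate box and its corresponding label are removed together.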
3 changes: 3 additions & 0 deletions setup.py
@@ -223,6 +223,9 @@ def get_extensions():
             extra_compile_args["nvcc"] = [f for f in nvcc_flags if not ("-O" in f or "-g" in f)]
             extra_compile_args["nvcc"].append("-O0")
             extra_compile_args["nvcc"].append("-g")
+    else:
+        print("Compiling with debug mode OFF")
+        extra_compile_args["cxx"].append("-g0")

     sources = [os.path.join(extensions_dir, s) for s in sources]
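
A minimal sketch of the debug-flag pattern this hunk completes, assuming a
``DEBUG`` environment variable as torchvision's ``setup.py`` uses (the
surrounding build logic is abridged):

    import os

    debug_mode = os.getenv("DEBUG", "0") == "1"
    extra_compile_args = {"cxx": []}

    if debug_mode:
        print("Compiling in debug mode")
        extra_compile_args["cxx"] += ["-g", "-O0"]
    else:
        print("Compiling with debug mode OFF")
        extra_compile_args["cxx"].append("-g0")  # strip debug info from C++ objects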

154 changes: 4 additions & 150 deletions test/test_transforms_v2.py
@@ -449,99 +449,6 @@ def test__get_params(self, fill, side_range):
         assert 0 <= params["padding"][3] <= (side_range[1] - 1) * h


-class TestRandomCrop:
-    def test_assertions(self):
-        with pytest.raises(ValueError, match="Please provide only two dimensions"):
-            transforms.RandomCrop([10, 12, 14])
-
-        with pytest.raises(TypeError, match="Got inappropriate padding arg"):
-            transforms.RandomCrop([10, 12], padding="abc")
-
-        with pytest.raises(ValueError, match="Padding must be an int or a 1, 2, or 4"):
-            transforms.RandomCrop([10, 12], padding=[-0.7, 0, 0.7])
-
-        with pytest.raises(TypeError, match="Got inappropriate fill arg"):
-            transforms.RandomCrop([10, 12], padding=1, fill="abc")
-
-        with pytest.raises(ValueError, match="Padding mode should be either"):
-            transforms.RandomCrop([10, 12], padding=1, padding_mode="abc")
-
-    @pytest.mark.parametrize("padding", [None, 1, [2, 3], [1, 2, 3, 4]])
-    @pytest.mark.parametrize("size, pad_if_needed", [((10, 10), False), ((50, 25), True)])
-    def test__get_params(self, padding, pad_if_needed, size):
-        h, w = size = (24, 32)
-        image = make_image(size)
-
-        transform = transforms.RandomCrop(size, padding=padding, pad_if_needed=pad_if_needed)
-        params = transform._get_params([image])
-
-        if padding is not None:
-            if isinstance(padding, int):
-                pad_top = pad_bottom = pad_left = pad_right = padding
-            elif isinstance(padding, list) and len(padding) == 2:
-                pad_left = pad_right = padding[0]
-                pad_top = pad_bottom = padding[1]
-            elif isinstance(padding, list) and len(padding) == 4:
-                pad_left, pad_top, pad_right, pad_bottom = padding
-
-            h += pad_top + pad_bottom
-            w += pad_left + pad_right
-        else:
-            pad_left = pad_right = pad_top = pad_bottom = 0
-
-        if pad_if_needed:
-            if w < size[1]:
-                diff = size[1] - w
-                pad_left += diff
-                pad_right += diff
-                w += 2 * diff
-            if h < size[0]:
-                diff = size[0] - h
-                pad_top += diff
-                pad_bottom += diff
-                h += 2 * diff
-
-        padding = [pad_left, pad_top, pad_right, pad_bottom]
-
-        assert 0 <= params["top"] <= h - size[0] + 1
-        assert 0 <= params["left"] <= w - size[1] + 1
-        assert params["height"] == size[0]
-        assert params["width"] == size[1]
-        assert params["needs_pad"] is any(padding)
-        assert params["padding"] == padding
-
-
-class TestGaussianBlur:
-    def test_assertions(self):
-        with pytest.raises(ValueError, match="Kernel size should be a tuple/list of two integers"):
-            transforms.GaussianBlur([10, 12, 14])
-
-        with pytest.raises(ValueError, match="Kernel size value should be an odd and positive number"):
-            transforms.GaussianBlur(4)
-
-        with pytest.raises(
-            TypeError, match="sigma should be a single int or float or a list/tuple with length 2 floats."
-        ):
-            transforms.GaussianBlur(3, sigma=[1, 2, 3])
-
-        with pytest.raises(ValueError, match="If sigma is a single number, it must be positive"):
-            transforms.GaussianBlur(3, sigma=-1.0)
-
-        with pytest.raises(ValueError, match="sigma values should be positive and of the form"):
-            transforms.GaussianBlur(3, sigma=[2.0, 1.0])
-
-    @pytest.mark.parametrize("sigma", [10.0, [10.0, 12.0]])
-    def test__get_params(self, sigma):
-        transform = transforms.GaussianBlur(3, sigma=sigma)
-        params = transform._get_params([])
-
-        if isinstance(sigma, float):
-            assert params["sigma"][0] == params["sigma"][1] == 10
-        else:
-            assert sigma[0] <= params["sigma"][0] <= sigma[1]
-            assert sigma[0] <= params["sigma"][1] <= sigma[1]
-
-
 class TestRandomPerspective:
     def test_assertions(self):
         with pytest.raises(ValueError, match="Argument distortion_scale value should be between 0 and 1"):
@@ -565,24 +472,18 @@ def test__get_params(self):
 class TestElasticTransform:
     def test_assertions(self):

-        with pytest.raises(TypeError, match="alpha should be float or a sequence of floats"):
+        with pytest.raises(TypeError, match="alpha should be a number or a sequence of numbers"):
             transforms.ElasticTransform({})

-        with pytest.raises(ValueError, match="alpha is a sequence its length should be one of 2"):
+        with pytest.raises(ValueError, match="alpha is a sequence its length should be 1 or 2"):
             transforms.ElasticTransform([1.0, 2.0, 3.0])

-        with pytest.raises(ValueError, match="alpha should be a sequence of floats"):
-            transforms.ElasticTransform([1, 2])
-
-        with pytest.raises(TypeError, match="sigma should be float or a sequence of floats"):
+        with pytest.raises(TypeError, match="sigma should be a number or a sequence of numbers"):
             transforms.ElasticTransform(1.0, {})

-        with pytest.raises(ValueError, match="sigma is a sequence its length should be one of 2"):
+        with pytest.raises(ValueError, match="sigma is a sequence its length should be 1 or 2"):
             transforms.ElasticTransform(1.0, [1.0, 2.0, 3.0])

-        with pytest.raises(ValueError, match="sigma should be a sequence of floats"):
-            transforms.ElasticTransform(1.0, [1, 2])
-
         with pytest.raises(TypeError, match="Got inappropriate fill arg"):
             transforms.ElasticTransform(1.0, 2.0, fill="abc")

@@ -602,53 +503,6 @@ def test__get_params(self):
         assert (-alpha / h <= displacement[0, ..., 1]).all() and (displacement[0, ..., 1] <= alpha / h).all()


-class TestRandomErasing:
-    def test_assertions(self):
-        with pytest.raises(TypeError, match="Argument value should be either a number or str or a sequence"):
-            transforms.RandomErasing(value={})
-
-        with pytest.raises(ValueError, match="If value is str, it should be 'random'"):
-            transforms.RandomErasing(value="abc")
-
-        with pytest.raises(TypeError, match="Scale should be a sequence"):
-            transforms.RandomErasing(scale=123)
-
-        with pytest.raises(TypeError, match="Ratio should be a sequence"):
-            transforms.RandomErasing(ratio=123)
-
-        with pytest.raises(ValueError, match="Scale should be between 0 and 1"):
-            transforms.RandomErasing(scale=[-1, 2])
-
-        image = make_image((24, 32))
-
-        transform = transforms.RandomErasing(value=[1, 2, 3, 4])
-
-        with pytest.raises(ValueError, match="If value is a sequence, it should have either a single value"):
-            transform._get_params([image])
-
-    @pytest.mark.parametrize("value", [5.0, [1, 2, 3], "random"])
-    def test__get_params(self, value):
-        image = make_image((24, 32))
-        num_channels, height, width = F.get_dimensions(image)
-
-        transform = transforms.RandomErasing(value=value)
-        params = transform._get_params([image])
-
-        v = params["v"]
-        h, w = params["h"], params["w"]
-        i, j = params["i"], params["j"]
-        assert isinstance(v, torch.Tensor)
-        if value == "random":
-            assert v.shape == (num_channels, h, w)
-        elif isinstance(value, (int, float)):
-            assert v.shape == (1, 1, 1)
-        elif isinstance(value, (list, tuple)):
-            assert v.shape == (num_channels, 1, 1)
-
-        assert 0 <= i <= height - h
-        assert 0 <= j <= width - w
-
-
 class TestTransform:
     @pytest.mark.parametrize(
         "inpt_type",
(Diffs for the remaining changed files are not shown.)
