Add data augmentation in LeRobotDataset #234

marinabar · 2024-05-31T14:31:09Z

What this does

Implements data augmentation for the images of a LeRobotDataset object.

Adds a custom RandomSubsetApply transform to apply a random subset of N transformations from a list of transformations.
Adds a custom SharpnessJitter transform to randomly change the sharpness of an image or video.
Adds an example 6_add_image_transforms.py showing how to use the transforms parameter of LeRobotDataset.
Adds a visualize_image_transforms.py script that saves examples of the transformed images produced by a given config.
Adds tests and artifacts.
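
The idea behind RandomSubsetApply can be sketched in plain Python. This is only an illustration under assumptions, not the code in transforms.py: the class name RandomSubsetApplySketch and the arguments weights, n_subset, random_order and seed are hypothetical, and ordinary callables stand in for torchvision transforms.

```python
import random
from typing import Any, Callable, List, Optional, Sequence


class RandomSubsetApplySketch:
    """Apply n_subset callables drawn without replacement from a list (hypothetical sketch)."""

    def __init__(
        self,
        transforms: Sequence[Callable[[Any], Any]],
        weights: Optional[Sequence[float]] = None,
        n_subset: Optional[int] = None,
        random_order: bool = False,
        seed: Optional[int] = None,
    ) -> None:
        self.transforms = list(transforms)
        self.weights = [1.0] * len(self.transforms) if weights is None else list(weights)
        self.n_subset = len(self.transforms) if n_subset is None else n_subset
        self.random_order = random_order
        self.rng = random.Random(seed)

        if len(self.weights) != len(self.transforms):
            raise ValueError("weights and transforms must have the same length")
        if any(w < 0 for w in self.weights):
            raise ValueError("weights must be non-negative")
        if not 1 <= self.n_subset <= len(self.transforms):
            raise ValueError("n_subset must be between 1 and len(transforms)")
        if sum(1 for w in self.weights if w > 0) < self.n_subset:
            raise ValueError("not enough transforms with a positive weight")

    def _sample_indices(self) -> List[int]:
        # Weighted sampling without replacement: draw one index at a time
        # and remove it from the pool before the next draw.
        remaining = list(range(len(self.transforms)))
        chosen = []
        for _ in range(self.n_subset):
            pool_weights = [self.weights[i] for i in remaining]
            pick = self.rng.choices(remaining, weights=pool_weights, k=1)[0]
            remaining.remove(pick)
            chosen.append(pick)
        # Keep the sampling order if requested, else the original list order.
        return chosen if self.random_order else sorted(chosen)

    def __call__(self, x: Any) -> Any:
        for i in self._sample_indices():
            x = self.transforms[i](x)
        return x


# Toy "transforms" acting on a number instead of an image.
steps = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
apply_all = RandomSubsetApplySketch(steps, n_subset=3)
only_double = RandomSubsetApplySketch(steps, weights=[0.0, 1.0, 0.0], n_subset=1)
```

With random_order left off, the selected transforms run in their original list order, so only the choice of subset is random.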

Default transforms are:

contrast
brightness
hue
saturation
sharpness

The parameters are taken from the default.yaml configuration and the transforms are defined in transforms.py. They are then applied in the __getitem__() method of LeRobotDataset.
The transformation is applied to the images of all of the given cameras.
(WIP: support for multi-image observations in delta_timestamps)
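
A minimal sketch of what applying transforms in __getitem__() to every camera could look like. The class TinyImageDataset, the camera_keys argument and the key names below are assumptions made for illustration; the real LeRobotDataset loads and stores its frames differently.

```python
from typing import Any, Callable, Dict, List, Optional

Image = List[List[float]]  # stand-in for an image tensor


class TinyImageDataset:
    """Toy dataset that applies one transform to every camera image (hypothetical)."""

    def __init__(
        self,
        items: List[Dict[str, Any]],
        camera_keys: List[str],
        image_transforms: Optional[Callable[[Image], Image]] = None,
    ) -> None:
        self.items = items
        self.camera_keys = camera_keys
        self.image_transforms = image_transforms

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, idx: int) -> Dict[str, Any]:
        # Shallow copy so the stored item is not modified.
        item = dict(self.items[idx])
        if self.image_transforms is not None:
            # Only camera entries are transformed; other keys pass through.
            for key in self.camera_keys:
                item[key] = self.image_transforms(item[key])
        return item


def darken(image: Image) -> Image:
    """Halve every pixel value (toy brightness change)."""
    return [[px * 0.5 for px in row] for row in image]


items = [
    {
        "observation.images.top": [[0.2, 0.8]],
        "observation.images.wrist": [[1.0, 0.4]],
        "observation.state": [0.1, 0.2],
    }
]
dataset = TinyImageDataset(
    items,
    camera_keys=["observation.images.top", "observation.images.wrist"],
    image_transforms=darken,
)
sample = dataset[0]
```

Passing image_transforms=None leaves the items unchanged, which is the behavior one would want at evaluation time.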

How you can verify it

To test various types of transforms, you can use the newly added script:

python lerobot/scripts/visualize_image_transforms.py

This applies the transforms from the configuration file and then saves multiple images corresponding to each applied transform.

You can also run the example script:

python examples/6_add_image_transforms.py

How it was tested

We trained a model on data collected on a grasping task from Reachy2, with data augmentation enabled. On evaluation, the policy appeared more robust to lighting changes. (The evaluation was run in the dark with multiple light sources.)

Examples of single transformations:

`transformation`	`min_max`
original frame	`None`
`brightness`	`(0.5, 0.5)`
`brightness`	`(2.0, 2.0)`
`contrast`	`(0.5, 0.5)`
`contrast`	`(2.0, 2.0)`
`saturation`	`(0.5, 0.5)`
`saturation`	`(2.0, 2.0)`
`hue`	`(-0.25, 0.25)`
`hue`	`(0.25, 0.25)`
`sharpness`	`(0.5, 0.5)`
`sharpness`	`(2.0, 2.0)`

Cadene · 2024-06-01T15:13:53Z

@marinabar Wonderful PR :)
Could you illustrate the augmentations by adding some screenshots to the PR description? Thanks!

lerobot/common/datasets/factory.py

Cadene · 2024-06-01T17:49:59Z

@marinabar Ideally we should show an image with the biggest augmentation for each transform (to better understand what each transform is doing).
Then we should show the worst case, where the biggest augmentations of all transforms are applied sequentially to an image (this matters because your implementation can apply transforms sequentially).

Also, we should display the frames in the original scale. What you displayed is a bit too small and compressed to get a good feeling.

Ideally we could add a script in lerobot/scripts/show_image_transforms.py that could save these images in outputs/show_image_transforms. That's something we would need each time we add new transforms, so super useful!!!

Finally, we could add backward compatibility tests where we would save the result of specific frames, augmented with each transform. This is to ensure that if torchvision changes something, we are aware of it. Here are two pointers for inspiration:

Data augmentation is really something we should be careful about since it can fail silently.

What do you think?

cc @aliberts @alexander-soare
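
The backward compatibility idea above can be illustrated with a small standard-library sketch: store the output of a transform on a fixed frame once, then compare later runs against the stored artifact. Everything here is an assumption for illustration (the JSON format, the helper names save_artifact and matches_artifact, and the toy adjust_brightness function); the tests added in this PR may be organized differently.

```python
import json
import tempfile
from pathlib import Path
from typing import List

Frame = List[List[float]]  # stand-in for an image tensor


def adjust_brightness(frame: Frame, factor: float) -> Frame:
    """Scale every pixel by factor and clamp to [0, 1] (toy transform)."""
    return [[min(1.0, max(0.0, px * factor)) for px in row] for row in frame]


def save_artifact(path: Path, frame: Frame) -> None:
    """Write the reference output to disk."""
    path.write_text(json.dumps(frame))


def matches_artifact(path: Path, frame: Frame, tol: float = 1e-6) -> bool:
    """Return True if frame equals the stored reference within tol."""
    expected = json.loads(path.read_text())
    if len(expected) != len(frame):
        return False
    for row_expected, row_actual in zip(expected, frame):
        if len(row_expected) != len(row_actual):
            return False
        if any(abs(a - b) > tol for a, b in zip(row_expected, row_actual)):
            return False
    return True


original = [[0.1, 0.4], [0.6, 0.9]]
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "brightness_2.0.json"
    save_artifact(artifact, adjust_brightness(original, 2.0))
    # Same transform, same parameters: should match the artifact.
    unchanged = matches_artifact(artifact, adjust_brightness(original, 2.0))
    # A behavior change (simulated with a different factor) should be detected.
    drifted = matches_artifact(artifact, adjust_brightness(original, 1.9))
```

A silent change in the underlying library would show up as a mismatch against the stored artifact instead of going unnoticed during training.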

Cadene

The API and tests look great! Thanks Marina and Simon :)
One more round of minor changes and it should be ready to merge.
Could you please ping @alexander-soare when it's done? Thanks!

Cadene · 2024-06-10T15:34:36Z

lerobot/common/datasets/transforms.py

+class SharpnessJitter(Transform):
+    """Randomly change the sharpness of an image or video.
+    Similar to a v2.RandomAdjustSharpness with p=1 and a sharpness_factor sampled randomly.
+    A sharpness_factor of 0 gives a blurred image, 1 gives the original image while 2 increases the sharpness
+    by a factor of 2.
+
+    If the input is a :class:`torch.Tensor`,
+    it is expected to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
+
+    Args:
+        sharpness (float or tuple of float (min, max)): How much to jitter sharpness.
+            sharpness_factor is chosen uniformly from [max(0, 1 - sharpness), 1 + sharpness]
+            or the given [min, max]. Should be non negative numbers.
+    """


Could you add a comment to explain why we don't use RandomAdjustSharpness, and copy-paste your change in this thread? Thanks!

Fixed 04adbd7

     """Randomly change the sharpness of an image or video.
     Similar to a v2.RandomAdjustSharpness with p=1 and a sharpness_factor sampled randomly.
+    While v2.RandomAdjustSharpness applies — with a given probability — a fixed sharpness_factor to an image,
+    SharpnessJitter applies a random sharpness_factor each time. This is to have a more diverse set of
+    augmentations as a result.
+
     A sharpness_factor of 0 gives a blurred image, 1 gives the original image while 2 increases the sharpness
     by a factor of 2.
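
To make the sampling rule in the docstring concrete, here is a small sketch of how the sharpness argument could be turned into a range and a factor drawn from it. The helper names sharpness_range and sample_sharpness_factor are hypothetical, and the sketch only covers the sampling step, not the sharpening of the image itself.

```python
import random
from typing import Optional, Tuple, Union

SharpnessArg = Union[float, Tuple[float, float]]


def sharpness_range(sharpness: SharpnessArg) -> Tuple[float, float]:
    """Return the (min, max) interval the factor is drawn from.

    A single number s maps to [max(0, 1 - s), 1 + s]; a (min, max) pair is used as given.
    """
    if isinstance(sharpness, (int, float)):
        if sharpness < 0:
            raise ValueError("sharpness must be non-negative")
        return (max(0.0, 1.0 - sharpness), 1.0 + sharpness)
    low, high = sharpness
    if not 0.0 <= low <= high:
        raise ValueError("expected 0 <= min <= max")
    return (float(low), float(high))


def sample_sharpness_factor(
    sharpness: SharpnessArg, rng: Optional[random.Random] = None
) -> float:
    """Draw a new factor uniformly from the range on every call."""
    rng = rng or random.Random()
    low, high = sharpness_range(sharpness)
    return rng.uniform(low, high)


factor = sample_sharpness_factor((0.5, 2.0), rng=random.Random(0))
```

Setting min equal to max, as in the (0.5, 0.5) and (2.0, 2.0) rows of the PR description, pins the factor to a single value, which is convenient for visual checks.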

Cadene · 2024-06-10T15:36:58Z

lerobot/configs/default.yaml

@@ -57,3 +57,29 @@ wandb:
  disable_artifact: false
  project: lerobot
  notes: ""
+


Let's have them here ;)

Cadene · 2024-06-10T15:38:25Z

lerobot/scripts/visualize_image_transforms.py

+from pathlib import Path
+
+import hydra
+from torchvision.transforms import ToPILImage
+
+from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
+from lerobot.common.datasets.transforms import make_image_transforms
+
+to_pil = ToPILImage()
+
+
+def main(cfg, output_dir=Path("outputs/image_transforms")):


Could you provide a docstring in the header of the script, plus an example command?

Fixed 1fbb0d9

Cadene · 2024-06-10T15:38:51Z

lerobot/scripts/visualize_image_transforms.py

+        img = transform(frame)
+        to_pil(img).save(output_dir / f"{transform_name}.png", quality=100)
+


Could you print the output directory?

Fixed 540fb9c

Cadene · 2024-06-10T15:39:35Z

tests/scripts/save_image_transforms.py

@@ -0,0 +1,69 @@
+from pathlib import Path


Could we rename it to save_image_transforms_to_safetensors.py for consistency?

Fixed 1890637

Cadene · 2024-06-10T15:40:34Z

tests/scripts/save_image_transforms.py

+        kwargs = {
+            f"{transform}_weight": 1.0,
+            f"{transform}_min_max": (0.5, 0.5),
+        }
+        tf = get_image_transforms(**kwargs)
+        frames[transform] = tf(original_frame)


Should we save the two extreme values + mean value?

Done 9a3739d
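
One way to read the suggestion above is to pin each transform to three fixed values: the minimum, the mean and the maximum of its range. The sketch below builds the corresponding keyword arguments using the `{transform}_weight` and `{transform}_min_max` naming from the snippet under review. The helper name extreme_and_mean_kwargs is hypothetical, and whether the referenced commit does it this way is not shown in this thread.

```python
from typing import Dict, List, Tuple, Union

Kwargs = Dict[str, Union[float, Tuple[float, float]]]


def extreme_and_mean_kwargs(transform: str, min_max: Tuple[float, float]) -> List[Kwargs]:
    """Build three kwargs dicts pinning the transform to min, mean and max."""
    low, high = min_max
    mean = (low + high) / 2
    return [
        {
            f"{transform}_weight": 1.0,
            # min == max means the sampled value is fixed.
            f"{transform}_min_max": (value, value),
        }
        for value in (low, mean, high)
    ]


brightness_cases = extreme_and_mean_kwargs("brightness", (0.5, 2.0))
```

Each dict could then be passed as keyword arguments to the transform factory, producing one saved frame per case.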

Cadene · 2024-06-10T15:44:13Z

tests/test_transforms.py

@@ -0,0 +1,245 @@
+from pathlib import Path


Could you rename to test_image_transforms.py?

Fixed 1890637

alexander-soare · 2024-06-10T16:45:00Z

lerobot/configs/default.yaml

@@ -57,3 +57,29 @@ wandb:
  disable_artifact: false
  project: lerobot
  notes: ""
+


Can we at least nest them under the training key but in this file? To make clear that they are not relevant to eval (for instance).

marina.barannikov@huggingface.co and others added 4 commits May 31, 2024 14:16

Implemented data augmentation with LeRobot class

65e46a4

Merge remote-tracking branch 'origin/main' into 2024_05_30_add_data_a…

20a3715

…ugmentation

Added data augmentation feature to MultiLeRobotDataset

c4870e5

Updated implementation on MultiLeRobotDataset

212a5ab

aliberts assigned marinabar and aliberts Jun 1, 2024

aliberts added ✨ Enhancement New feature or request 🗃️ Dataset Something dataset-related labels Jun 1, 2024

aliberts changed the title ~~Implemented data augmentation with LeRobot class~~ Add data augmentation in LeRobotDataset Jun 1, 2024

Cadene reviewed Jun 1, 2024

lerobot/common/datasets/factory.py (outdated, resolved)

marinabar and others added 16 commits June 3, 2024 14:18

Added clarification comments

9f8415f

Add RandomSubsetApply

602ea98

Updated default.yaml

cc4b3bd

Updated default.yaml

1429117

Merge branch 'huggingface:main' into 2024_05_30_add_data_augmentation

7be2c35

Updated config to match transforms

66629a9

Updated transforms arguments

42f9cc9

Added visualisations for image augmentation

5eea254

Merge remote-tracking branch 'refs/remotes/origin/2024_05_30_add_data…

31e3c82

…_augmentation' into 2024_05_30_add_data_augmentation

Updated formatting

22bd1f0

refactor show_image_transforms

443b06b

Redesign config

fdf56e7

Added example of torchvision image augmentation on LeRobotDataset

a544949

Merge branch 'huggingface:main' into 2024_05_30_add_data_augmentation

8b13472

Implement RandomSubsetApply features

6509c3f

Remove prints

ceb9559

Cadene reviewed Jun 5, 2024

lerobot/configs/default.yaml (outdated, resolved)

examples/6_show_image_transforms.py (outdated, resolved)

lerobot/scripts/show_image_transforms.py (outdated, resolved)

lerobot/common/datasets/lerobot_dataset.py (outdated, resolved)

Updated show_transform to match config

4dbc1ad

aliberts added 8 commits June 10, 2024 10:09

Merge remote-tracking branch 'origin/main' into 2024_05_30_add_data_a…

bb8af34

…ugmentation

save test_artifacts

625ea91

Add more tests

ec2a81f

Merge remote-tracking branch 'origin/main' into 2024_05_30_add_data_a…

9fba523

…ugmentation

Fix CI & update artifacts

a65b67f

Improve docstring

f57090f

Keep consistency with yaml

f0209af

Improve docstring

bde12f6

aliberts marked this pull request as ready for review June 10, 2024 14:24

aliberts requested review from Cadene and alexander-soare June 10, 2024 14:24

Cadene approved these changes Jun 10, 2024

alexander-soare approved these changes Jun 10, 2024

aliberts added 13 commits June 11, 2024 07:34

Nest image_transforms config under training

7a097e9

Explain difference with RandomAdjustSharpness

04adbd7

Rename files

1890637

Rename artifacts directory

5636dab

Parametrize more tests

cf24846

Fix file name & add copyrights

1e3187e

Refactor example

fd7f65b

Refactor visualize & add docstring

1fbb0d9

Remove type hints from docstrings

9ce1c7d

Clarify config doc

b6eb3e5

Print visualize output directories

540fb9c

Store more values in artifacts

9a3739d

Remove DATA_DIR from visualize command examples

85f0bd6

alexander-soare approved these changes Jun 11, 2024

aliberts merged commit ff8f6aa into huggingface:main Jun 11, 2024
5 checks passed

alexander-soare mentioned this pull request Jun 14, 2024

Seeking advice on how to choose between ACT and DP algorithms #263

Closed

marinabar deleted the 2024_05_30_add_data_augmentation branch June 14, 2024 20:54
