Callback function to log Masked Autoencoder reconstructions to WandB #88

weiji14 · 2023-12-13T05:18:13Z

Adds a way to visually inspect how well the Masked Autoencoder reconstructs the original image over the course of a training run.

Implemented as a Lightning Callback whose hook runs at the end of each validation mini-batch (on_validation_batch_end). A sample of 6 image pairs (original + reconstructed, so 12 images in total) is uploaded to Weights and Biases.
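A minimal sketch of the hook described above, not the code in this PR. The class name `ReconstructionLoggerSketch`, the `logged` list, and the choice to act only on the first validation batch are assumptions for illustration; `batch` and `outputs` are treated as plain sequences of originals and reconstructions, and the pairs are collected in memory rather than uploaded. The import falls back to `object` so the sketch runs without Lightning installed.

```python
try:
    import lightning as L

    BaseCallback = L.Callback
except ImportError:  # fall back so the sketch still runs without Lightning
    BaseCallback = object


class ReconstructionLoggerSketch(BaseCallback):
    """Hypothetical stand-in: collects image pairs instead of uploading them."""

    def __init__(self, num_samples=6):
        super().__init__()
        self.num_samples = num_samples
        self.logged = []  # (caption, image) tuples that would go to the logger

    def on_validation_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        # Assumption: only the first validation batch is logged.
        if batch_idx != 0:
            return
        originals = batch[: self.num_samples]
        reconstructions = outputs[: self.num_samples]
        for i, (orig, recon) in enumerate(zip(originals, reconstructions)):
            self.logged.append((f"original-{i}", orig))
            self.logged.append((f"reconstructed-{i}", recon))
```

With `num_samples=6` and a batch of 8 or more items, one call at `batch_idx=0` yields 12 entries: 6 originals interleaved with their 6 reconstructions.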

Example usage with LightningCLI:

python trainer.py fit --trainer.max_epochs=2 \
                      --trainer.precision=bf16-mixed \
                      --data.data_path=data/32VLM \
                      --data.num_workers=4 \
                      --trainer.logger=WandbLogger \
                      --trainer.logger.project=clay \
                      --trainer.logger.save_dir=checkpoints \
                      --trainer.callbacks+=LogMAEReconstruction

Samples of histogram-equalized RGB images and the reconstructed outputs (only random noise, as this is early in training).

TODO:

Add wandb dependency
Initial implementation to upload RGB images
Do proper histogram equalization of the images
Add a unit test

TODO in the future:

Upload SAR and DEM images


weiji14

Sunk wayyy too much time trying to debug why the unit test was failing on GitHub Actions but not locally (see below), so didn't get to do the SAR and DEM plots. Will implement those in a follow-up PR instead.

weiji14 · 2023-12-14T05:02:10Z

src/tests/test_callbacks.py

+        # Check that wandb saved some log files to the temporary directory
+        # assert os.path.exists(path := f"{tmpdirname}/wandb/latest-run/")
+        # assert set(os.listdir(path=path)) == set(
+        #     [
+        #         f"run-{trainer.logger.version}.wandb",
+        #         "tmp",
+        #         "files",
+        #         "logs",
+        #     ]
+        # )


Setting WANDB_MODE="disabled" in this unit test and commenting out the lines that check for log files being saved, because GitHub Actions keeps failing with `Error: Process completed with exit code 255` even though all the unit tests pass under pytest 😕 The unit test does work locally when I uncomment this block and use WANDB_MODE="offline", so not sure what's going on.
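For reference, one way to scope an environment variable such as WANDB_MODE to a single test is sketched below. It uses `unittest.mock.patch.dict` from the standard library rather than whatever mechanism the test in this PR uses, and the helper name `run_with_wandb_mode` is hypothetical.

```python
import os
from unittest import mock


def run_with_wandb_mode(mode, fn):
    """Call fn() with WANDB_MODE temporarily set; the old environment is restored after."""
    with mock.patch.dict(os.environ, {"WANDB_MODE": mode}):
        return fn()


# The variable is visible only while fn() runs.
print(run_with_wandb_mode("disabled", lambda: os.environ["WANDB_MODE"]))  # disabled
```

Switching between "disabled" and "offline" then becomes a one-argument change, which could make it easier to compare the two modes locally and on CI.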

weiji14 · 2023-12-15T01:50:02Z

Gonna merge this in first and combine with the other wandb callback being developed at #47.

…88)

* ➕ Add wandb
  A CLI and library for interacting with the Weights and Biases API!
* 🔊 Log Masked Autoencoder reconstructions to WandB
  Created a custom callback function to log visualizations of the input and output images to the Masked Autoencoder. Only showing the RGB bands of Sentinel-2 for now. A sample of 6 image pairs (original + reconstructed, so 12 in total) is uploaded to Weights and Biases. Example LightningCLI command: `python trainer.py fit --trainer.max_epochs=20 --data.data_path=data/32VLM --trainer.logger=WandbLogger --trainer.logger.project=clay --trainer.logger.save_dir=checkpoints --trainer.callbacks+=LogMAEReconstructedImage`.
* ➕ Add scikit-image
  Image processing in Python!
* 📸 Apply histogram equalization to RGB images
  Enhance low contrast images by applying a histogram equalization stretching algorithm on the RGB images, instead of dividing by a magic number like 6000.
* 🔧 Increase default sample size from 6 to 8
  More samples to look at! Also only running einsum conversion on as many samples as needed rather than the whole batch, and handling cases where num_samples may be more than the batch_size.
* 🧑‍💻 Make wandb a somewhat optional dependency
  Allows for `from src.callback_wandb import LogMAEReconstruction` to run, even without wandb being installed. Helpful if someone doesn't want to install wandb for whatever reason.
* ✅ Add unit test for LogMAEReconstruction
  Testing that the LogMAEReconstruction callback works to save a set of images to WandB. Testing this in offline mode only, with checks that artifacts are saved locally, and that the wandb images have the correct caption and format.
* 🐛 Compare expected folders using set instead of list
  Order of the folders could change, so using set instead of list.
* 🧪 Prevent WandB logger from saving logs to local drive for now
  Setting WANDB_MODE="disabled", so no files are logged to disk, though the wandb.Image(s) are still created. See if this helps to resolve the exit code 255 issue on GitHub Actions.
* 📝 Fix a typo and improve docstring
  Minor changes to the docstring of the on_validation_batch_end method, and a typo fix.

weiji14 added 2 commits December 13, 2023 10:14

➕ Add wandb

b25c71a

A CLI and library for interacting with the Weights and Biases API!

weiji14 self-assigned this Dec 13, 2023

weiji14 added 6 commits December 14, 2023 12:39

➕ Add scikit-image

38f8118

Image processing in Python!

📸 Apply histogram equalization to RGB images

6b59672

Enhance low contrast images by applying a histogram equalization stretching algorithm on the RGB images, instead of dividing by a magic number like 6000.
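To illustrate why equalization helps compared with a fixed divisor, here is a dependency-free sketch. It is not the scikit-image routine used in the PR: `equalize` maps each pixel to its empirical CDF value, which is the core idea behind histogram equalization, and the reflectance values in `band` are made up.

```python
from bisect import bisect_right


def scale_by_constant(pixels, divisor=6000):
    """Fixed-divisor scaling: dark scenes stay squeezed near zero."""
    return [min(p / divisor, 1.0) for p in pixels]


def equalize(pixels):
    """Map each pixel to the fraction of pixels less than or equal to it."""
    ordered = sorted(pixels)
    return [bisect_right(ordered, p) / len(ordered) for p in pixels]


band = [100, 200, 300, 6000]  # made-up values for one band of a dark scene
print(scale_by_constant(band))  # first three values all fall below 0.1
print(equalize(band))  # [0.25, 0.5, 0.75, 1.0]
```

With the fixed divisor, one bright pixel leaves the rest of the scene almost black, while the CDF mapping spreads the same values evenly across the output range.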

🔧 Increase default sample size from 6 to 8

7b9f066

More samples to look at! Also only running einsum conversion on as many samples as needed rather than the whole batch, and handling cases where num_samples may be more than the batch_size.
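The two points in this commit message (convert only what is needed, and tolerate `num_samples > batch_size`) can be sketched as follows. The helper `take_samples` and the `convert` argument are hypothetical; `convert` stands in for the einsum conversion mentioned above.

```python
def take_samples(batch, num_samples, convert):
    """Convert at most num_samples items, never more than the batch holds."""
    n = min(num_samples, len(batch))
    return [convert(item) for item in batch[:n]]


calls = []


def expensive_convert(item):
    calls.append(item)  # record how often the conversion actually runs
    return item * 2


take_samples(list(range(32)), 8, expensive_convert)
print(len(calls))  # 8, not 32
print(len(take_samples([1, 2, 3], 8, expensive_convert)))  # 3
```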

🧑‍💻 Make wandb a somewhat optional dependency

c464807

Allows for `from src.callback_wandb import LogMAEReconstruction` to run, even without wandb being installed. Helpful if someone doesn't want to install wandb for whatever reason.
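A common pattern for this kind of optional dependency is sketched below; it is an assumption about the general approach, not a copy of what `src/callback_wandb.py` does. The helpers `optional_import` and `log_images` are hypothetical names.

```python
import importlib


def optional_import(name):
    """Return the module if it is installed, otherwise None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


wandb = optional_import("wandb")  # None when wandb is not installed


def log_images(images):
    """Hypothetical helper: fail only when logging is actually requested."""
    if wandb is None:
        raise ModuleNotFoundError(
            "wandb is required to log images; try `pip install wandb`"
        )
    return [wandb.Image(img) for img in images]
```

Importing the module then succeeds either way, and the missing dependency surfaces as a clear error only at the point where images would be created.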

✅ Add unit test for LogMAEReconstruction

17c2055

Testing that the LogMAEReconstruction callback works to save a set of images to WandB. Testing this in offline mode only, with checks that artifacts are saved locally, and that the wandb images have the correct caption and format.

🐛 Compare expected folders using set instead of list

2e2f895

Order of the folders could change, so using set instead of list.
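A small standard-library illustration of the issue: `os.listdir` returns entries in arbitrary order, so comparing against a list can fail intermittently while comparing sets cannot. The function name `list_entries` is made up for this sketch; the folder names mirror the ones checked in the unit test.

```python
import os
import tempfile


def list_entries():
    """Create three folders in a temporary directory and list them."""
    with tempfile.TemporaryDirectory() as tmpdirname:
        for name in ["logs", "files", "tmp"]:
            os.mkdir(os.path.join(tmpdirname, name))
        return os.listdir(tmpdirname)  # order is not guaranteed


found = list_entries()
expected = ["tmp", "files", "logs"]
assert set(found) == set(expected)  # passes regardless of listing order
```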

weiji14 marked this pull request as ready for review December 14, 2023 02:18

weiji14 added 8 commits December 14, 2023 15:38

🧪 Try smaller batch size to debug flaky test

38d0c70

Unsure why the unit test passes on GitHub Actions, but causes an `Error: Process completed with exit code 255`. Maybe a smaller batch size would help?

🧪 Try disabling console logging to stdout/stderr

33e53fa

Turn off stdout / stderr logging by setting WANDB_CONSOLE=off to see if it helps with the failing GitHub Actions.

🧪 Try putting check for wandb_images outside of with statement

31e825e

Another attempt to see if it helps prevent exit code 255 on GitHub Actions.

🧪 Try deleting trainer and pl_module after running callback

16515a0

Running out of ideas on why pytest has exit code 255 on GitHub Actions...

🥅 Set PYTEST_DEBUG=1 environment variable

2b75a0a

Trying to figure out what's going on.

🧪 Remove monkeypatch and pytest debug

f838965

🧪 Prevent WandB logger from saving logs to local drive for now

ff8b86f

Setting WANDB_MODE="disabled", so no files are logged to disk, though the wandb.Image(s) are still created. See if this helps to resolve the exit code 255 issue on GitHub Actions.

📝 Fix a typo and improve docstring

b5afc1c

Minor changes to the docstring of the on_validation_batch_end method, and a typo fix.

weiji14 mentioned this pull request Dec 14, 2023

Implement MAE with support for position, time, latlon & channel embeddings #47

Merged

3 tasks

weiji14 merged commit e259165 into main Dec 15, 2023
1 check passed

weiji14 deleted the callbacks/wandb branch December 15, 2023 01:51
