
Releases: albumentations-team/albumentations

Albumentations 1.4.23 Release Notes

17 Dec 21:35
  • Support Our Work
  • Core
  • Transforms
  • Bugfixes

Support Our Work

  1. Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
  2. Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
  3. Join Our Community - Have suggestions or run into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.


Target images as numpy array

Compose now accepts numpy arrays with shape (num_images, height, width, num_channels) or (num_images, height, width) as the images target.

  • Ideal for video processing applications
  • Same transform applies to all images in the array
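
A minimal NumPy sketch of what "same transform applies to all images" means, assuming a random crop as the transform: the crop offset is sampled once and reused for every frame. The helper name crop_batch is hypothetical, and this is an illustration rather than the library's implementation.

```python
import numpy as np

def crop_batch(images, crop_height, crop_width, rng):
    """Crop every image in a (num_images, H, W[, C]) array at one shared offset."""
    _, height, width = images.shape[:3]
    # Sample the random parameters once for the whole batch
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    return images[:, top:top + crop_height, left:left + crop_width]

rng = np.random.default_rng(0)
# Hypothetical video clip: 8 RGB frames of 64x64 pixels
video = np.random.default_rng(1).integers(0, 256, size=(8, 64, 64, 3), dtype=np.uint8)
cropped = crop_batch(video, 32, 48, rng)
print(cropped.shape)  # (8, 32, 48, 3)
```

Because the offset is shared, objects stay aligned from frame to frame, which is what makes this useful for video.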

New 3D Data Support

  • volume: (depth, height, width) or (depth, height, width, num_channels)
  • mask3d: (depth, height, width) or (depth, height, width, num_channels)
  • volumes: (num_volumes, depth, height, width) for batch processing
  • masks3d: (num_volumes, depth, height, width) for batch processing
# transform: an A.Compose pipeline of 3D transforms (one is defined under Padding & Cropping)
volume = np.random.rand(96, 256, 256) # Your 3D medical volume
mask = np.zeros((96, 256, 256)) # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']


Added 3D transforms by @ternaus

Padding & Cropping

  • Pad3D: Pad 3D volumes with flexible padding options
  • PadIfNeeded3D: Conditional padding to meet minimum dimensions or divisibility requirements
  • CenterCrop3D: Center cropping for 3D volumes
  • RandomCrop3D: Random cropping of 3D volumes
import albumentations as A
import numpy as np

transform = A.Compose([
    # Crop volume to a fixed size for memory efficiency
    A.RandomCrop3D(size=(64, 128, 128), p=1.0),
    # Randomly remove cubic regions to simulate occlusions
    A.CoarseDropout3D(
        num_holes_range=(2, 6),
        hole_depth_range=(0.1, 0.3),
        hole_height_range=(0.1, 0.3),
        hole_width_range=(0.1, 0.3),
    ),
])

volume = np.random.rand(96, 256, 256) # Your 3D medical volume
mask = np.zeros((96, 256, 256)) # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']


  • CoarseDropout3D: Random cuboid dropout regions for occlusion simulation
  • CubicSymmetry: 48 possible cube symmetry transformations (24 rotations + 24 rotoreflections)
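
The count of 48 can be checked with a short NumPy sketch. It assumes a cubic volume and builds each symmetry as an axis permutation followed by axis flips (6 permutations x 8 flip combinations). The function name cube_symmetries is hypothetical and this is not how the library implements CubicSymmetry.

```python
import itertools
import numpy as np

def cube_symmetries(volume):
    """Yield (is_rotation, transformed) for all 48 symmetries of a cubic volume."""
    for perm in itertools.permutations(range(3)):
        # Parity of the axis permutation
        inversions = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        for flips in itertools.product([False, True], repeat=3):
            out = np.transpose(volume, perm)
            flip_axes = tuple(ax for ax, flipped in enumerate(flips) if flipped)
            if flip_axes:
                out = np.flip(out, axis=flip_axes)
            # Determinant +1 (even total parity) means a proper rotation
            is_rotation = (inversions + len(flip_axes)) % 2 == 0
            yield is_rotation, out

volume = np.arange(27).reshape(3, 3, 3)  # every voxel has a unique value
symmetries = list(cube_symmetries(volume))
num_rotations = sum(is_rotation for is_rotation, _ in symmetries)
num_distinct = len({out.tobytes() for _, out in symmetries})
print(len(symmetries), num_rotations, num_distinct)  # 48 24 48
```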


Albumentations 1.4.22 Release Notes

06 Dec 21:42
  • Support Our Work
  • Transforms
  • Core
  • Bugfixes

Support Our Work

  1. Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
  2. Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
  3. Join Our Community - Have suggestions or run into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.


Elastic Transform

  1. Added argument noise_distribution that allows sampling displacement fields from gaussian and from uniform distributions.
  2. Deprecated parameters border_mode, value, mask_value - you can still pass them, but they have no effect.

New transform ShotNoise

Apply shot noise to the image by modeling photon counting as a Poisson process.

    Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light.
    When photons hit an imaging sensor, they arrive at random times following Poisson statistics.
    This transform simulates this physical process in linear light space by:
    1. Converting to linear space (removing gamma)
    2. Treating each pixel value as an expected photon count
    3. Sampling actual photon counts from a Poisson distribution
    4. Converting back to display space (reapplying gamma)

    The noise characteristics follow real camera behavior:
    - Noise variance equals signal mean in linear space (Poisson statistics)
    - Brighter regions have more absolute noise but less relative noise
    - Darker regions have less absolute noise but more relative noise
    - Noise is generated independently for each pixel and color channel
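
The four steps above can be sketched in NumPy. This is an illustration under stated assumptions, not the ShotNoise implementation: the gamma value of 2.2, the scale parameter (expected photons for a fully bright pixel), and the function name shot_noise are all hypothetical choices.

```python
import numpy as np

def shot_noise(image, scale, rng, gamma=2.2):
    """Apply Poisson (shot) noise to a float image with values in [0, 1]."""
    linear = np.power(image, gamma)             # 1. remove gamma
    expected_photons = linear * scale           # 2. pixel value -> expected photon count
    photons = rng.poisson(expected_photons)     # 3. sample actual counts
    noisy_linear = np.clip(photons / scale, 0.0, 1.0)
    return np.power(noisy_linear, 1.0 / gamma)  # 4. reapply gamma

rng = np.random.default_rng(0)
bright = shot_noise(np.full((200, 200), 0.8), scale=1000, rng=rng)
dark = shot_noise(np.full((200, 200), 0.2), scale=1000, rng=rng)

# Compare the noise in linear space, where Poisson statistics apply
bright_lin, dark_lin = bright ** 2.2, dark ** 2.2
more_absolute_noise_when_bright = bright_lin.std() > dark_lin.std()
more_relative_noise_when_dark = (
    dark_lin.std() / dark_lin.mean() > bright_lin.std() / bright_lin.mean()
)
print(more_absolute_noise_when_bright, more_relative_noise_when_dark)  # True True
```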


Added support for bounding boxes



Added an option to inpaint holes using inpaint_ns and inpaint_telea from OpenCV



New transform TimeReverse


Reverse the time axis of a spectrogram image, also known as time inversion.

    Time inversion of a spectrogram is analogous to the random flip of an image,
    an augmentation technique widely used in the visual domain. This can be relevant
    in the context of audio classification tasks when working with spectrograms.
    The technique was successfully applied in the AudioCLIP paper, which extended
    CLIP to handle image, text, and audio inputs.

    This transform is implemented as a subclass of HorizontalFlip since reversing
    time in a spectrogram is equivalent to flipping the image horizontally.
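
A tiny NumPy illustration of that equivalence, assuming a (frequency, time) array layout; the array here is a toy stand-in for a real spectrogram.

```python
import numpy as np

# Hypothetical spectrogram: rows are frequency bins, columns are time frames
spectrogram = np.arange(12).reshape(3, 4)
reversed_in_time = spectrogram[:, ::-1]  # reversing the time axis
print(reversed_in_time[0])  # [3 2 1 0]
print(np.array_equal(reversed_in_time, np.fliplr(spectrogram)))  # True
```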

New transform TimeMasking


Apply masking to a spectrogram in the time domain.

    This transform masks random segments along the time axis of a spectrogram,
    implementing the time masking technique proposed in the SpecAugment paper.
    Time masking helps in training models to be robust against temporal variations
    and missing information in audio signals.

    This is a specialized version of XYMasking configured for time masking only.
    For more advanced use cases (e.g., multiple masks, frequency masking, or custom
    fill values), consider using XYMasking directly.

New transform FrequencyMasking

Apply masking to a spectrogram in the frequency domain.

    This transform masks random segments along the frequency axis of a spectrogram,
    implementing the frequency masking technique proposed in the SpecAugment paper.
    Frequency masking helps in training models to be robust against frequency variations
    and missing spectral information in audio signals.

    This is a specialized version of XYMasking configured for frequency masking only.
    For more advanced use cases (e.g., multiple masks, time masking, or custom
    fill values), consider using XYMasking directly.

FrequencyMasking is a specialized version of XYMasking with an API similar to FrequencyMasking from torchaudio.
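
Time masking and frequency masking differ only in the axis being masked. The sketch below shows that idea in plain NumPy, assuming a (frequency, time) layout and a single mask per call. The function mask_axis and its parameters are hypothetical and do not mirror the XYMasking API.

```python
import numpy as np

def mask_axis(spectrogram, max_size, axis, rng, fill=0):
    """Mask one random band of up to `max_size` bins along `axis`.

    axis=1 masks time frames (columns), axis=0 masks frequency bins (rows).
    """
    out = spectrogram.copy()
    length = out.shape[axis]
    size = int(rng.integers(0, max_size + 1))        # band width, may be 0
    start = int(rng.integers(0, length - size + 1))  # band position
    index = [slice(None)] * out.ndim
    index[axis] = slice(start, start + size)
    out[tuple(index)] = fill
    return out

rng = np.random.default_rng(0)
spec = np.ones((80, 200), dtype=np.float32)  # 80 frequency bins, 200 time frames
time_masked = mask_axis(spec, max_size=30, axis=1, rng=rng)
freq_masked = mask_axis(spec, max_size=10, axis=0, rng=rng)
print((time_masked.sum(axis=0) == 0).sum(), "time frames masked")
print((freq_masked.sum(axis=1) == 0).sum(), "frequency bins masked")
```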

New Transform Pad

Pad the sides of an image by a specified number of pixels.

        padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:
            * int - pad all sides by this value
            * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y
            * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side

This generalizes the torchvision transform of the same name.
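
The three accepted forms of padding can be illustrated with np.pad. This is a sketch of the semantics described above with constant fill only; pad_image is a hypothetical helper, not the transform itself.

```python
import numpy as np

def pad_image(image, padding, fill=0):
    """Pad an (H, W[, C]) image with a constant value."""
    if isinstance(padding, int):
        left = top = right = bottom = padding
    elif len(padding) == 2:
        pad_x, pad_y = padding
        left = right = pad_x
        top = bottom = pad_y
    else:
        left, top, right, bottom = padding
    # np.pad expects (before, after) per axis: rows first, then columns
    pad_width = [(top, bottom), (left, right)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad_width, mode="constant", constant_values=fill)

image = np.ones((10, 20, 3), dtype=np.uint8)  # height 10, width 20
print(pad_image(image, 2).shape)             # (14, 24, 3)
print(pad_image(image, (1, 3)).shape)        # (16, 22, 3)
print(pad_image(image, (1, 2, 3, 4)).shape)  # (16, 24, 3)
```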

New Transform Erasing

This generalizes the similar torchvision transform.

Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.

    This augmentation helps improve model robustness by randomly masking out rectangular regions in the image,
    simulating occlusions and encouraging the model to learn from partial information. It's particularly
    effective for image classification and person re-identification tasks.

New Transform AdditiveNoise

Apply random noise to image channels using various noise distributions.

    This transform generates noise using different probability distributions and applies it
    to image channels. The noise can be generated in three spatial modes and supports
    multiple noise distributions, each with configurable parameters.

        noise_type: Type of noise distribution to use. Options:
            - "uniform": Uniform distribution, good for simple random perturbations
            - "gaussian": Normal distribution, models natural random processes
            - "laplace": Similar to Gaussian but with heavier tails, good for outliers
            - "beta": Flexible bounded distribution, can be symmetric or skewed

        spatial_mode: How to generate and apply the noise. Options:
            - "constant": One noise value per channel, fastest
            - "per_pixel": Independent noise value for each pixel and channel, slowest
            - "shared": One noise map shared across all channels, medium speed
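
The three spatial modes differ only in the shape of the sampled noise before it is broadcast over the image. A NumPy sketch, assuming uniform noise and an (H, W, C) float image; make_noise and its low/high parameters are hypothetical names, not the AdditiveNoise API.

```python
import numpy as np

def make_noise(shape, spatial_mode, rng, low=-0.1, high=0.1):
    """Sample uniform noise for an (H, W, C) image in one of three spatial modes."""
    height, width, channels = shape
    if spatial_mode == "constant":
        sample_shape = (1, 1, channels)           # one value per channel
    elif spatial_mode == "per_pixel":
        sample_shape = (height, width, channels)  # independent everywhere
    elif spatial_mode == "shared":
        sample_shape = (height, width, 1)         # one map reused by all channels
    else:
        raise ValueError(f"Unknown spatial_mode: {spatial_mode}")
    noise = rng.uniform(low, high, size=sample_shape)
    return np.broadcast_to(noise, shape)

rng = np.random.default_rng(0)
image = np.full((4, 5, 3), 0.5)
noisy = {
    mode: np.clip(image + make_noise(image.shape, mode, rng), 0.0, 1.0)
    for mode in ("constant", "per_pixel", "shared")
}
```

The speed ordering in the list follows from the number of random values drawn: C for "constant", H x W for "shared", and H x W x C for "per_pixel".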


Added 'gaussian' method for image sharpening.

New transform SaltAndPepper

Apply salt and pepper noise to the input image.

    Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt)
    or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.
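
A minimal NumPy sketch of salt and pepper noise on a uint8 image. The parameter names amount and salt_vs_pepper are assumptions chosen for illustration, and the exact-count sampling used here is one possible design rather than the library's.

```python
import numpy as np

def salt_and_pepper(image, amount, salt_vs_pepper, rng):
    """Set a fraction `amount` of pixels of a uint8 image to 255 (salt) or 0 (pepper)."""
    out = image.copy()
    height, width = image.shape[:2]
    num_noisy = int(round(amount * height * width))
    num_salt = int(round(num_noisy * salt_vs_pepper))
    # Pick distinct pixel positions so the counts are exact
    flat_idx = rng.choice(height * width, size=num_noisy, replace=False)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    out[rows[:num_salt], cols[:num_salt]] = 255  # salt
    out[rows[num_salt:], cols[num_salt:]] = 0    # pepper
    return out

rng = np.random.default_rng(0)
gray = np.full((100, 100), 128, dtype=np.uint8)
noisy = salt_and_pepper(gray, amount=0.1, salt_vs_pepper=0.7, rng=rng)
print((noisy == 255).sum(), (noisy == 0).sum())  # 700 300
```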

New transform PlasmaBrightnessContrast

Apply plasma fractal pattern to modify image brightness and contrast.

    This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns
    that are then used to create spatially-varying brightness and contrast adjustments.
    The result is a natural-looking, non-uniform modification of the image.

New Transform PlasmaShadow

Albumentations 1.4.21 Release Notes

01 Nov 00:15
  • Support Our Work
  • Transforms
  • Core
  • Benchmark
  • Speedups

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server


Auto padding in crops

Added an option to pad the image if the crop size is larger than the image size.

Old way:

A.PadIfNeeded(min_height=1024, min_width=1024, p=1),
A.RandomCrop(height=1024, width=1024, p=1)

New way:

A.RandomCrop(height=1024, width=1024, p=1, pad_if_needed=True)

Works for:

You may also use it to pad image to a desired size.
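
The behaviour can be sketched in NumPy as "pad only when the image is smaller than the crop, then crop as usual". This assumes constant fill and centered padding; random_crop_with_padding is a hypothetical helper and not the code behind pad_if_needed.

```python
import numpy as np

def random_crop_with_padding(image, crop_height, crop_width, rng, fill=0):
    """Randomly crop an (H, W[, C]) image, padding first if it is too small."""
    height, width = image.shape[:2]
    pad_h = max(crop_height - height, 0)
    pad_w = max(crop_width - width, 0)
    if pad_h or pad_w:
        # Split the padding between both sides (centered placement is an assumption)
        pad_width = [(pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)]
        pad_width += [(0, 0)] * (image.ndim - 2)
        image = np.pad(image, pad_width, mode="constant", constant_values=fill)
        height, width = image.shape[:2]
    top = int(rng.integers(0, height - crop_height + 1))
    left = int(rng.integers(0, width - crop_width + 1))
    return image[top:top + crop_height, left:left + crop_width]

rng = np.random.default_rng(0)
small = np.ones((40, 50, 3), dtype=np.uint8)    # smaller than the crop
large = np.ones((200, 300, 3), dtype=np.uint8)  # larger than the crop
print(random_crop_with_padding(small, 64, 64, rng).shape)  # (64, 64, 3)
print(random_crop_with_padding(large, 64, 64, rng).shape)  # (64, 64, 3)
```

When the image is smaller than the crop in both dimensions, the crop offset is always zero, so the call reduces to padding the image to the requested size.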


Random state

Now random state for the pipeline does not depend on the global random state



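# No seed: the pipeline's results differ from run to run
transform = A.Compose(...)

# Fixed seed: the pipeline is reproducible
transform = A.Compose(seed=seed, ...)

# Seeding the global random state does not affect this pipeline
transform = A.Compose(...)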
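
The idea of a pipeline-owned random state can be sketched with a NumPy Generator. TinyPipeline below is a hypothetical stand-in used only to show the principle; it says nothing about how Compose stores its generator internally.

```python
import numpy as np

class TinyPipeline:
    """Hypothetical stand-in for a pipeline that owns its random generator."""

    def __init__(self, seed=None):
        # A private generator: unaffected by np.random.seed(...)
        self.rng = np.random.default_rng(seed)

    def sample_angle(self):
        return float(self.rng.uniform(-30, 30))

np.random.seed(0)
a = TinyPipeline(seed=137).sample_angle()
np.random.seed(12345)  # changing the global seed...
b = TinyPipeline(seed=137).sample_angle()
print(a == b)  # True: only the pipeline's own seed matters
```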

Saving used parameters

Now you can get exact parameters that were used in the pipeline on a given sample with

transform = A.Compose(save_applied_params=True, ...)

result = transform(image=image, bboxes=bboxes, mask=mask, keypoints=keypoints)



Moved benchmark to a separate repo

Current result for uint8 images:

Transform albumentations
HorizontalFlip 8325 ± 955 4807 ± 818 6042 ± 788 390 ± 106 914 ± 67
VerticalFlip 20493 ± 1134 9153 ± 1291 10931 ± 1844 1212 ± 402 3198 ± 200
Rotate 1272 ± 12 1119 ± 41 1136 ± 218 143 ± 11 181 ± 11
Affine 967 ± 3 - 774 ± 97 147 ± 9 130 ± 12
Equalize 961 ± 4 - 581 ± 54 152 ± 19 479 ± 12
RandomCrop80 118946 ± 741 25272 ± 1822 11503 ± 441 1510 ± 230 32109 ± 1241
ShiftRGB 1873 ± 252 - 1582 ± 65 - -
Resize 2365 ± 153 611 ± 78 1806 ± 63 232 ± 24 195 ± 4
RandomGamma 8608 ± 220 - 2318 ± 269 108 ± 13 -
Grayscale 3050 ± 597 2720 ± 932 1681 ± 156 289 ± 75 1838 ± 130
RandomPerspective 410 ± 20 - 554 ± 22 86 ± 11 96 ± 5
GaussianBlur 1734 ± 204 242 ± 4 1090 ± 65 176 ± 18 79 ± 3
MedianBlur 862 ± 30 - 813 ± 30 5 ± 0 -
MotionBlur 2975 ± 52 - 612 ± 18 73 ± 2 -
Posterize 5214 ± 101 - 2097 ± 68 430 ± 49 3196 ± 185
JpegCompression 845 ± 61 778 ± 5 459 ± 35 71 ± 3 625 ± 17
GaussianNoise 147 ± 10 67 ± 2 206 ± 11 75 ± 1 -
Elastic 171 ± 15 - 235 ± 20 1 ± 0 2 ± 0
Clahe 423 ± 10 - 335 ± 43 94 ± 9 -
CoarseDropout 11288 ± 609 - 671 ± 38 536 ± 87 -
Blur 4816 ± 59 246 ± 3 3807 ± 325 - -
ColorJitter 536 ± 41 255 ± 13 - 55 ± 18 46 ± 2
Brightness 4443 ± 84 1163 ± 86 - 472 ± 101 429 ± 20
Contrast 4398 ± 143 736 ± 79 - 425 ± 52 335 ± 35
RandomResizedCrop 2952 ± 24 - - 287 ± 58 511 ± 10
Normalize 1016 ± 84 - - 626 ± 40 519 ± 12
PlanckianJitter 1844 ± 208 - - 813 ± 211 -


Albumentations 1.4.20 Release Notes

24 Oct 23:45

Hotfix version.

Albumentations 1.4.19 Release Notes

23 Oct 20:36
  • Support Our Work
  • Transforms
  • Core
  • Bug Fixes

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server


Added mask_interpolation to all transforms that use mask interpolation, including:

by @ternaus


  • Minimal supported python version is 3.9
  • Removed dependency on scikit-image
  • Updated the random number generator from np.random.RandomState to np.random.Generator. The latter is 50% faster, which speeds up all transforms that rely heavily on random number generation
  • Where possible moved from cv2.LUT to stringzilla lut
  • Added parameter mask_interpolation to Compose that overrides the mask interpolation value in all transforms in that Compose. You can now use the more accurate cv2.INTER_NEAREST_EXACT for semantic segmentation, and work with depth and heatmap estimation using cubic, area, linear, etc.
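
Why the interpolation choice matters for masks can be seen on a one-dimensional toy example. The sketch uses np.interp and index rounding as stand-ins for linear and nearest-neighbour resizing; it does not call OpenCV, and the two-pixel mask row is a made-up input.

```python
import numpy as np

labels = np.array([0, 2])                 # two class ids along one mask row
src = np.arange(len(labels))              # source pixel positions
dst = np.linspace(0, len(labels) - 1, 5)  # upsample to 5 pixels

linear = np.interp(dst, src, labels)        # blends neighbouring class ids
nearest = labels[np.rint(dst).astype(int)]  # always returns an existing class id

print(linear)   # [0.  0.5 1.  1.5 2. ]
print(nearest)  # [0 0 0 2 2]
```

Linear interpolation invents the value 1, which is not a class in the input, so nearest-neighbour variants suit segmentation masks. For continuous targets such as depth maps or heatmaps, the blended values are meaningful and smoother interpolation is appropriate.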


Albumentations 1.4.18 Release Notes

08 Oct 22:32
  • Support Our Work
  • Transforms
  • Core
  • Deprecations
  • Bugfixes

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server




Added support for keypoints



Added support for keypoints and bounding boxes



Added support for keypoints and bounding boxes



Added support for keypoints and bounding boxes



Added support for bounding boxes and keypoints



Added support for keypoints


Added support for keypoints and bounding boxes



Added support for bounding boxes and keypoints


Added support for masks as numpy arrays of the shape (num_masks, height, width)

Now you can apply transforms to masks as:

masks = <numpy array with shape (num_masks, height, width)>

transform(image=image, masks=masks)


Removed MixUp as it was doing almost exactly the same as TemplateTransform


  • Bugfix in RandomFog
  • Bugfix in PlanckianJitter
  • Several people reported an issue with masks passed as a list of numpy arrays. It was probably fixed as part of some other work, since I cannot reproduce it. Added tests for that case, just in case.

Albumentations 1.4.17 Release Notes

30 Sep 22:33
  • Support Our Work
  • Transforms
  • Core

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server



  1. Added Bounding Box support
  2. remove_invisible=False keeps keypoints

by @ternaus


Added support for keypoints


by @ternaus


Added RandomOrder Compose

Select N transforms to apply. The selected transforms are called in random order with force_apply=True.
Transform probabilities are normalized to sum to 1, so in this case they act as weights.
This transform is like SomeOf, but the transforms are called in random order.
It will not replay the random order in ReplayCompose.
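
The selection logic described above can be sketched with a NumPy Generator. This is an illustration only: random_order, its arguments, and the toy "transforms" are hypothetical and do not reproduce the RandomOrder source.

```python
import numpy as np

def random_order(transforms, probabilities, n, data, rng):
    """Pick `n` distinct transforms, weighted by probability, and apply them in random order."""
    weights = np.asarray(probabilities, dtype=float)
    weights = weights / weights.sum()  # normalize so the probabilities act as weights
    chosen = rng.choice(len(transforms), size=n, replace=False, p=weights)
    rng.shuffle(chosen)                # explicit random application order
    for idx in chosen:
        data = transforms[idx](data)   # always applied, like force_apply=True
    return data

# Toy "transforms" that record their own name in a log
names = ("blur", "flip", "noise")
steps = [lambda log, name=name: log + [name] for name in names]

rng = np.random.default_rng(0)
applied = random_order(steps, probabilities=[0.5, 0.3, 0.9], n=2, data=[], rng=rng)
print(applied)  # two distinct names, in random order
```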

Albumentations 1.4.16 Release Notes

22 Sep 20:42
  • Support Our Work
  • UI Tool
  • Transforms
  • Improvements and Bug Fixes

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server

UI Tool

For visual debugging, I wrote a tool that lets you visually inspect the effects of augmentations on an image.

You can find it at

  • Works for all ImageOnly transforms
  • Authorized users can upload their own images

It is a work in progress and not yet stable or polished, but if you have feedback or proposals, just write in the Discord server mentioned above.


  • Updated and extended docstrings in all ImageOnly transforms.
  • All ImageOnly transforms support both uint8 and float32 inputs


Added texture method to RandomSnow



Added physics_based method to RandomSunFlare

Bugfixes and improvements

  • Bugfix in the albucore dependency. Now every Albumentations version is tailored to a specific albucore version. Added a pre-commit hook to check it automatically on every commit.
  • Bugfix in the TextImage transform, which was failing after bbox processing was rewritten in vectorized form.
  • As part of the work to remove the scikit-image dependency, @momincks rewrote bbox_affine in plain numpy
  • Bugfix. It was unexpected, but people use bounding boxes that are smaller than 1 pixel. Removed the constraint that a bounding box must be at least 1x1
  • Bugfix in bounding box filtering. Now, if all bounding boxes are filtered out, the result is an empty array of shape (0, 4) instead of a plain empty array
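
The difference between the two kinds of empty array shows up as soon as downstream code indexes columns. A small NumPy illustration (the variable names are made up for this example):

```python
import numpy as np

no_boxes = np.empty((0, 4), dtype=np.float32)  # zero boxes, four coordinates each
widths = no_boxes[:, 2] - no_boxes[:, 0]       # column access still works
print(no_boxes.shape, widths.shape)            # (0, 4) (0,)

shapeless = np.array([], dtype=np.float32)     # shape (0,): column access fails
try:
    shapeless[:, 0]
    column_access_failed = False
except IndexError as error:
    column_access_failed = True
    print("IndexError:", error)
```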

Albumentations 1.4.15 Release Notes

13 Sep 01:54
  • Support Our Work
  • UI Tool
  • Core
  • Transforms
  • Improvements and Bug Fixes

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server

UI Tool

For visual debugging, I wrote a tool that lets you visually inspect the effects of augmentations on an image.

You can find it at

Right now it supports only a subset of the ImageOnly transforms.

It is a work in progress and not yet stable or polished, but if you have feedback or proposals, just write in the Discord server mentioned above.



Bounding box and keypoint processing was vectorized

  • You can pass a numpy array to Compose, not only a list of lists.
  • Transforms presumably run faster now, but this was not benchmarked.



  • Reflection padding now works correctly in Affine and ShiftScaleRotate


  • Added support for float32 images


  • Added support for float32 images


  • Added support for float32 images
  • Added support for any number of channels


  • Added support for float32
  • Added support for any number of channels


Still works, but deprecated. It was a very strange transform; I cannot find a use case where you would need it.

It was equivalent to:

OneOf([Transpose, VerticalFlip, HorizontalFlip]) 

Most likely, if you need a transform that does not create artifacts, you should look at:

  • Natural images => HorizontalFlip (symmetry group has 2 elements, which effectively increases your dataset 2x)
  • Images that look natural when you vertically flip them => VerticalFlip (symmetry group has 2 elements, which effectively increases your dataset 2x)
  • Images that need to preserve parity, for example texts, but where we may expect rotated documents => RandomRotate90 (symmetry group has 4 elements, which effectively increases your dataset 4x)
  • Images that you can flip and rotate as you wish => D4 (symmetry group has 8 elements, which effectively increases your dataset 8x)
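
The group sizes quoted above can be verified by counting the distinct images each family of operations produces. The sketch uses an asymmetric 3x3 array as a stand-in image; count_distinct is a hypothetical helper.

```python
import numpy as np

def count_distinct(variants):
    """Count distinct arrays among the given variants."""
    return len({(v.shape, v.tobytes()) for v in variants})

image = np.arange(9).reshape(3, 3)  # asymmetric square test image

h_flip = [image, np.fliplr(image)]              # identity + horizontal flip
rot90 = [np.rot90(image, k) for k in range(4)]  # rotations by 0, 90, 180, 270 degrees
d4 = rot90 + [np.fliplr(r) for r in rot90]      # rotations, each with an optional flip

print(count_distinct(h_flip), count_distinct(rot90), count_distinct(d4))  # 2 4 8
```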


Now you can define the number of output channels in the resulting gray image. All channels will be the same.

Extended the ways to obtain a grayscale image. Most of them work with any number of input channels.

  • weighted_average: Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B)
    Works only with 3-channel images. Provides realistic results based on human perception.
  • from_lab: Extracts the L channel from the LAB color space.
    Works only with 3-channel images. Gives perceptually uniform results.
  • desaturation: Averages the maximum and minimum values across channels.
    Works with any number of channels. Fast but may not preserve perceived brightness well.
  • average: Simple average of all channels.
    Works with any number of channels. Fast but may not give realistic results.
  • max: Takes the maximum value across all channels.
    Works with any number of channels. Tends to produce brighter results.
  • pca: Applies Principal Component Analysis to reduce channels.
    Works with any number of channels. Can preserve more information but is computationally intensive.
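
The four simpler methods can be written in a few lines of NumPy, following the formulas in the list above. The function to_gray is a hypothetical sketch that returns a single channel; from_lab and pca are left out because they need a color-space conversion and a decomposition step.

```python
import numpy as np

def to_gray(image, method):
    """Reduce an (H, W, C) float image to (H, W) with one of the simpler methods."""
    if method == "weighted_average":
        if image.shape[-1] != 3:
            raise ValueError("weighted_average needs a 3-channel RGB image")
        return image @ np.array([0.299, 0.587, 0.114])
    if method == "desaturation":
        return (image.max(axis=-1) + image.min(axis=-1)) / 2
    if method == "average":
        return image.mean(axis=-1)
    if method == "max":
        return image.max(axis=-1)
    raise ValueError(f"Unsupported method: {method}")

pixel = np.array([[[100.0, 200.0, 50.0]]])  # one RGB pixel
for method in ("weighted_average", "desaturation", "average", "max"):
    print(method, float(to_gray(pixel, method)[0, 0]))
```

For this pixel the results are roughly 153 (weighted_average), 125 (desaturation), 116.7 (average) and 200 (max), which shows how much the perceived brightness depends on the method.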


Now uses Affine under the hood.

Improvements and Bug Fixes

  • Bugfix in GridElasticDeform by @4pygmalion
  • Speedups in to_float and from_float
  • Bugfix in PadIfNeeded. Did not work when empty bounding boxes were passed.

Albumentations 1.4.14 Release Notes

16 Aug 00:22
  • Support Our Work
  • Transforms
  • Improvements and Bug Fixes

Support Our Work

  1. Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
  2. Haven't starred our repo yet? Show your support with a ⭐! It's only one mouse click away.
  3. Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server


Added GridElasticDeform transform


Grid-based Elastic deformation Albumentation implementation

This class applies elastic transformations using a grid-based approach.
The granularity and intensity of the distortions can be controlled using
the dimensions of the overlaying distortion grid and the magnitude parameter.
Larger grid sizes result in finer, less severe distortions.

Args:
    num_grid_xy (tuple[int, int]): Number of grid cells along the width and height.
        Specified as (grid_width, grid_height). Each value must be greater than 1.
    magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.
    interpolation (int): Interpolation method to be used for the image transformation.
        Default: cv2.INTER_LINEAR
    mask_interpolation (int): Interpolation method to be used for mask transformation.
        Default: cv2.INTER_NEAREST
    p (float): Probability of applying the transform. Default: 1.0.

Targets:
    image, mask

Image types:
    uint8, float32

    >>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)
    >>> result = transform(image=image, mask=mask)
    >>> transformed_image, transformed_mask = result['image'], result['mask']

    This transformation is particularly useful for data augmentation in medical imaging
    and other domains where elastic deformations can simulate realistic variations.

by @4pygmalion


Reflection padding now works correctly with bounding boxes and keypoints

by @ternaus


  • Works with any number of channels
  • Shadow intensity is no longer a hardcoded constant; it can now be sampled from a range
Simulates shadows for the image by reducing the brightness of the image in shadow regions.

Args:
    shadow_roi (tuple): region of the image where shadows
        will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1].
    num_shadows_limit (tuple): Lower and upper limits for the possible number of shadows.
        Default: (1, 2).
    shadow_dimension (int): number of edges in the shadow polygons. Default: 5.
    shadow_intensity_range (tuple): Range for the shadow intensity.
        Should be two float values between 0 and 1. Default: (0.5, 0.5).
    p (float): probability of applying the transform. Default: 0.5.


Image types:
    uint8, float32


by @JonasKlotz

Improvements and Bug Fixes

  • Bugfix in Affine. Now fit_output=True works correctly with bounding boxes. By @ternaus
  • Bugfix in ColorJitter. By @maremun
  • Speedup in CoarseDropout. By @thomaoc1
  • Check for updates does not use logger anymore. By @ternaus
  • Bugfix in HistogramMatching. Previously it output an array of ones. Now it works as expected. By @ternaus