v1.0.0

@beniz released this 06 Oct 10:20
· 318 commits to master since this release

joliGEN: Generative AI Toolset (Changelog)

1.0.0 (2023-10-06)

Docker

Features

  • add a server endpoint to delete files (30b2143)
  • add choices for all options (ed43b82)
  • add ddim inference (0196134)
  • add DDPM tutorial on the VITON-HD dataset (c932d73)
  • add FastAPI server to run training (f517462)
  • add lambda for semantic losses (aab53fe)
  • add LPIPS metric (f1e0526)
  • add miou compute to tests (c0033ef)
  • add new metrics (f3c84cd)
  • add palette model (b7db294)
  • add psnr metric (7135458)
  • add sampling options to test (a2958dc)
  • add SRC and hDCE losses (ddfcc97)
  • add test for doc generation (41526f8)
  • add test on cycle_gan_semantic_mask (3eeff76)
  • add tests for reference image dataloaders (ae6405e)
  • added D noise to CUT with semantics (31aa4a3)
  • added optimizers and options (505cac2)
  • allow control of projected discriminator interpolation (dbffec5)
  • allow ViT custom resolution at D projector init (82e6e83)
  • api: display current commit at startup (6f90be8)
  • aug: affine transforms for semantics (170b0f8)
  • aug: configurable online mask delta augmentation by x and y axis (dfa6459)
  • aug: select bbox category through the path sanitization functionality (a8d3f48)
  • auto download segformer weights (083cc5e)
  • backward while computing MSE criterion loss (1b87906)
  • bbox as sam prompt (a39c5bd)
  • bbox prompt for sam (1fa9cae)
  • bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494)
  • bw model export (8e43efa)
  • check code format when PR (eeb56cb)
  • choices for canny random thresholds (9573fc1)
  • class weights for semantic segmentation network with cross entropy (4274f1e)
  • classifier training on domain B (fa343c0)
  • commandline saving (6eb503e)
  • commandline script for joligan server calls (48ae23b)
  • compute_feats for unet G (9f1109e)
  • conditioning for palette (b9854ee)
  • config json for client script (174dce9)
  • context for D (b0d3c7b)
  • contrastive classifier noise (7193e0e)
  • contrastive loss for D (deb2ec4)
  • cut_semantic model (b20a943)
  • D accuracy (26ead91)
  • data: random noise in images for object insertion (42cf13d)
  • DDP (68f24da)
  • deceiving D for GAN training (2e2113f)
  • depth model as projector (10ffc28)
  • depth prediction and depth discriminator (01bc62b)
  • diff augment (054509c)
  • diffusion inference with old and new models (9c4c5a9)
  • display augmented images (2126253)
  • display test images (a1de083)
  • doc options auto update (1b08f92)
  • doc: add JSON config examples (5332213)
  • doc: basic server REST API (a757d17)
  • doc: datasets (dfe2343)
  • doc: DDPM conditioning training and inference examples (e694a29)
  • doc: models (be1fe34)
  • doc: refactored README with links to documentation (b5bf121)
  • doc: reference image conditioning (70aeb32)
  • doc: remove overview (2360527)
  • doc: server, client, docker (68a5b96)
  • doc: tips (3fea9ca)
  • doc: training (a4b720d)
  • doc: update inference models and examples (3c43a7b)
  • doc: updated FAQ (88b417c)
  • doc: updated model export (e692f78)
  • edge detection techniques (78202ea)
  • export for unet_mha (b4c3cfd)
  • extract bbox from img (fb64ef0)
  • first recut model (aaa4069)
  • first test (4ac8cd9)
  • fixed bbox size for online creation and bbox size randomization (5cd6227)
  • G weights export during training (4c045e6)
  • generic image augmentation (d2ceb81)
  • get_schema uses default instead of choices (not always available) (a779b2a)
  • global models (3819a40)
  • inverted mask for automatic background inpainting (39f9ff2)
  • itersize option for cut_model and cut_semantic_mask_model (b3b9e7a)
  • list all available models in help (06a6259)
  • load segformer torchscript weights (672b341)
  • loss values saved in json file and file to display it later (d0fa9a5)
  • madgrad optimizer (1c410f2)
  • metrics for testset (c875f2b)
  • miou compute for f_s pred (5851566)
  • ml: ability to train in wavelet space, similar to 2102.06108 (28077d4)
  • ml: added semantic threshold option (03f33a2)
  • ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d)
  • ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07)
  • ml: classifier-free guidance for image-to-image DDPM (a6425ef)
  • ml: control over image generation with diffusion in-painting model (0a5ed86)
  • ml: DDIM scheduler (443e7d7)
  • ml: diffusion super-resolution for palette (66c591f)
  • ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b)
  • ml: Discriminator based on SAM (e38d501)
  • ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e)
  • ml: efficient UNet / 2205.11487 (d690ab4)
  • ml: exponential moving average for G (ad33796)
  • ml: GAN mask generator with sam refined target (0cd1ee9)
  • ml: grayscale support for unaligned with semantics (a79548f)
  • ml: initial sam generated masks for unaligned datasets (5c64440)
  • ml: ittr generator control over the number of blocks (ea7390e)
  • ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e)
  • ml: L1 loss for palette diffusion model (d2bb2a5)
  • ml: Lion optimizer (4b99204)
  • ml: mask conditioning for palette (8e71bb7)
  • ml: mask generation across domains for GAN semantics (cf4890b)
  • ml: optim weight decay parameter control (34fb2dd)
  • ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44)
  • ml: torchvision model support for semantic classification (341205c)
  • ml: UNet/UViT resnet block finer control (9bb979c)
  • ml: UViT for GAN (b1b3607)
  • ml: ViT-14L as projected discriminator (dab27d2)
  • mobile attention resnet (7b8fd87)
  • MobileSam implementation (ea11745)
  • MobileSam implementation (e0d3a67)
  • more options for image generation with diffusion (cd08411)
  • MSE identity loss for cut models (28bfbfd)
  • multi head attention unet generator (9e4c232)
  • multimodal generator with cLR-GAN strategy (c3919b9)
  • multiple D support (578f709)
  • multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44)
  • multiscale loss for diff (0af1a80)
  • nb of images used for FID computation can be chosen (233634b)
  • online dataset creation (9883eac)
  • option for number of inferences (palette model) (7e8bcc9)
  • option to choose metrics to compute (ab1074a)
  • option to choose norm in unet_mha (77d0161)
  • option to compute semantic G loss on f_s output (9ef922d)
  • option to display real image - fake image (b8a134a)
  • option to go through resnet blocks twice (backward compatibility) (1b3330b)
  • option to select embedding network for ref conditioning (2c9933d)
  • option to use a different f_s for domain B (6c55c26)
  • options for aspect ratio and augmentation of cond image (7bf9c95)
  • options.rst gen script (e13a9d8)
  • precommit black formatting (ef0c883)
  • pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a)
  • previous frame as y cond in palette (99de095)
  • projected discriminator (92c0aa0)
  • quickstart (0aa50e4)
  • quickstart update ddpm (98cef11)
  • random perspective and bilinear interpolation with diffaug (d2633c8)
  • readthedocs initial commit (2a57b6f)
  • reference image conditioning (8618129)
  • rename cur_mask (53566c6)
  • reorganizing display of images (bed57bd)
  • resnet G for diffusion (6d1cddc)
  • resnet_attn for cut model (7d01972)
  • sam for mask refinement and edges (7e55165)
  • sanitized paths can be saved and loaded (2f9f42f)
  • save latest images (1f5fb69)
  • saving options json file (a9452ce)
  • script to compute metrics (f5ab623)
  • script to generate diffusion video (574e149)
  • script to remove useless models weights (543f280)
  • scripts to export generator to JIT and to transform a single image (f3f5c37)
  • second D for all models (7ec82be)
  • segformer can be used for attention masks generation (98a3d6e)
  • semantic loss on identity (91df963)
  • server: add commit number in swagger documentation (3c96bc0)
  • server: add synchronous training (5e3291e)
  • server: group options into categories (9358007)
  • set minimum context via crop/bbox ratio (094b213)
  • single bbox per crop for inpainting while training (ac2b4d6)
  • sphinx rtd theme initial (6b2ac14)
  • stylegan network from CUT repo (9e36c55)
  • support for amp on forward ops (dffbb36)
  • support TensorRT generator models in inference with DeepDetect (980ad49)
  • sync loss only when printing (ff8de5e)
  • temporal criterion (c0dbf78)
  • temporal discriminator (154da72)
  • temp: reduce image size during test (7a7e4a3)
  • tests with different f_s net (8d99c72)
  • tf32 support (with Ampere GPUs) (c65622c)
  • timm models as feature extractor for projected D (498a11c)
  • use label 0 in domain B for object removal (d3e2399)
  • use the trained discriminator to rank generated images (65d6416)
  • use torchvision.transforms for differentiable augmentations (dc46733)
  • UViT for diffusion (b427fa8)
  • working inference tutorials (23c8ab6)

Bug Fixes

  • 1 iteration = 1 image (0b096f8)
  • adam/radam beta1 default value (db0e2dd)
  • add clamping in tensor2im function (9537eac)
  • add einops to requirements_github_actions (b30895d)
  • add exec flag to run.sh (f6350d7)
  • add missing architectures to options doc (5c31e1f)
  • add missing test file (9e520cd)
  • add mmcv to requirements for doc generation (2c8ef04)
  • add mmseg to requirements for doc generation (9d83ffe)
  • add padding type to jit export (ae3e23c)
  • add phase to dataloaders arg (3992136)
  • add reflectionpad to fix resnet (e702930)
  • add tqdm to requirements (6e34f72)
  • add type for nb_mask options (7bce61c)
  • add vision aided to doc auto requirements (5fbb13f)
  • add wget to github action requirements (ad7b3d7)
  • all_classes_as_one option (df01988)
  • allow for domain B to have no semantic labels (42604aa)
  • allow for image reading failures (d7f5d08)
  • allow to freeze networks with DDP (d12bd7e)
  • alternate optimizing G and E (d76ba1b)
  • APA_img is only added to visual names when APA option is true (309592e)
  • attention masks display (cc5b6b8)
  • auto resize of attention masks (e3bf86a)
  • avoid lambda in RandomImgAug that prevents DDP (a206f91)
  • B label clamping override + display colormap (6fc1dd4)
  • batchnorm for single gpu (93b164c)
  • bbox and mask for diffusion generation script (390fe5d)
  • bbox ref idx from crop_image return in inference code (be0a1b5)
  • broken export onnx script (8080036)
  • broken test + bbox_ref_id option in diffusion single image generation script (179b3ca)
  • build docker file (c055385)
  • catch image reading errors (0153eb4)
  • circumvent isTrain in base options (2eb905e)
  • clean code (b83282b)
  • cls net was created with f_s options (70e9009)
  • cls_class_weights default to [] (ad719f6)
  • computation of out_mask loss when multigpu (31b7294)
  • compute_temporal_fake defined twice (83e0d2d)
  • cond inference (77c0444)
  • context pixel for sanitize paths (d18cee9)
  • correct urls in quickstart_gan.rst (580353f)
  • cpu device for single image generation script (bb3c70c)
  • criterion temporal loss display (48d4c27)
  • criterionIdt is not used by cut (3665410)
  • crop_image for non square images and large bboxes (afdde75)
  • cyclegan semantic mask failure with f_s (0688974)
  • D global requires grad were not set true before backward (6a55914)
  • D noise in cut base model (c6a2114)
  • D_global optimization (284d8ae)
  • data dept init (8a0707e)
  • dataaug_D_diffusion option (70628d1)
  • dataset created once, one dataloader per gpu (70348ff)
  • DDIM restoration when batch_size > 1 (ccb445b)
  • ddpm inference with sam masks + inverted masks (3055910)
  • default --cls_config_segformer value (caadf7a)
  • default nb_mask_input value (632a0f6)
  • dependencies to docker build file (76eb760)
  • diff aug options (a337990)
  • diffusion generation script is already multimodal (d49e66a)
  • diffusion inference with mask_in (dea27ac)
  • diffusion video generation fixes to generate API changes (2d74f7a)
  • diffusion with temporal data loader (c328acb)
  • display for palette model (2a98cf7)
  • doc_gen for palette model (32ef0fa)
  • doc: badly formatted title (8b26c87)
  • docker server source image (b29eb4f)
  • doc: missing losses (f74c373)
  • doc: missing models (a9bddd4)
  • doc: remove newlines and fix path (2f0aa5a)
  • doc: title format (f580d88)
  • export & inference scripts for new encoder/decoder architectures (7993544)
  • f_s backward when multigpu (f8b6067)
  • f_s zero_grad in cut_semantic_mask (0e1c2b3)
  • feats went through resnet blocks 2 times and rm old code (7d768b3)
  • fid for cut and for relative path (57dbabf)
  • flush print() in joligan_api.py (0880d85)
  • force RGB in online creation (15efa68)
  • force torch to stable 1.13.1 (7fffaf0)
  • gan_networks import for inference (09f2969)
  • generate doc script (01f25af)
  • generation script for non multimodal models (a2ff20b)
  • generation scripts help (d1102ff)
  • get schema for * options (d7e33eb)
  • get schema should not compute base_gan_model (5dd6662)
  • get_feats feature extractor for projected D (212751d)
  • guidance scale option for diffusion inference (846e4f5)
  • help command (727fc15)
  • help for diffusion inference script (d85fcc9)
  • image generation script with torch model (4b67c05)
  • image size control in generation script (01ad67d)
  • image size is unused in gen_single_image (b7ea7f4)
  • img visu for palette models (668b253)
  • imgaug option with no mask (f09d484)
  • import signal module (e0d1c52)
  • improved diffusion inference scripts, including video generation (485fad9)
  • in dataloader, warning when class > nclasses (e9231eb)
  • in place gradient error with f_s (283cb85)
  • inference beta_schedule location (ca910d8)
  • inference when no mask cond (e19505d)
  • inference_num for reference img (e444ba1)
  • input_B_label loading when available (83490b3)
  • ints are considered as floats if needed and warning if remaining keys in options (ad6d99e)
  • inverted BtoA direction labels, moving direction to datasets (4ac2845)
  • jenkins docker perms (ebad13f)
  • jit and onnx export scripts post refactor (32e0912)
  • joligan api launch training (5ea68a4)
  • lambda for GAN losses (020435a)
  • license year (2424de0)
  • loading image with transparency (3bcc7fd)
  • loss values averaged across gpus and compute last step (f941595)
  • macs to flops (69a02db)
  • make aligned dataset work again (61741cc)
  • mask and class conditioning in generation script (38631b8)
  • mask clamping and display for diff models (4b24fe4)
  • mask display when more than one class (fa21b2e)
  • mask_delta checks (ef095f1)
  • mask_delta for inference diffusion (69bf8ef)
  • mask_delta ratio online check (341dd73)
  • missing get_weights for f_s (835ab4f)
  • missing help on projected D arch (c65f2d2)
  • missing input_nc in f_s segformer constructor (309f75f)
  • missing shape attribute in conditional (9e3d365)
  • ml: allow semantic loss weight control (8e552be)
  • ml: allow torchvision semantic model backbones to work with bw images (d112532)
  • ml: control of cls semantic classifier learning rate (be566c6)
  • ml: DDIM with reference image (f06e47c)
  • ml: diffusion schedule with generation script (4c26b09)
  • ml: semantic regression loss tensor dims (1ddedc6)
  • mobile netG option (4c3db79)
  • modify command line for palette model (2287054)
  • multigpu for cut model (ad95e67)
  • multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a)
  • multigpu for cut_semantic_mask model (4fe3ee2)
  • multimodal GAN requires retain_graph for now (7be1628)
  • nb attn variable after refactor (3e33fcc)
  • nce layers for ittr generator (916b475)
  • NCE with segformer architecture (c037912)
  • netF is initialized on the right gpu (7856bd9)
  • netF weights updating (a75544e)
  • network loading (dcb25d7)
  • networks groups inherit from other models (f988002)
  • new options names for unaligned_labeled_mask (6fed55c)
  • no need for retain_graph to be True on losses (b715638)
  • nvidia key rotation (eadd702)
  • online creation and selfsupervised (42dd435)
  • online creation for temporal dataloader (9cf3d0a)
  • online creation when cropsize > imgsize (c2fc34e)
  • online mask loading with multiple bboxes (1d7a6e1)
  • online mask padding (90d7444)
  • only load testset on gpu 0 (abcb4c4)
  • only one gpu visible in each process (4954717)
  • only ONNX export for segformer G (543ce28)
  • onnx export (e9d36a8)
  • option to not set device when parsing json config file (092b5f8)
  • option type and max int value (d7f60f2)
  • out mask images can be computed with more than 2 semantic classes (36d54b9)
  • output_display_type default value should be a list (d694d29)
  • palette inference without ref image (02d3f76)
  • projection interpolation at init (2aecf46)
  • proper categories titles in schema + improve option saving (bbc41c6)
  • python html lib override (be2b089)
  • random image aug class argument (2d8f686)
  • random offset missing in crop_image (1fc4bf4)
  • real image f_s pred name (d40109c)
  • remove environment.yml (783cae8)
  • remove eval mode for export (9bdae8d)
  • remove last batch norm to cls for batch size 1 (d7fe727)
  • remove use_resize from segformer forward (c71d4de)
  • requirements for github actions (6227c06)
  • requirements for github actions (4fe34ce)
  • resnet attn class call (17f907f)
  • resnet default number of blocks with diffusion (3d34368)
  • resnet with attention class name (950237d)
  • resnet_attn model name (eb4216e)
  • resnet_attn option (08b61ea)
  • reverse DDIM schedule (17ba96a)
  • round pixel gap before offset sampling (05e8959)
  • rst autodoc (5309cc2)
  • rst format and typos (f20f264)
  • sam compatible with mask inversion in palette (4294679)
  • sam inference as discriminator (5708832)
  • sample runs and default options (f08d132)
  • save generated bbox only when useful in diffusion inference (4b93cf3)
  • save metrics plots and allow resuming (095f040)
  • save model every epoch is default (51c5d34)
  • save networks img for diffusion (090df33)
  • saving paths_sanitized in checkpoints dir (49c5139)
  • script for image gen using diffusion (e909525)
  • segformer as f_s net (d55d63d)
  • segformer for semantics with optional partially pre-trained model (3078e51)
  • segformer G and segformer feature extractor (85171e1)
  • segformer ONNX export and image generation script (c5a700b)
  • selfsupervised dataloaders don't need domain B anymore (257a056)
  • sem compute for cls on identity (6a87858)
  • semantic losses was initialized twice (3fb85ab)
  • server launching in generate api doc (f459842)
  • server: wrong import (cb4087e)
  • smaller batchsize at end of epochs (c49ddd1)
  • smaller miou interval to trigger computing (76382d0)
  • softmax G semantic was applied twice (5c4c3ec)
  • start super-resolution restoration from noise (3bce4a9)
  • stylegan2 feature extraction (e442021)
  • support for no mask in B domain (6dd98d6)
  • sv network as latest even with train_save_by_iter (9016d3f)
  • temp: norm default (d95d10f)
  • temporal criterion loss compute (4ddce8b)
  • temporal D and context pixels compatibility (421f458)
  • temporal dataloader, with added path sanitize (6451db2)
  • temporal dataloading and loss computing (07706d3)
  • temporal discriminator with masks and bboxes (43015da)
  • temporal end of sequence (e08ffba)
  • temporal mse with GAN models (79b0d14)
  • test functions names (2247f18)
  • tests (fabd8b6)
  • tests for all f_s net (144d7fd)
  • tests with no cache (787f27b)
  • training examples with mask_delta (9b14339)
  • training examples with mask_delta (8d08839)
  • typo (4274784)
  • typo (f1dcff7)
  • typo in README.md (f5ef15e)
  • typos (5b061f2)
  • typos and class name (6462374)
  • unaligned data_dataset_mode in README (5e7fbd9)
  • underflow in custom cut l2norm, replaced by torch built-in (ff42852)
  • UNet/UViT layers for cut NCE (8459876)
  • unset index_B for unaligned mask offline datasets (8ba5e12)
  • update python version for pre-commit (900ab17)
  • Update README.md (cb9031d)
  • Update README.md (7e5fe82)
  • use only mask from selected bbox at inference time (dd4fe3b)
  • use signal handler to kill all the processes (fab93ef)
  • using cut with F as pure sampling function (887b9a5)
  • validation with no label, e.g. simple cut model (53984f4)
  • visdom autostart and no display (fbc9942)
  • visdom port for server launching (747d18a)
  • visuals for cyclegan (b7cabaa)
  • vitclip16 requires 224 input size (8b8bb67)
  • wrong D_noise option (eaea415)
  • wrong feature network in projected (3318c77)
  • wrong indentation (bac6ce2)
  • wrong number of model forward signature parameters check (868b55b)
  • wrong option for data sanitize (479242e)
  • wrong option name in test and add comment in json option loading (1a7abd6)
  • wrong option prefix (711fa7d)
  • wrong options (4298071)
  • wrong options for validation loading and fid computing (a9d2516)
  • wrong train_compute_fid option (893c0e2)
  • wrong values for bbox ref coordinates (dc2a00f)

2.0.0 (2023-10-06)

Features

  • add a server endpoint to delete files (30b2143)
  • add choices for all options (ed43b82)
  • add ddim inference (0196134)
  • add DDPM tutorial on the VITON-HD dataset (c932d73)
  • add FastAPI server to run training (f517462)
  • add lambda for semantic losses (aab53fe)
  • add LPIPS metric (f1e0526)
  • add miou compute to tests (c0033ef)
  • add new metrics (f3c84cd)
  • add palette model (b7db294)
  • add psnr metric (7135458)
  • add sampling options to test (a2958dc)
  • add SRC and hDCE losses (ddfcc97)
  • add test for doc generation (41526f8)
  • add test on cycle_gan_semantic_mask (3eeff76)
  • add tests for reference image dataloaders (ae6405e)
  • added D noise to CUT with semantics (31aa4a3)
  • added optimizers and options (505cac2)
  • allow control of projected discriminator interpolation (dbffec5)
  • allow ViT custom resolution at D projector init (82e6e83)
  • api: display current commit at startup (6f90be8)
  • aug: affine transforms for semantics (170b0f8)
  • aug: configurable online mask delta augmentation by x and y axis (dfa6459)
  • aug: select bbox category through the path sanitization functionality (a8d3f48)
  • auto download segformer weights (083cc5e)
  • backward while computing MSE criterion loss (1b87906)
  • bbox as sam prompt (a39c5bd)
  • bbox prompt for sam (1fa9cae)
  • bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494)
  • bw model export (8e43efa)
  • check code format when PR (eeb56cb)
  • choices for canny random thresholds (9573fc1)
  • class weights for semantic segmentation network with cross entropy (4274f1e)
  • classifier training on domain B (fa343c0)
  • commandline saving (6eb503e)
  • commandline script for joligan server calls (48ae23b)
  • compute_feats for unet G (9f1109e)
  • conditioning for palette (b9854ee)
  • config json for client script (174dce9)
  • context for D (b0d3c7b)
  • contrastive classifier noise (7193e0e)
  • contrastive loss for D (deb2ec4)
  • cut_semantic model (b20a943)
  • D accuracy (26ead91)
  • data: random noise in images for object insertion (42cf13d)
  • DDP (68f24da)
  • deceiving D for GAN training (2e2113f)
  • depth model as projector (10ffc28)
  • depth prediction and depth discriminator (01bc62b)
  • diff augment (054509c)
  • diffusion inference with old and new models (9c4c5a9)
  • display augmented images (2126253)
  • display test images (a1de083)
  • doc options auto update (1b08f92)
  • doc: add JSON config examples (5332213)
  • doc: basic server REST API (a757d17)
  • doc: datasets (dfe2343)
  • doc: DDPM conditioning training and inference examples (e694a29)
  • doc: models (be1fe34)
  • doc: refactored README with links to documentation (b5bf121)
  • doc: reference image conditioning (70aeb32)
  • doc: remove overview (2360527)
  • doc: server, client, docker (68a5b96)
  • doc: tips (3fea9ca)
  • doc: training (a4b720d)
  • doc: update inference models and examples (3c43a7b)
  • doc: updated FAQ (88b417c)
  • doc: updated model export (e692f78)
  • edge detection techniques (78202ea)
  • export for unet_mha (b4c3cfd)
  • extract bbox from img (fb64ef0)
  • first recut model (aaa4069)
  • first test (4ac8cd9)
  • fixed bbox size for online creation and bbox size randomization (5cd6227)
  • G weights export during traing (4c045e6)
  • generic image augmentation (d2ceb81)
  • get_schema uses default instead of choices (not always available) (a779b2a)
  • global models (3819a40)
  • inverted mask for automatic background inpainting (39f9ff2)
  • itersize option for cut_model and cut_semantic_mask_model (b3b9e7a)
  • list all available models in help (06a6259)
  • load segformer torchscript weights (672b341)
  • loss values saved in json file and file to display it later (d0fa9a5)
  • madgrad optimizer (1c410f2)
  • metrics for testset (c875f2b)
  • miou compute for f_s pred (5851566)
  • ml: ability to train in wavelet space, similar to 2102.06108 (28077d4)
  • ml: added semantic threshold option (03f33a2)
  • ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d)
  • ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07)
  • ml: classifier-free guidance for image-to-image DDPM (a6425ef)
  • ml: controle over image generation with diffusion in-painting model (0a5ed86)
  • ml: DDIM scheduler (443e7d7)
  • ml: diffusion super-resolution for palette (66c591f)
  • ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b)
  • ml: Discriminator based on SAM (e38d501)
  • ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e)
  • ml: efficient UNet / 2205.11487 (d690ab4)
  • ml: exponential moving average for G (ad33796)
  • ml: GAN mask generator with sam refined target (0cd1ee9)
  • ml: grayscale support for unaligned with semantics (a79548f)
  • ml: initial sam generated masks for unaligned datasets (5c64440)
  • ml: ittr generator control over the number of blocks (ea7390e)
  • ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e)
  • ml: L1 loss for palette diffusion model (d2bb2a5)
  • ml: Lion optimizer (4b99204)
  • ml: mask conditioning for palette (8e71bb7)
  • ml: mask generation across domains for GAN semantics (cf4890b)
  • ml: optim weigth decay parameter control (34fb2dd)
  • ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44)
  • ml: torchvision model support for semantic classification (341205c)
  • ml: UNet/UViT resnet block finer control (9bb979c)
  • ml: UViT for GAN (b1b3607)
  • ml: ViT-14L as projected discriminator (dab27d2)
  • mobile attention resnet (7b8fd87)
  • MobileSam implementation (ea11745)
  • MobileSam implementation (e0d3a67)
  • more options for image generation with diffusion (cd08411)
  • MSE identity loss for cut models (28bfbfd)
  • multi head attention unet generator (9e4c232)
  • multimodal generator with cLR-GAN strategy (c3919b9)
  • multiple D support (578f709)
  • multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44)
  • multiscale loss for diff (0af1a80)
  • nb of images used for FID computation can be chosen (233634b)
  • online dataset creation (9883eac)
  • option for number of inferences (palette model) (7e8bcc9)
  • option to choose metrics to compute (ab1074a)
  • option to choose norm in unet_mha (77d0161)
  • option to compute semantic G loss on f_s output (9ef922d)
  • option to display real image - fake image (b8a134a)
  • option to go through resnet blocks twice (backward compatibility) (1b3330b)
  • option to select embedding network for ref conditioning (2c9933d)
  • option to use a different f_s for domain B (6c55c26)
  • options for aspect ratio and augmentation of cond image (7bf9c95)
  • options.rst gen script (e13a9d8)
  • precommit black formatting (ef0c883)
  • pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a)
  • previous frame as y cond in palette (99de095)
  • projected discriminator (92c0aa0)
  • quickstart (0aa50e4)
  • quickstart update ddpm (98cef11)
  • random perspective and bilinear interpolation with diffaug (d2633c8)
  • readthedocs initial commit (2a57b6f)
  • reference image conditioning (8618129)
  • rename cur_mask (53566c6)
  • reorganizing display of images (bed57bd)
  • resnet G for diffusion (6d1cddc)
  • resnet_attn for cut model (7d01972)
  • sam for mask refinement and edges (7e55165)
  • sanitized paths can be saved and loaded (2f9f42f)
  • save latest images (1f5fb69)
  • saving options json file (a9452ce)
  • script to compute metrics (f5ab623)
  • script to generate diffusion video (574e149)
  • script to remove useless models weights (543f280)
  • scripts to export generator to JIT and to transform a single image (f3f5c37)
  • second D for all models (7ec82be)
  • segformer can be used for attention masks generation (98a3d6e)
  • semantic loss on identity (91df963)
  • server: add commit number in swagger documentation (3c96bc0)
  • server: add synchronous training (5e3291e)
  • server: group options into categories (9358007)
  • set mininum context via crop/bbox ratio (094b213)
  • single bbox per crop for inpainting while training (ac2b4d6)
  • sphinx rtd theme initial (6b2ac14)
  • stylegan network from CUT repo (9e36c55)
  • support for amp on forward ops (dffbb36)
  • support TensorRT generator models in inference with DeepDetect (980ad49)
  • sync loss only when printing (ff8de5e)
  • temporal criterion (c0dbf78)
  • temporal discriminator (154da72)
  • temp: reduce image size during test (7a7e4a3)
  • tests with different f_s net (8d99c72)
  • tf32 support (with Ampere GPUs) (c65622c)
  • timm models as feature extractor for projected D (498a11c)
  • use label 0 in domain B for object removal (d3e2399)
  • use the trained discriminator to rank generated images (65d6416)
  • use torchvision.transforms for differentiable augmentations (dc46733)
  • UViT for diffusion (b427fa8)
  • working inference tutorials (23c8ab6)

Bug Fixes

  • 1 iteration = 1 image (0b096f8)
  • adam/radam beta1 default value (db0e2dd)
  • add clamping in tensor2im function (9537eac)
  • add einops to requirements_github_actions (b30895d)
  • add exec flag to run.sh (f6350d7)
  • add missing architectures to options doc (5c31e1f)
  • add missing test file (9e520cd)
  • add mmcv to requirements for doc generation (2c8ef04)
  • add mmseg to requirements for doc generation (9d83ffe)
  • add padding type to jit export (ae3e23c)
  • add phase to dataloaders arg (3992136)
  • add reflectionpad to fix resnet (e702930)
  • add tqdm to requirements (6e34f72)
  • add type for nb_mask options (7bce61c)
  • add vision aided to doc auto requirements (5fbb13f)
  • add wget to github action requirements (ad7b3d7)
  • all_classes_as_one option (df01988)
  • allow for domain B to have no semantic labels (42604aa)
  • allow for image reading failures (d7f5d08)
  • allow to freeze networks with DDP (d12bd7e)
  • alternate optimizing G and E (d76ba1b)
  • APA_img is only added to visual names APA option is true (309592e)
  • attention masks display (cc5b6b8)
  • auto resize of attention masks (e3bf86a)
  • avoid lambda in RandomImgAug that prevents DDP (a206f91)
  • B label clamping override + display colormap (6fc1dd4)
  • batchnorm for single gpu (93b164c)
  • bbox and mask for diffusion generation script (390fe5d)
  • bbox ref idx from crop_image return in inference code (be0a1b5)
  • broken export onnx script (8080036)
  • broken test + bbox_ref_id option in diffusion single image generation script (179b3ca)
  • build docker file (c055385)
  • catch image reading errors (0153eb4)
  • circumvent isTrain in base options (2eb905e)
  • clean code (b83282b)
  • cls net was created with f_s options (70e9009)
  • cls_class_weights default to [] (ad719f6)
  • computation of out_mask loss when multigpu (31b7294)
  • compute_temporal_fake defined twice (83e0d2d)
  • cond inference (77c0444)
  • context pixel for sanitize paths (d18cee9)
  • correct urls in quickstart_gan.rst (580353f)
  • cpu device for single image generation script (bb3c70c)
  • criterion temporal loss display (48d4c27)
  • criterionIdt is not used by cut (3665410)
  • crop_image for non square images and large bboxes (afdde75)
  • cyclegan semantic mask failure with f_s (0688974)
  • D global requires grad were not set true before backward (6a55914)
  • D noise in cut base model (c6a2114)
  • D_global optimization (284d8ae)
  • data-dependent init (8a0707e)
  • dataaug_D_diffusion option (70628d1)
  • dataset created once, one dataloader per gpu (70348ff)
  • DDIM restoration when batch_size > 1 (ccb445b)
  • ddpm inference with sam masks + inverted masks (3055910)
  • default --cls_config_segformer value (caadf7a)
  • default nb_mask_input value (632a0f6)
  • dependencies to docker build file (76eb760)
  • diff aug options (a337990)
  • diffusion generation script is already multimodal (d49e66a)
  • diffusion inference with mask_in (dea27ac)
  • diffusion video generation fixes for generate API changes (2d74f7a)
  • diffusion with temporal data loader (c328acb)
  • display for palette model (2a98cf7)
  • doc_gen for palette model (32ef0fa)
  • doc: badly formatted title (8b26c87)
  • docker server source image (b29eb4f)
  • doc: missing losses (f74c373)
  • doc: missing models (a9bddd4)
  • doc: remove newlines and fix path (2f0aa5a)
  • doc: title format (f580d88)
  • export & inference scripts for new encoder/decoder architectures (7993544)
  • f_s backward when multigpu (f8b6067)
  • f_s zero_grad in cut_semantic_mask (0e1c2b3)
  • feats went through resnet blocks twice; removed old code (7d768b3)
  • fid for cut and for relative path (57dbabf)
  • flush print() in joligan_api.py (0880d85)
  • force RGB in online creation (15efa68)
  • force torch to stable 1.13.1 (7fffaf0)
  • gan_networks import for inference (09f2969)
  • generate doc script (01f25af)
  • generation script for non multimodal models (a2ff20b)
  • generation scripts help (d1102ff)
  • get schema for * options (d7e33eb)
  • get schema should not compute base_gan_model (5dd6662)
  • get_feats feature extractor for projected D (212751d)
  • guidance scale option for diffusion inference (846e4f5)
  • help command (727fc15)
  • help for diffusion inference script (d85fcc9)
  • image generation script with torch model (4b67c05)
  • image size control in generation script (01ad67d)
  • image size is unused in gen_single_image (b7ea7f4)
  • img visu for palette models (668b253)
  • imgaug option with no mask (f09d484)
  • import signal module (e0d1c52)
  • improved diffusion inference scripts, including video generation (485fad9)
  • in dataloader, warning when class > nclasses (e9231eb)
  • in place gradient error with f_s (283cb85)
  • inference beta_schedule location (ca910d8)
  • inference when no mask cond (e19505d)
  • inference_num for reference img (e444ba1)
  • input_B_label loading when available (83490b3)
  • ints are treated as floats if needed, with a warning on remaining keys in options (ad6d99e)
  • inverted BtoA direction labels, moving direction to datasets (4ac2845)
  • jenkins docker perms (ebad13f)
  • jit and onnx export scripts post refactor (32e0912)
  • joligan api launch training (5ea68a4)
  • lambda for GAN losses (020435a)
  • license year (2424de0)
  • loading image with transparency (3bcc7fd)
  • loss values averaged across GPUs and computed at last step (f941595)
  • macs to flops (69a02db)
  • make aligned dataset work again (61741cc)
  • mask and class conditioning in generation script (38631b8)
  • mask clamping and display for diff models (4b24fe4)
  • mask display when more than one class (fa21b2e)
  • mask_delta checks (ef095f1)
  • mask_delta for inference diffusion (69bf8ef)
  • mask_delta ratio online check (341dd73)
  • missing get_weights for f_s (835ab4f)
  • missing help on projected D arch (c65f2d2)
  • missing input_nc in f_s segformer constructor (309f75f)
  • missing shape attribute in conditional (9e3d365)
  • ml: allow semantic loss weight control (8e552be)
  • ml: allow torchvision semantic model backbones to work with bw images (d112532)
  • ml: control of cls semantic classifier learning rate (be566c6)
  • ml: DDIM with reference image (f06e47c)
  • ml: diffusion schedule with generation script (4c26b09)
  • ml: semantic regression loss tensor dims (1ddedc6)
  • mobile netG option (4c3db79)
  • modify command line for palette model (2287054)
  • multigpu for cut model (ad95e67)
  • multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a)
  • multigpu for cut_semantic_mask model (4fe3ee2)
  • multimodal GAN requires retain_graph for now (7be1628)
  • nb attn variable after refactor (3e33fcc)
  • nce layers for ittr generator (916b475)
  • NCE with segformer architecture (c037912)
  • netF is initialized on the right gpu (7856bd9)
  • netF weights updating (a75544e)
  • network loading (dcb25d7)
  • networks groups inherit from other models (f988002)
  • new options names for unaligned_labeled_mask (6fed55c)
  • no need for retain_graph to be True on losses (b715638)
  • nvidia key rotation (eadd702)
  • online creation and selfsupervised (42dd435)
  • online creation for temporal dataloader (9cf3d0a)
  • online creation when cropsize > imgsize (c2fc34e)
  • online mask loading with multiple bboxes (1d7a6e1)
  • online mask padding (90d7444)
  • only load testset on gpu 0 (abcb4c4)
  • only one gpu visible in each process (4954717)
  • only ONNX export for segformer G (543ce28)
  • onnx export (e9d36a8)
  • option to not set device when parsing json config file (092b5f8)
  • option type and max int value (d7f60f2)
  • out mask images can be computed with more than 2 semantic classes (36d54b9)
  • output_display_type default value should be a list (d694d29)
  • palette inference without ref image (02d3f76)
  • projection interpolation at init (2aecf46)
  • proper categories titles in schema + improve option saving (bbc41c6)
  • python html lib override (be2b089)
  • random image aug class argument (2d8f686)
  • random offset missing in crop_image (1fc4bf4)
  • real image f_s pred name (d40109c)
  • remove environment.yml (783cae8)
  • remove eval mode for export (9bdae8d)
  • remove last batch norm to cls for batch size 1 (d7fe727)
  • remove use_resize from segformer forward (c71d4de)
  • requirements for github actions (6227c06)
  • requirements for github actions (4fe34ce)
  • resnet attn class call (17f907f)
  • resnet default number of blocks with diffusion (3d34368)
  • resnet with attention class name (950237d)
  • resnet_attn model name (eb4216e)
  • resnet_attn option (08b61ea)
  • reverse DDIM schedule (17ba96a)
  • round pixel gap before offset sampling (05e8959)
  • rst autodoc (5309cc2)
  • rst format and typos (f20f264)
  • sam compatible with mask inversion in palette (4294679)
  • sam inference as discriminator (5708832)
  • sample runs and default options (f08d132)
  • save generated bbox only when useful in diffusion inference (4b93cf3)
  • save metrics plots and allow resuming (095f040)
  • save model every epoch is default (51c5d34)
  • save networks img for diffusion (090df33)
  • saving paths_sanitized in checkpoints dir (49c5139)
  • script for image gen using diffusion (e909525)
  • segformer as f_s net (d55d63d)
  • segformer for semantics with optional partially pre-trained model (3078e51)
  • segformer G and segformer feature extractor (85171e1)
  • segformer ONNX export and image generation script (c5a700b)
  • selfsupervised dataloaders don't need domain B anymore (257a056)
  • sem compute for cls on identity (6a87858)
  • semantic losses were initialized twice (3fb85ab)
  • server launching in generate api doc (f459842)
  • server: wrong import (cb4087e)
  • smaller batchsize at end of epochs (c49ddd1)
  • smaller miou interval to trigger computing (76382d0)
  • softmax G semantic was applied twice (5c4c3ec)
  • start super-resolution restoration from noise (3bce4a9)
  • stylegan2 feature extraction (e442021)
  • support for no mask in B domain (6dd98d6)
  • save network as latest even with train_save_by_iter (9016d3f)
  • temp: norm default (d95d10f)
  • temporal criterion loss compute (4ddce8b)
  • temporal D and context pixels compatibility (421f458)
  • temporal dataloader, with added path sanitize (6451db2)
  • temporal dataloading and loss computing (07706d3)
  • temporal discriminator with masks and bboxes (43015da)
  • temporal end of sequence (e08ffba)
  • temporal mse with GAN models (79b0d14)
  • test functions names (2247f18)
  • tests (fabd8b6)
  • tests for all f_s net (144d7fd)
  • tests with no cache (787f27b)
  • training examples with mask_delta (9b14339)
  • training examples with mask_delta (8d08839)
  • typo (4274784)
  • typo (f1dcff7)
  • typo in README.md (f5ef15e)
  • typos (5b061f2)
  • typos and class name (6462374)
  • unaligned data_dataset_mode in README (5e7fbd9)
  • underflow in custom cut l2norm, replaced by torch built-in (ff42852)
  • UNet/UViT layers for cut NCE (8459876)
  • unset index_B for unaligned mask offline datasets (8ba5e12)
  • update python version for pre-commit (900ab17)
  • Update README.md (cb9031d)
  • Update README.md (7e5fe82)
  • use only mask from selected bbox at inference time (dd4fe3b)
  • use signal handler to kill all the processes (fab93ef)
  • using cut with F as pure sampling function (887b9a5)
  • validation with no label, e.g. simple cut model (53984f4)
  • visdom autostart and no display (fbc9942)
  • visdom port for server launching (747d18a)
  • visuals for cyclegan (b7cabaa)
  • vitclip16 requires 224 input size (8b8bb67)
  • wrong D_noise option (eaea415)
  • wrong feature network in projected (3318c77)
  • wrong indentation (bac6ce2)
  • wrong number of model forward signature parameters check (868b55b)
  • wrong option for data sanitize (479242e)
  • wrong option name in test and add comment in json option loading (1a7abd6)
  • wrong option prefix (711fa7d)
  • wrong options (4298071)
  • wrong options for validation loading and fid computing (a9d2516)
  • wrong train_compute_fid option (893c0e2)
  • wrong values for bbox ref coordinates (dc2a00f)
