joliGEN: Generative AI Toolset (Changelog)

1.0.0 (2023-10-06)

Docker

GPU (CUDA only): `docker pull docker.jolibrain.com/joligen_server:v1.0.0`
All images available from https://docker.jolibrain.com/#!/taglist/joligen_server

Features

add a server endpoint to delete files (30b2143)
add choices for all options (ed43b82)
add ddim inference (0196134)
add DDPM tutorial on the VITON-HD dataset (c932d73)
add FastAPI server to run training (f517462)
add lambda for semantic losses (aab53fe)
add LPIPS metric (f1e0526)
add miou compute to tests (c0033ef)
add new metrics (f3c84cd)
add palette model (b7db294)
add psnr metric (7135458)
add sampling options to test (a2958dc)
add SRC and hDCE losses (ddfcc97)
add test for doc generation (41526f8)
add test on cycle_gan_semantic_mask (3eeff76)
add tests for reference image dataloaders (ae6405e)
added D noise to CUT with semantics (31aa4a3)
added optimizers and options (505cac2)
allow control of projected discriminator interpolation (dbffec5)
allow ViT custom resolution at D projector init (82e6e83)
api: display current commit at startup (6f90be8)
aug: affine transforms for semantics (170b0f8)
aug: configurable online mask delta augmentation by x and y axis (dfa6459)
aug: select bbox category through the path sanitization functionality (a8d3f48)
auto download segformer weights (083cc5e)
backward while computing MSE criterion loss (1b87906)
bbox as sam prompt (a39c5bd)
bbox prompt for sam (1fa9cae)
bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494)
bw model export (8e43efa)
check code format when PR (eeb56cb)
choices for canny random thresholds (9573fc1)
class weights for semantic segmentation network with cross entropy (4274f1e)
classifier training on domain B (fa343c0)
commandline saving (6eb503e)
commandline script for joligan server calls (48ae23b)
compute_feats for unet G (9f1109e)
conditioning for palette (b9854ee)
config json for client script (174dce9)
context for D (b0d3c7b)
contrastive classifier noise (7193e0e)
contrastive loss for D (deb2ec4)
cut_semantic model (b20a943)
D accuracy (26ead91)
data: random noise in images for object insertion (42cf13d)
DDP (68f24da)
deceiving D for GAN training (2e2113f)
depth model as projector (10ffc28)
depth prediction and depth discriminator (01bc62b)
diff augment (054509c)
diffusion inference with old and new models (9c4c5a9)
display augmented images (2126253)
display test images (a1de083)
doc options auto update (1b08f92)
doc: add JSON config examples (5332213)
doc: basic server REST API (a757d17)
doc: datasets (dfe2343)
doc: DDPM conditioning training and inference examples (e694a29)
doc: models (be1fe34)
doc: refactored README with links to documentation (b5bf121)
doc: reference image conditioning (70aeb32)
doc: remove overview (2360527)
doc: server, client, docker (68a5b96)
doc: tips (3fea9ca)
doc: training (a4b720d)
doc: update inference models and examples (3c43a7b)
doc: updated FAQ (88b417c)
doc: updated model export (e692f78)
edge detection techniques (78202ea)
export for unet_mha (b4c3cfd)
extract bbox from img (fb64ef0)
first recut model (aaa4069)
first test (4ac8cd9)
fixed bbox size for online creation and bbox size randomization (5cd6227)
G weights export during training (4c045e6)
generic image augmentation (d2ceb81)
get_schema uses default instead of choices (not always available) (a779b2a)
global models (3819a40)
inverted mask for automatic background inpainting (39f9ff2)
itersize option for cut_model and cut_semantic_mask_model (b3b9e7a)
list all available models in help (06a6259)
load segformer torchscript weights (672b341)
loss values saved in json file and file to display it later (d0fa9a5)
madgrad optimizer (1c410f2)
metrics for testset (c875f2b)
miou compute for f_s pred (5851566)
ml: ability to train in wavelet space, similar to 2102.06108 (28077d4)
ml: added semantic threshold option (03f33a2)
ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d)
ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07)
ml: classifier-free guidance for image-to-image DDPM (a6425ef)
ml: control over image generation with diffusion in-painting model (0a5ed86)
ml: DDIM scheduler (443e7d7)
ml: diffusion super-resolution for palette (66c591f)
ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b)
ml: Discriminator based on SAM (e38d501)
ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e)
ml: efficient UNet / 2205.11487 (d690ab4)
ml: exponential moving average for G (ad33796)
ml: GAN mask generator with sam refined target (0cd1ee9)
ml: grayscale support for unaligned with semantics (a79548f)
ml: initial sam generated masks for unaligned datasets (5c64440)
ml: ittr generator control over the number of blocks (ea7390e)
ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e)
ml: L1 loss for palette diffusion model (d2bb2a5)
ml: Lion optimizer (4b99204)
ml: mask conditioning for palette (8e71bb7)
ml: mask generation across domains for GAN semantics (cf4890b)
ml: optim weight decay parameter control (34fb2dd)
ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44)
ml: torchvision model support for semantic classification (341205c)
ml: UNet/UViT resnet block finer control (9bb979c)
ml: UViT for GAN (b1b3607)
ml: ViT-14L as projected discriminator (dab27d2)
mobile attention resnet (7b8fd87)
MobileSam implementation (ea11745)
MobileSam implementation (e0d3a67)
more options for image generation with diffusion (cd08411)
MSE identity loss for cut models (28bfbfd)
multi head attention unet generator (9e4c232)
multimodal generator with cLR-GAN strategy (c3919b9)
multiple D support (578f709)
multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44)
multiscale loss for diff (0af1a80)
nb of images used for FID computation can be chosen (233634b)
online dataset creation (9883eac)
option for number of inferences (palette model) (7e8bcc9)
option to choose metrics to compute (ab1074a)
option to choose norm in unet_mha (77d0161)
option to compute semantic G loss on f_s output (9ef922d)
option to display real image - fake image (b8a134a)
option to go through resnet blocks twice (backward compatibility) (1b3330b)
option to select embedding network for ref conditioning (2c9933d)
option to use a different f_s for domain B (6c55c26)
options for aspect ratio and augmentation of cond image (7bf9c95)
options.rst gen script (e13a9d8)
precommit black formatting (ef0c883)
pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a)
previous frame as y cond in palette (99de095)
projected discriminator (92c0aa0)
quickstart (0aa50e4)
quickstart update ddpm (98cef11)
random perspective and bilinear interpolation with diffaug (d2633c8)
readthedocs initial commit (2a57b6f)
reference image conditioning (8618129)
rename cur_mask (53566c6)
reorganizing display of images (bed57bd)
resnet G for diffusion (6d1cddc)
resnet_attn for cut model (7d01972)
sam for mask refinement and edges (7e55165)
sanitized paths can be saved and loaded (2f9f42f)
save latest images (1f5fb69)
saving options json file (a9452ce)
script to compute metrics (f5ab623)
script to generate diffusion video (574e149)
script to remove useless models weights (543f280)
scripts to export generator to JIT and to transform a single image (f3f5c37)
second D for all models (7ec82be)
segformer can be used for attention masks generation (98a3d6e)
semantic loss on identity (91df963)
server: add commit number in swagger documentation (3c96bc0)
server: add synchronous training (5e3291e)
server: group options into categories (9358007)
set minimum context via crop/bbox ratio (094b213)
single bbox per crop for inpainting while training (ac2b4d6)
sphinx rtd theme initial (6b2ac14)
stylegan network from CUT repo (9e36c55)
support for amp on forward ops (dffbb36)
support TensorRT generator models in inference with DeepDetect (980ad49)
sync loss only when printing (ff8de5e)
temporal criterion (c0dbf78)
temporal discriminator (154da72)
temp: reduce image size during test (7a7e4a3)
tests with different f_s net (8d99c72)
tf32 support (with Ampere GPUs) (c65622c)
timm models as feature extractor for projected D (498a11c)
use label 0 in domain B for object removal (d3e2399)
use the trained discriminator to rank generated images (65d6416)
use torchvision.transforms for differentiable augmentations (dc46733)
UViT for diffusion (b427fa8)
working inference tutorials (23c8ab6)

Bug Fixes

1 iteration = 1 image (0b096f8)
adam/radam beta1 default value (db0e2dd)
add clamping in tensor2im function (9537eac)
add einops to requirements_github_actions (b30895d)
add exec flag to run.sh (f6350d7)
add missing architectures to options doc (5c31e1f)
add missing test file (9e520cd)
add mmcv to requirements for doc generation (2c8ef04)
add mmseg to requirements for doc generation (9d83ffe)
add padding type to jit export (ae3e23c)
add phase to dataloaders arg (3992136)
add reflectionpad to fix resnet (e702930)
add tqdm to requirements (6e34f72)
add type for nb_mask options (7bce61c)
add vision aided to doc auto requirements (5fbb13f)
add wget to github action requirements (ad7b3d7)
all_classes_as_one option (df01988)
allow for domain B to have no semantic labels (42604aa)
allow for image reading failures (d7f5d08)
allow to freeze networks with DDP (d12bd7e)
alternate optimizing G and E (d76ba1b)
APA_img is only added to visual names when APA option is true (309592e)
attention masks display (cc5b6b8)
auto resize of attention masks (e3bf86a)
avoid lambda in RandomImgAug that prevents DDP (a206f91)
B label clamping override + display colormap (6fc1dd4)
batchnorm for single gpu (93b164c)
bbox and mask for diffusion generation script (390fe5d)
bbox ref idx from crop_image return in inference code (be0a1b5)
broken export onnx script (8080036)
broken test + bbox_ref_id option in diffusion single image generation script (179b3ca)
build docker file (c055385)
catch image reading errors (0153eb4)
circumvent isTrain in base options (2eb905e)
clean code (b83282b)
cls net was created with f_s options (70e9009)
cls_class_weights default to [] (ad719f6)
computation of out_mask loss when multigpu (31b7294)
compute_temporal_fake defined twice (83e0d2d)
cond inference (77c0444)
context pixel for sanitize paths (d18cee9)
correct urls in quickstart_gan.rst (580353f)
cpu device for single image generation script (bb3c70c)
criterion temporal loss display (48d4c27)
criterionIdt is not used by cut (3665410)
crop_image for non square images and large bboxes (afdde75)
cyclegan semantic mask failure with f_s (0688974)
D global requires grad were not set true before backward (6a55914)
D noise in cut base model (c6a2114)
D_global optimization (284d8ae)
data dept init (8a0707e)
dataaug_D_diffusion option (70628d1)
dataset created once, one dataloader per gpu (70348ff)
DDIM restoration when batch_size > 1 (ccb445b)
ddpm inference with sam masks + inverted masks (3055910)
default --cls_config_segformer value (caadf7a)
default nb_mask_input value (632a0f6)
dependencies to docker build file (76eb760)
diff aug options (a337990)
diffusion generation script is already multimodal (d49e66a)
diffusion inference with mask_in (dea27ac)
diffusion video generation fixes to generate API changes (2d74f7a)
diffusion with temporal data loader (c328acb)
display for palette model (2a98cf7)
doc_gen for palette model (32ef0fa)
doc: badly formatted title (8b26c87)
docker server source image (b29eb4f)
doc: missing losses (f74c373)
doc: missing models (a9bddd4)
doc: remove newlines and fix path (2f0aa5a)
doc: title format (f580d88)
export & inference scripts for new encoder/decoder architectures (7993544)
f_s backward when multigpu (f8b6067)
f_s zero_grad in cut_semantic_mask (0e1c2b3)
feats went through resnet blocks 2 times and rm old code (7d768b3)
fid for cut and for relative path (57dbabf)
flush print() in joligan_api.py (0880d85)
force RGB in online creation (15efa68)
force torch to stable 1.13.1 (7fffaf0)
gan_networks import for inference (09f2969)
generate doc script (01f25af)
generation script for non multimodal models (a2ff20b)
generation scripts help (d1102ff)
get schema for * options (d7e33eb)
get schema should not compute base_gan_model (5dd6662)
get_feats feature extractor for projected D (212751d)
guidance scale option for diffusion inference (846e4f5)
help command (727fc15)
help for diffusion inference script (d85fcc9)
image generation script with torch model (4b67c05)
image size control in generation script (01ad67d)
image size is unused in gen_single_image (b7ea7f4)
img visu for palette models (668b253)
imgaug option with no mask (f09d484)
import signal module (e0d1c52)
improved diffusion inference scripts, including video generation (485fad9)
in dataloader, warning when class > nclasses (e9231eb)
in place gradient error with f_s (283cb85)
inference beta_schedule location (ca910d8)
inference when no mask cond (e19505d)
inference_num for reference img (e444ba1)
input_B_label loading when available (83490b3)
int are considered as float if needed and warning if remaining keys in options (ad6d99e)
inverted BtoA direction labels, moving direction to datasets (4ac2845)
jenkins docker perms (ebad13f)
jit and onnx export scripts post refactor (32e0912)
joligan api launch training (5ea68a4)
lambda for GAN losses (020435a)
license year (2424de0)
loading image with transparency (3bcc7fd)
loss values averaged across gpus and compute last step (f941595)
macs to flops (69a02db)
make aligned dataset work again (61741cc)
mask and class conditioning in generation script (38631b8)
mask clamping and display for diff models (4b24fe4)
mask display when more than one class (fa21b2e)
mask_delta checks (ef095f1)
mask_delta for inference diffusion (69bf8ef)
mask_delta ratio online check (341dd73)
missing get_weights for f_s (835ab4f)
missing help on projected D arch (c65f2d2)
missing input_nc in f_s segformer constructor (309f75f)
missing shape attribute in conditional (9e3d365)
ml: allow semantic loss weight control (8e552be)
ml: allow torchvision semantic model backbones to work with bw images (d112532)
ml: control of cls semantic classifier learning rate (be566c6)
ml: DDIM with reference image (f06e47c)
ml: diffusion schedule with generation script (4c26b09)
ml: semantic regression loss tensor dims (1ddedc6)
mobile netG option (4c3db79)
modify command line for palette model (2287054)
multigpu for cut model (ad95e67)
multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a)
multigpu for cut_semantic_mask model (4fe3ee2)
multimodal GAN requires retain_graph for now (7be1628)
nb attn variable after refactor (3e33fcc)
nce layers for ittr generator (916b475)
NCE with segformer architecture (c037912)
netF is initialized on the right gpu (7856bd9)
netF weights updating (a75544e)
network loading (dcb25d7)
networks groups inherit from other models (f988002)
new options names for unaligned_labeled_mask (6fed55c)
no need for retain_graph to be True on losses (b715638)
nvidia key rotation (eadd702)
online creation and selfsupervised (42dd435)
online creation for temporal dataloader (9cf3d0a)
online creation when cropsize > imgsize (c2fc34e)
online mask loading with multiple bboxes (1d7a6e1)
online mask padding (90d7444)
only load testset on gpu 0 (abcb4c4)
only one gpu visible in each process (4954717)
only onnx export for segformer G (543ce28)
onnx export (e9d36a8)
option to not set device when parsing json config file (092b5f8)
option type and max int value (d7f60f2)
out mask images can be computed with more than 2 semantic classes (36d54b9)
output_display_type default value should be a list (d694d29)
palette inference without ref image (02d3f76)
projection interpolation at init (2aecf46)
proper categories titles in schema + improve option saving (bbc41c6)
python html lib override (be2b089)
random image aug class argument (2d8f686)
random offset missing in crop_image (1fc4bf4)
real image f_s pred name (d40109c)
remove environment.yml (783cae8)
remove eval mode for export (9bdae8d)
remove last batch norm to cls for batch size 1 (d7fe727)
remove use_resize from segformer forward (c71d4de)
requirements for github actions (6227c06)
requirements for github actions (4fe34ce)
resnet attn class call (17f907f)
resnet default number of blocks with diffusion (3d34368)
resnet with attention class name (950237d)
resnet_attn model name (eb4216e)
resnet_attn option (08b61ea)
reverse DDIM schedule (17ba96a)
round pixel gap before offset sampling (05e8959)
rst autodoc (5309cc2)
rst format and typos (f20f264)
sam compatible with mask inversion in palette (4294679)
sam inference as discriminator (5708832)
sample runs and default options (f08d132)
save generated bbox only when useful in diffusion inference (4b93cf3)
save metrics plots and allow resuming (095f040)
save model every epoch is default (51c5d34)
save networks img for diffusion (090df33)
saving paths_sanitized in checkpoints dir (49c5139)
script for image gen using diffusion (e909525)
segformer as f_s net (d55d63d)
segformer for semantics with optional partially pre-trained model (3078e51)
segformer G and segformer feature extractor (85171e1)
segformer ONNX export and image generation script (c5a700b)
selfsupervised dataloaders don't need domain B anymore (257a056)
sem compute for cls on identity (6a87858)
semantic losses were initialized twice (3fb85ab)
server launching in generate api doc (f459842)
server: wrong import (cb4087e)
smaller batchsize at end of epochs (c49ddd1)
smaller miou interval to trigger computing (76382d0)
softmax G semantic was applied twice (5c4c3ec)
start super-resolution restoration from noise (3bce4a9)
stylegan2 feature extraction (e442021)
support for no mask in B domain (6dd98d6)
sv network as latest even with train_save_by_iter (9016d3f)
temp: norm default (d95d10f)
temporal criterion loss compute (4ddce8b)
temporal D and context pixels compatibility (421f458)
temporal dataloader, with added path sanitize (6451db2)
temporal dataloading and loss computing (07706d3)
temporal discriminator with masks and bboxes (43015da)
temporal end of sequence (e08ffba)
temporal mse with GAN models (79b0d14)
test functions names (2247f18)
tests (fabd8b6)
tests for all f_s net (144d7fd)
tests with no cache (787f27b)
training examples with mask_delta (9b14339)
training examples with mask_delta (8d08839)
typo (4274784)
typo (f1dcff7)
typo in README.md (f5ef15e)
typos (5b061f2)
typos and class name (6462374)
unaligned data_dataset_mode in README (5e7fbd9)
underflow in custom cut l2norm, replaced by torch built-in (ff42852)
UNet/UViT layers for cut NCE (8459876)
unset index_B for unaligned mask offline datasets (8ba5e12)
update python version for pre-commit (900ab17)
Update README.md (cb9031d)
Update README.md (7e5fe82)
use only mask from selected bbox at inference time (dd4fe3b)
use signal handler to kill all the processes (fab93ef)
using cut with F as pure sampling function (887b9a5)
validation with no label, e.g. simple cut model (53984f4)
visdom autostart and no display (fbc9942)
visdom port for server launching (747d18a)
visuals for cyclegan (b7cabaa)
vitclip16 requires 224 input size (8b8bb67)
wrong D_noise option (eaea415)
wrong feature network in projected (3318c77)
wrong indentation (bac6ce2)
wrong number of model forward signature parameters check (868b55b)
wrong option for data sanitize (479242e)
wrong option name in test and add comment in json option loading (1a7abd6)
wrong option prefix (711fa7d)
wrong options (4298071)
wrong options for validation loading and fid computing (a9d2516)
wrong train_compute_fid option (893c0e2)
wrong values for bbox ref coordinates (dc2a00f)

2.0.0 (2023-10-06)

Features

add a server endpoint to delete files (30b2143)
add choices for all options (ed43b82)
add ddim inference (0196134)
add DDPM tutorial on the VITON-HD dataset (c932d73)
add FastAPI server to run training (f517462)
add lambda for semantic losses (aab53fe)
add LPIPS metric (f1e0526)
add miou compute to tests (c0033ef)
add new metrics (f3c84cd)
add palette model (b7db294)
add psnr metric (7135458)
add sampling options to test (a2958dc)
add SRC and hDCE losses (ddfcc97)
add test for doc generation (41526f8)
add test on cycle_gan_semantic_mask (3eeff76)
add tests for reference image dataloaders (ae6405e)
added D noise to CUT with semantics (31aa4a3)
added optimizers and options (505cac2)
allow control of projected discriminator interpolation (dbffec5)
allow ViT custom resolution at D projector init (82e6e83)
api: display current commit at startup (6f90be8)
aug: affine transforms for semantics (170b0f8)
aug: configurable online mask delta augmentation by x and y axis (dfa6459)
aug: select bbox category through the path sanitization functionality (a8d3f48)
auto download segformer weights (083cc5e)
backward while computing MSE criterion loss (1b87906)
bbox as sam prompt (a39c5bd)
bbox prompt for sam (1fa9cae)
bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494)
bw model export (8e43efa)
check code format when PR (eeb56cb)
choices for canny random thresholds (9573fc1)
class weights for semantic segmentation network with cross entropy (4274f1e)
classifier training on domain B (fa343c0)
commandline saving (6eb503e)
commandline script for joligan server calls (48ae23b)
compute_feats for unet G (9f1109e)
conditioning for palette (b9854ee)
config json for client script (174dce9)
context for D (b0d3c7b)
contrastive classifier noise (7193e0e)
contrastive loss for D (deb2ec4)
cut_semantic model (b20a943)
D accuracy (26ead91)
data: random noise in images for object insertion (42cf13d)
DDP (68f24da)
deceiving D for GAN training (2e2113f)
depth model as projector (10ffc28)
depth prediction and depth discriminator (01bc62b)
diff augment (054509c)
diffusion inference with old and new models (9c4c5a9)
display augmented images (2126253)
display test images (a1de083)
doc options auto update (1b08f92)
doc: add JSON config examples (5332213)
doc: basic server REST API (a757d17)
doc: datasets (dfe2343)
doc: DDPM conditioning training and inference examples (e694a29)
doc: models (be1fe34)
doc: refactored README with links to documentation (b5bf121)
doc: reference image conditioning (70aeb32)
doc: remove overview (2360527)
doc: server, client, docker (68a5b96)
doc: tips (3fea9ca)
doc: training (a4b720d)
doc: update inference models and examples (3c43a7b)
doc: updated FAQ (88b417c)
doc: updated model export (e692f78)
edge detection techniques (78202ea)
export for unet_mha (b4c3cfd)
extract bbox from img (fb64ef0)
first recut model (aaa4069)
first test (4ac8cd9)
fixed bbox size for online creation and bbox size randomization (5cd6227)
G weights export during training (4c045e6)
generic image augmentation (d2ceb81)
get_schema uses default instead of choices (not always available) (a779b2a)
global models (3819a40)
inverted mask for automatic background inpainting (39f9ff2)
itersize option for cut_model and cut_semantic_mask_model (b3b9e7a)
list all available models in help (06a6259)
load segformer torchscript weights (672b341)
loss values saved in json file and file to display it later (d0fa9a5)
madgrad optimizer (1c410f2)
metrics for testset (c875f2b)
miou compute for f_s pred (5851566)
ml: ability to train in wavelet space, similar to 2102.06108 (28077d4)
ml: added semantic threshold option (03f33a2)
ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d)
ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07)
ml: classifier-free guidance for image-to-image DDPM (a6425ef)
ml: control over image generation with diffusion in-painting model (0a5ed86)
ml: DDIM scheduler (443e7d7)
ml: diffusion super-resolution for palette (66c591f)
ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b)
ml: Discriminator based on SAM (e38d501)
ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e)
ml: efficient UNet / 2205.11487 (d690ab4)
ml: exponential moving average for G (ad33796)
ml: GAN mask generator with sam refined target (0cd1ee9)
ml: grayscale support for unaligned with semantics (a79548f)
ml: initial sam generated masks for unaligned datasets (5c64440)
ml: ittr generator control over the number of blocks (ea7390e)
ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e)
ml: L1 loss for palette diffusion model (d2bb2a5)
ml: Lion optimizer (4b99204)
ml: mask conditioning for palette (8e71bb7)
ml: mask generation across domains for GAN semantics (cf4890b)
ml: optim weight decay parameter control (34fb2dd)
ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44)
ml: torchvision model support for semantic classification (341205c)
ml: UNet/UViT resnet block finer control (9bb979c)
ml: UViT for GAN (b1b3607)
ml: ViT-14L as projected discriminator (dab27d2)
mobile attention resnet (7b8fd87)
MobileSam implementation (ea11745)
MobileSam implementation (e0d3a67)
more options for image generation with diffusion (cd08411)
MSE identity loss for cut models (28bfbfd)
multi head attention unet generator (9e4c232)
multimodal generator with cLR-GAN strategy (c3919b9)
multiple D support (578f709)
multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44)
multiscale loss for diff (0af1a80)
nb of images used for FID computation can be chosen (233634b)
online dataset creation (9883eac)
option for number of inferences (palette model) (7e8bcc9)
option to choose metrics to compute (ab1074a)
option to choose norm in unet_mha (77d0161)
option to compute semantic G loss on f_s output (9ef922d)
option to display real image - fake image (b8a134a)
option to go through resnet blocks twice (backward compatibility) (1b3330b)
option to select embedding network for ref conditioning (2c9933d)
option to use a different f_s for domain B (6c55c26)
options for aspect ratio and augmentation of cond image (7bf9c95)
options.rst gen script (e13a9d8)
precommit black formatting (ef0c883)
pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a)
previous frame as y cond in palette (99de095)
projected discriminator (92c0aa0)
quickstart (0aa50e4)
quickstart update ddpm (98cef11)
random perspective and bilinear interpolation with diffaug (d2633c8)
readthedocs initial commit (2a57b6f)
reference image conditioning (8618129)
rename cur_mask (53566c6)
reorganizing display of images (bed57bd)
resnet G for diffusion (6d1cddc)
resnet_attn for cut model (7d01972)
sam for mask refinement and edges (7e55165)
sanitized paths can be saved and loaded (2f9f42f)
save latest images (1f5fb69)
saving options json file (a9452ce)
script to compute metrics (f5ab623)
script to generate diffusion video (574e149)
script to remove useless models weights (543f280)
scripts to export generator to JIT and to transform a single image (f3f5c37)
second D for all models (7ec82be)
segformer can be used for attention masks generation (98a3d6e)
semantic loss on identity (91df963)
server: add commit number in swagger documentation (3c96bc0)
server: add synchronous training (5e3291e)
server: group options into categories (9358007)
set minimum context via crop/bbox ratio (094b213)
single bbox per crop for inpainting while training (ac2b4d6)
sphinx rtd theme initial (6b2ac14)
stylegan network from CUT repo (9e36c55)
support for amp on forward ops (dffbb36)
support TensorRT generator models in inference with DeepDetect (980ad49)
sync loss only when printing (ff8de5e)
temporal criterion (c0dbf78)
temporal discriminator (154da72)
temp: reduce image size during test (7a7e4a3)
tests with different f_s net (8d99c72)
tf32 support (with Ampere GPUs) (c65622c)
timm models as feature extractor for projected D (498a11c)
use label 0 in domain B for object removal (d3e2399)
use the trained discriminator to rank generated images (65d6416)
use torchvision.transforms for differentiable augmentations (dc46733)
UViT for diffusion (b427fa8)
working inference tutorials (23c8ab6)

Bug Fixes

1 iteration = 1 image (0b096f8)
adam/radam beta1 default value (db0e2dd)
add clamping in tensor2im function (9537eac)
add einops to requirements_github_actions (b30895d)
add exec flag to run.sh (f6350d7)
add missing architectures to options doc (5c31e1f)
add missing test file (9e520cd)
add mmcv to requirements for doc generation (2c8ef04)
add mmseg to requirements for doc generation (9d83ffe)
add padding type to jit export (ae3e23c)
add phase to dataloaders arg (3992136)
add reflectionpad to fix resnet (e702930)
add tqdm to requirements (6e34f72)
add type for nb_mask options (7bce61c)
add vision aided to doc auto requirements (5fbb13f)
add wget to github action requirements (ad7b3d7)
all_classes_as_one option (df01988)
allow for domain B to have no semantic labels (42604aa)
allow for image reading failures (d7f5d08)
allow to freeze networks with DDP (d12bd7e)
alternate optimizing G and E (d76ba1b)
APA_img is only added to visual names when APA option is true (309592e)
attention masks display (cc5b6b8)
auto resize of attention masks (e3bf86a)
avoid lambda in RandomImgAug that prevents DDP (a206f91)
B label clamping override + display colormap (6fc1dd4)
batchnorm for single gpu (93b164c)
bbox and mask for diffusion generation script (390fe5d)
bbox ref idx from crop_image return in inference code (be0a1b5)
broken export onnx script (8080036)
broken test + bbox_ref_id option in diffusion single image generation script (179b3ca)
build docker file (c055385)
catch image reading errors (0153eb4)
circumvent isTrain in base options (2eb905e)
clean code (b83282b)
cls net was created with f_s options (70e9009)
cls_class_weights default to [] (ad719f6)
computation of out_mask loss when multigpu (31b7294)
compute_temporal_fake defined twice (83e0d2d)
cond inference (77c0444)
context pixel for sanitize paths (d18cee9)
correct urls in quickstart_gan.rst (580353f)
cpu device for single image generation script (bb3c70c)
criterion temporal loss display (48d4c27)
criterionIdt is not used by cut (3665410)
crop_image for non square images and large bboxes (afdde75)
cyclegan semantic mask failure with f_s (0688974)
D global requires grad were not set true before backward (6a55914)
D noise in cut base model (c6a2114)
D_global optimization (284d8ae)
data dept init (8a0707e)
dataaug_D_diffusion option (70628d1)
dataset created once, one dataloader per gpu (70348ff)
DDIM restoration when batch_size > 1 (ccb445b)
ddpm inference with sam masks + inverted masks (3055910)
default --cls_config_segformer value (caadf7a)
default nb_mask_input value (632a0f6)
dependencies to docker build file (76eb760)
diff aug options (a337990)
diffusion generation script is already multimodal (d49e66a)
diffusion inference with mask_in (dea27ac)
diffusion video generation fixes to generate API changes (2d74f7a)
diffusion with temporal data loader (c328acb)
display for palette model (2a98cf7)
doc_gen for palette model (32ef0fa)
doc: badly formatted title (8b26c87)
docker server source image (b29eb4f)
doc: missing losses (f74c373)
doc: missing models (a9bddd4)
doc: remove newlines and fix path (2f0aa5a)
doc: title format (f580d88)
export & inference scripts for new encoder/decoder architectures (7993544)
f_s backward when multigpu (f8b6067)
f_s zero_grad in cut_semantic_mask (0e1c2b3)
feats went through resnet blocks 2 times and rm old code (7d768b3)
fid for cut and for relative path (57dbabf)
flush print() in joligan_api.py (0880d85)
force RGB in online creation (15efa68)
force torch to stable 1.13.1 (7fffaf0)
gan_networks import for inference (09f2969)
generate doc script (01f25af)
generation script for non multimodal models (a2ff20b)
generation scripts help (d1102ff)
get schema for * options (d7e33eb)
get schema should not compute base_gan_model (5dd6662)
get_feats feature extractor for projected D (212751d)
guidance scale option for diffusion inference (846e4f5)
help command (727fc15)
help for diffusion inference script (d85fcc9)
image generation script with torch model (4b67c05)
image size control in generation script (01ad67d)
image size is unused in gen_single_image (b7ea7f4)
img visu for palette models (668b253)
imgaug option with no mask (f09d484)
import signal module (e0d1c52)
improved diffusion inference scripts, including video generation (485fad9)
in dataloader, warning when class > nclasses (e9231eb)
in place gradient error with f_s (283cb85)
inference beta_schedule location (ca910d8)
inference when no mask cond (e19505d)
inference_num for reference img (e444ba1)
input_B_label loading when available (83490b3)
int are considered as float if needed and warning if remaining keys in options (ad6d99e)
inverted BtoA direction labels, moving direction to datasets (4ac2845)
jenkins docker perms (ebad13f)
jit and onnx export scripts post refactor (32e0912)
joligan api launch training (5ea68a4)
lambda for GAN losses (020435a)
license year (2424de0)
loading image with transparency (3bcc7fd)
loss values averaged across gpus and compute last step (f941595)
macs to flops (69a02db)
make aligned dataset work again (61741cc)
mask and class conditioning in generation script (38631b8)
mask clamping and display for diff models (4b24fe4)
mask display when more than one class (fa21b2e)
mask_delta checks (ef095f1)
mask_delta for inference diffusion (69bf8ef)
mask_delta ratio online check (341dd73)
missing get_weights for f_s (835ab4f)
missing help on projected D arch (c65f2d2)
missing input_nc in f_s segformer constructor (309f75f)
missing shape attribute in conditional (9e3d365)
ml: allow semantic loss weight control (8e552be)
ml: allow torchvision semantic model backbones to work with bw images (d112532)
ml: control of cls semantic classifier learning rate (be566c6)
ml: DDIM with reference image (f06e47c)
ml: diffusion schedule with generation script (4c26b09)
ml: semantic regression loss tensor dims (1ddedc6)
mobile netG option (4c3db79)
modify command line for palette model (2287054)
multigpu for cut model (ad95e67)
multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a)
multigpu for cut_semantic_mask model (4fe3ee2)
multimodal GAN requires retain_graph for now (7be1628)
nb attn variable after refactor (3e33fcc)
nce layers for ittr generator (916b475)
NCE with segformer architecture (c037912)
netF is initialized on the right gpu (7856bd9)
netF weights updating (a75544e)
network loading (dcb25d7)
networks groups inherit from other models (f988002)
new options names for unaligned_labeled_mask (6fed55c)
no need for retain_graph to be True on losses (b715638)
nvidia key rotation (eadd702)
online creation and selfsupervised (42dd435)
online creation for temporal dataloader (9cf3d0a)
online creation when cropsize > imgsize (c2fc34e)
online mask loading with multiple bboxes (1d7a6e1)
online mask padding (90d7444)
only load testset on gpu 0 (abcb4c4)
only one gpu visible in each process (4954717)
only onnx export for segformer G (543ce28)
onnx export (e9d36a8)
option to not set device when parsing json config file (092b5f8)
option type and max int value (d7f60f2)
out mask images can be computed with more than 2 semantic classes (36d54b9)
output_display_type default value should be a list (d694d29)
palette inference without ref image (02d3f76)
projection interpolation at init (2aecf46)
proper categories titles in schema + improve option saving (bbc41c6)
python html lib override (be2b089)
random image aug class argument (2d8f686)
random offset missing in crop_image (1fc4bf4)
real image f_s pred name (d40109c)
remove environment.yml (783cae8)
remove eval mode for export (9bdae8d)
remove last batch norm to cls for batch size 1 (d7fe727)
remove use_resize from segformer forward (c71d4de)
requirements for github actions (6227c06)
requirements for github actions (4fe34ce)
resnet attn class call (17f907f)
resnet default number of blocks with diffusion (3d34368)
resnet with attention class name (950237d)
resnet_attn model name (eb4216e)
resnet_attn option (08b61ea)
reverse DDIM schedule (17ba96a)
round pixel gap before offset sampling (05e8959)
rst autodoc (5309cc2)
rst format and typos (f20f264)
sam compatible with mask inversion in palette (4294679)
sam inference as discriminator (5708832)
sample runs and default options (f08d132)
save generated bbox only when useful in diffusion inference (4b93cf3)
save metrics plots and allow resuming (095f040)
save model every epoch is default (51c5d34)
save networks img for diffusion (090df33)
saving paths_sanitized in checkpoints dir (49c5139)
script for image gen using diffusion (e909525)
segformer as f_s net (d55d63d)
segformer for semantics with optional partially pre-trained model (3078e51)
segformer G and segformer feature extractor (85171e1)
segformer ONNX export and image generation script (c5a700b)
selfsupervised dataloaders don't need domain B anymore (257a056)
sem compute for cls on identity (6a87858)
semantic losses was initialized twice (3fb85ab)
server launching in generate api doc (f459842)
server: wrong import (cb4087e)
smaller batchsize at end of epochs (c49ddd1)
smaller miou interval to trigger computing (76382d0)
softmax G semantic was applied twice (5c4c3ec)
start super-resolution restoration from noise (3bce4a9)
stylegan2 feature extraction (e442021)
support for no mask in B domain (6dd98d6)
save network as latest even with train_save_by_iter (9016d3f)
temp: norm default (d95d10f)
temporal criterion loss compute (4ddce8b)
temporal D and context pixels compatibility (421f458)
temporal dataloader, with added path sanitize (6451db2)
temporal dataloading and loss computing (07706d3)
temporal discriminator with masks and bboxes (43015da)
temporal end of sequence (e08ffba)
temporal mse with GAN models (79b0d14)
test functions names (2247f18)
tests (fabd8b6)
tests for all f_s net (144d7fd)
tests with no cache (787f27b)
training examples with mask_delta (9b14339)
training examples with mask_delta (8d08839)
typo (4274784)
typo (f1dcff7)
typo in README.md (f5ef15e)
typos (5b061f2)
typos and class name (6462374)
unaligned data_dataset_mode in README (5e7fbd9)
underflow in custom cut l2norm, replaced by torch built-in (ff42852)
UNet/UViT layers for cut NCE (8459876)
unset index_B for unaligned mask offline datasets (8ba5e12)
update python version for pre-commit (900ab17)
Update README.md (cb9031d)
Update README.md (7e5fe82)
use only mask from selected bbox at inference time (dd4fe3b)
use signal handler to kill all the processes (fab93ef)
using cut with F as pure sampling function (887b9a5)
validation with no label, e.g. simple cut model (53984f4)
visdom autostart and no display (fbc9942)
visdom port for server launching (747d18a)
visuals for cyclegan (b7cabaa)
vitclip16 requires 224 input size (8b8bb67)
wrong D_noise option (eaea415)
wrong feature network in projected (3318c77)
wrong indentation (bac6ce2)
wrong number of model forward signature parameters check (868b55b)
wrong option for data sanitize (479242e)
wrong option name in test and add comment in json option loading (1a7abd6)
wrong option prefix (711fa7d)
wrong options (4298071)
wrong options for validation loading and fid computing (a9d2516)
wrong train_compute_fid option (893c0e2)
wrong values for bbox ref coordinates (dc2a00f)

