v1.0.0
Released by beniz on 06 Oct 10:20
joliGEN: Generative AI Toolset (Changelog)
1.0.0 (2023-10-06)
Docker
Features
add a server endpoint to delete files (30b2143 )
add choices for all options (ed43b82 )
add ddim inference (0196134 )
add DDPM tutorial on the VITON-HD dataset (c932d73 )
add FastAPI server to run training (f517462 )
add lambda for semantic losses (aab53fe )
add LPIPS metric (f1e0526 )
add miou compute to tests (c0033ef )
add new metrics (f3c84cd )
add palette model (b7db294 )
add psnr metric (7135458 )
add sampling options to test (a2958dc )
add SRC and hDCE losses (ddfcc97 )
add test for doc generation (41526f8 )
add test on cycle_gan_semantic_mask (3eeff76 )
add tests for reference image dataloaders (ae6405e )
added D noise to CUT with semantics (31aa4a3 )
added optimizers and options (505cac2 )
allow control of projected discriminator interpolation (dbffec5 )
allow ViT custom resolution at D projector init (82e6e83 )
api: display current commit at startup (6f90be8 )
aug: affine transforms for semantics (170b0f8 )
aug: configurable online mask delta augmentation by x and y axis (dfa6459 )
aug: select bbox category through the path sanitization functionality (a8d3f48 )
auto download segformer weights (083cc5e )
backward while computing MSE criterion loss (1b87906 )
bbox as sam prompt (a39c5bd )
bbox prompt for sam (1fa9cae )
bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494 )
bw model export (8e43efa )
check code format when PR (eeb56cb )
choices for canny random thresholds (9573fc1 )
class weights for semantic segmentation network with cross entropy (4274f1e )
classifier training on domain B (fa343c0 )
commandline saving (6eb503e )
commandline script for joligan server calls (48ae23b )
compute_feats for unet G (9f1109e )
conditioning for palette (b9854ee )
config json for client script (174dce9 )
context for D (b0d3c7b )
contrastive classifier noise (7193e0e )
contrastive loss for D (deb2ec4 )
cut_semantic model (b20a943 )
D accuracy (26ead91 )
data: random noise in images for object insertion (42cf13d )
DDP (68f24da )
deceiving D for GAN training (2e2113f )
depth model as projector (10ffc28 )
depth prediction and depth discriminator (01bc62b )
diff augment (054509c )
diffusion inference with old and new models (9c4c5a9 )
display augmented images (2126253 )
display test images (a1de083 )
doc options auto update (1b08f92 )
doc: add JSON config examples (5332213 )
doc: basic server REST API (a757d17 )
doc: datasets (dfe2343 )
doc: DDPM conditioning training and inference examples (e694a29 )
doc: models (be1fe34 )
doc: refactored README with links to documentation (b5bf121 )
doc: reference image conditioning (70aeb32 )
doc: remove overview (2360527 )
doc: server, client, docker (68a5b96 )
doc: tips (3fea9ca )
doc: training (a4b720d )
doc: update inference models and examples (3c43a7b )
doc: updated FAQ (88b417c )
doc: updated model export (e692f78 )
edge detection techniques (78202ea )
export for unet_mha (b4c3cfd )
extract bbox from img (fb64ef0 )
first recut model (aaa4069 )
first test (4ac8cd9 )
fixed bbox size for online creation and bbox size randomization (5cd6227 )
G weights export during training (4c045e6 )
generic image augmentation (d2ceb81 )
get_schema uses default instead of choices (not always available) (a779b2a )
global models (3819a40 )
inverted mask for automatic background inpainting (39f9ff2 )
itersize option for cut_model and cut_semantic_mask_model (b3b9e7a )
list all available models in help (06a6259 )
load segformer torchscript weights (672b341 )
loss values saved in json file and file to display it later (d0fa9a5 )
madgrad optimizer (1c410f2 )
metrics for testset (c875f2b )
miou compute for f_s pred (5851566 )
ml: ability to train in wavelet space, similar to 2102.06108 (28077d4 )
ml: added semantic threshold option (03f33a2 )
ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d )
ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07 )
ml: classifier-free guidance for image-to-image DDPM (a6425ef )
ml: control over image generation with diffusion in-painting model (0a5ed86 )
ml: DDIM scheduler (443e7d7 )
ml: diffusion super-resolution for palette (66c591f )
ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b )
ml: Discriminator based on SAM (e38d501 )
ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e )
ml: efficient UNet / 2205.11487 (d690ab4 )
ml: exponential moving average for G (ad33796 )
ml: GAN mask generator with sam refined target (0cd1ee9 )
ml: grayscale support for unaligned with semantics (a79548f )
ml: initial sam generated masks for unaligned datasets (5c64440 )
ml: ittr generator control over the number of blocks (ea7390e )
ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e )
ml: L1 loss for palette diffusion model (d2bb2a5 )
ml: Lion optimizer (4b99204 )
ml: mask conditioning for palette (8e71bb7 )
ml: mask generation across domains for GAN semantics (cf4890b )
ml: optim weight decay parameter control (34fb2dd )
ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44 )
ml: torchvision model support for semantic classification (341205c )
ml: UNet/UViT resnet block finer control (9bb979c )
ml: UViT for GAN (b1b3607 )
ml: ViT-14L as projected discriminator (dab27d2 )
mobile attention resnet (7b8fd87 )
MobileSam implementation (ea11745 )
MobileSam implementation (e0d3a67 )
more options for image generation with diffusion (cd08411 )
MSE identity loss for cut models (28bfbfd )
multi head attention unet generator (9e4c232 )
multimodal generator with cLR-GAN strategy (c3919b9 )
multiple D support (578f709 )
multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44 )
multiscale loss for diff (0af1a80 )
nb of images used for FID computation can be chosen (233634b )
online dataset creation (9883eac )
option for number of inferences (palette model) (7e8bcc9 )
option to choose metrics to compute (ab1074a )
option to choose norm in unet_mha (77d0161 )
option to compute semantic G loss on f_s output (9ef922d )
option to display real image - fake image (b8a134a )
option to go through resnet blocks twice (backward compatibility) (1b3330b )
option to select embedding network for ref conditioning (2c9933d )
option to use a different f_s for domain B (6c55c26 )
options for aspect ratio and augmentation of cond image (7bf9c95 )
options.rst gen script (e13a9d8 )
precommit black formatting (ef0c883 )
pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a )
previous frame as y cond in palette (99de095 )
projected discriminator (92c0aa0 )
quickstart (0aa50e4 )
quickstart update ddpm (98cef11 )
random perspective and bilinear interpolation with diffaug (d2633c8 )
readthedocs initial commit (2a57b6f )
reference image conditioning (8618129 )
rename cur_mask (53566c6 )
reorganizing display of images (bed57bd )
resnet G for diffusion (6d1cddc )
resnet_attn for cut model (7d01972 )
sam for mask refinement and edges (7e55165 )
sanitized paths can be saved and loaded (2f9f42f )
save latest images (1f5fb69 )
saving options json file (a9452ce )
script to compute metrics (f5ab623 )
script to generate diffusion video (574e149 )
script to remove useless models weights (543f280 )
scripts to export generator to JIT and to transform a single image (f3f5c37 )
second D for all models (7ec82be )
segformer can be used for attention masks generation (98a3d6e )
semantic loss on identity (91df963 )
server: add commit number in swagger documentation (3c96bc0 )
server: add synchronous training (5e3291e )
server: group options into categories (9358007 )
set minimum context via crop/bbox ratio (094b213 )
single bbox per crop for inpainting while training (ac2b4d6 )
sphinx rtd theme initial (6b2ac14 )
stylegan network from CUT repo (9e36c55 )
support for amp on forward ops (dffbb36 )
support TensorRT generator models in inference with DeepDetect (980ad49 )
sync loss only when printing (ff8de5e )
temporal criterion (c0dbf78 )
temporal discriminator (154da72 )
temp: reduce image size during test (7a7e4a3 )
tests with different f_s net (8d99c72 )
tf32 support (with Ampere GPUs) (c65622c )
timm models as feature extractor for projected D (498a11c )
use label 0 in domain B for object removal (d3e2399 )
use the trained discriminator to rank generated images (65d6416 )
use torchvision.transforms for differentiable augmentations (dc46733 )
UViT for diffusion (b427fa8 )
working inference tutorials (23c8ab6 )
Bug Fixes
1 iteration = 1 image (0b096f8 )
adam/radam beta1 default value (db0e2dd )
add clamping in tensor2im function (9537eac )
add einops to requirements_github_actions (b30895d )
add exec flag to run.sh (f6350d7 )
add missing architectures to options doc (5c31e1f )
add missing test file (9e520cd )
add mmcv to requirements for doc generation (2c8ef04 )
add mmseg to requirements for doc generation (9d83ffe )
add padding type to jit export (ae3e23c )
add phase to dataloaders arg (3992136 )
add reflectionpad to fix resnet (e702930 )
add tqdm to requirements (6e34f72 )
add type for nb_mask options (7bce61c )
add vision aided to doc auto requirements (5fbb13f )
add wget to github action requirements (ad7b3d7 )
all_classes_as_one option (df01988 )
allow for domain B to have no semantic labels (42604aa )
allow for image reading failures (d7f5d08 )
allow to freeze networks with DDP (d12bd7e )
alternate optimizing G and E (d76ba1b )
APA_img is only added to visual names when APA option is true (309592e )
attention masks display (cc5b6b8 )
auto resize of attention masks (e3bf86a )
avoid lambda in RandomImgAug that prevents DDP (a206f91 )
B label clamping override + display colormap (6fc1dd4 )
batchnorm for single gpu (93b164c )
bbox and mask for diffusion generation script (390fe5d )
bbox ref idx from crop_image return in inference code (be0a1b5 )
broken export onnx script (8080036 )
broken test + bbox_ref_id option in diffusion single image generation script (179b3ca )
build docker file (c055385 )
catch image reading errors (0153eb4 )
circumvent isTrain in base options (2eb905e )
clean code (b83282b )
cls net was created with f_s options (70e9009 )
cls_class_weights default to [] (ad719f6 )
computation of out_mask loss when multigpu (31b7294 )
compute_temporal_fake defined twice (83e0d2d )
cond inference (77c0444 )
context pixel for sanitize paths (d18cee9 )
correct urls in quickstart_gan.rst (580353f )
cpu device for single image generation script (bb3c70c )
criterion temporal loss display (48d4c27 )
criterionIdt is not used by cut (3665410 )
crop_image for non square images and large bboxes (afdde75 )
cyclegan semantic mask failure with f_s (0688974 )
D global requires grad were not set true before backward (6a55914 )
D noise in cut base model (c6a2114 )
D_global optimization (284d8ae )
data dept init (8a0707e )
dataaug_D_diffusion option (70628d1 )
dataset created once, one dataloader per gpu (70348ff )
DDIM restoration when batch_size > 1 (ccb445b )
ddpm inference with sam masks + inverted masks (3055910 )
default --cls_config_segformer value (caadf7a )
default nb_mask_input value (632a0f6 )
dependencies to docker build file (76eb760 )
diff aug options (a337990 )
diffusion generation script is already multimodal (d49e66a )
diffusion inference with mask_in (dea27ac )
diffusion video generation fixes to generate API changes (2d74f7a )
diffusion with temporal data loader (c328acb )
display for palette model (2a98cf7 )
doc_gen for palette model (32ef0fa )
doc: badly formatted title (8b26c87 )
docker server source image (b29eb4f )
doc: missing losses (f74c373 )
doc: missing models (a9bddd4 )
doc: remove newlines and fix path (2f0aa5a )
doc: title format (f580d88 )
export & inference scripts for new encoder/decoder architectures (7993544 )
f_s backward when multigpu (f8b6067 )
f_s zero_grad in cut_semantic_mask (0e1c2b3 )
feats went through resnet blocks 2 times and rm old code (7d768b3 )
fid for cut and for relative path (57dbabf )
flush print() in joligan_api.py (0880d85 )
force RGB in online creation (15efa68 )
force torch to stable 1.13.1 (7fffaf0 )
gan_networks import for inference (09f2969 )
generate doc script (01f25af )
generation script for non multimodal models (a2ff20b )
generation scripts help (d1102ff )
get schema for * options (d7e33eb )
get schema should not compute base_gan_model (5dd6662 )
get_feats feature extractor for projected D (212751d )
guidance scale option for diffusion inference (846e4f5 )
help command (727fc15 )
help for diffusion inference script (d85fcc9 )
image generation script with torch model (4b67c05 )
image size control in generation script (01ad67d )
image size is unused in gen_single_image (b7ea7f4 )
img visu for palette models (668b253 )
imgaug option with no mask (f09d484 )
import signal module (e0d1c52 )
improved diffusion inference scripts, including video generation (485fad9 )
in dataloader, warning when class > nclasses (e9231eb )
in place gradient error with f_s (283cb85 )
inference beta_schedule location (ca910d8 )
inference when no mask cond (e19505d )
inference_num for reference img (e444ba1 )
input_B_label loading when available (83490b3 )
int are considered as float if needed and warning if remaining keys in options (ad6d99e )
inverted BtoA direction labels, moving direction to datasets (4ac2845 )
jenkins docker perms (ebad13f )
jit and onnx export scripts post refactor (32e0912 )
joligan api launch training (5ea68a4 )
lambda for GAN losses (020435a )
license year (2424de0 )
loading image with transparency (3bcc7fd )
loss values averaged across gpus and compute last step (f941595 )
macs to flops (69a02db )
make aligned dataset work again (61741cc )
mask and class conditioning in generation script (38631b8 )
mask clamping and display for diff models (4b24fe4 )
mask display when more than one class (fa21b2e )
mask_delta checks (ef095f1 )
mask_delta for inference diffusion (69bf8ef )
mask_delta ratio online check (341dd73 )
missing get_weights for f_s (835ab4f )
missing help on projected D arch (c65f2d2 )
missing input_nc in f_s segformer constructor (309f75f )
missing shape attribute in conditional (9e3d365 )
ml: allow semantic loss weight control (8e552be )
ml: allow torchvision semantic model backbones to work with bw images (d112532 )
ml: control of cls semantic classifier learning rate (be566c6 )
ml: DDIM with reference image (f06e47c )
ml: diffusion schedule with generation script (4c26b09 )
ml: semantic regression loss tensor dims (1ddedc6 )
mobile netG option (4c3db79 )
modify command line for palette model (2287054 )
multigpu for cut model (ad95e67 )
multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a )
multigpu for cut_semantic_mask model (4fe3ee2 )
multimodal GAN requires retain_graph for now (7be1628 )
nb attn variable after refactor (3e33fcc )
nce layers for ittr generator (916b475 )
NCE with segformer architecture (c037912 )
netF is initialized on the right gpu (7856bd9 )
netF weights updating (a75544e )
network loading (dcb25d7 )
networks groups inherit from other models (f988002 )
new options names for unaligned_labeled_mask (6fed55c )
no need for retain_graph to be True on losses (b715638 )
nvidia key rotation (eadd702 )
online creation and selfsupervised (42dd435 )
online creation for temporal dataloader (9cf3d0a )
online creation when cropsize > imgsize (c2fc34e )
online mask loading with multiple bboxes (1d7a6e1 )
online mask padding (90d7444 )
only load testset on gpu 0 (abcb4c4 )
only one gpu visible in each process (4954717 )
only onnx export for segformer G (543ce28 )
onnx export (e9d36a8 )
option to not set device when parsing json config file (092b5f8 )
option type and max int value (d7f60f2 )
out mask images can be computed with more than 2 semantic classes (36d54b9 )
output_display_type default value should be a list (d694d29 )
palette inference without ref image (02d3f76 )
projection interpolation at init (2aecf46 )
proper categories titles in schema + improve option saving (bbc41c6 )
python html lib override (be2b089 )
random image aug class argument (2d8f686 )
random offset missing in crop_image (1fc4bf4 )
real image f_s pred name (d40109c )
remove environment.yml (783cae8 )
remove eval mode for export (9bdae8d )
remove last batch norm to cls for batch size 1 (d7fe727 )
remove use_resize from segformer forward (c71d4de )
requirements for github actions (6227c06 )
requirements for github actions (4fe34ce )
resnet attn class call (17f907f )
resnet default number of blocks with diffusion (3d34368 )
resnet with attention class name (950237d )
resnet_attn model name (eb4216e )
resnet_attn option (08b61ea )
reverse DDIM schedule (17ba96a )
round pixel gap before offset sampling (05e8959 )
rst autodoc (5309cc2 )
rst format and typos (f20f264 )
sam compatible with mask inversion in palette (4294679 )
sam inference as discriminator (5708832 )
sample runs and default options (f08d132 )
save generated bbox only when useful in diffusion inference (4b93cf3 )
save metrics plots and allow resuming (095f040 )
save model every epoch is default (51c5d34 )
save networks img for diffusion (090df33 )
saving paths_sanitized in checkpoints dir (49c5139 )
script for image gen using diffusion (e909525 )
segformer as f_s net (d55d63d )
segformer for semantics with optional partially pre-trained model (3078e51 )
segformer G and segformer feature extractor (85171e1 )
segformer ONNX export and image generation script (c5a700b )
selfsupervised dataloaders don't need domain B anymore (257a056 )
sem compute for cls on identity (6a87858 )
semantic losses was initialized twice (3fb85ab )
server launching in generate api doc (f459842 )
server: wrong import (cb4087e )
smaller batchsize at end of epochs (c49ddd1 )
smaller miou interval to trigger computing (76382d0 )
softmax G semantic was applied twice (5c4c3ec )
start super-resolution restoration from noise (3bce4a9 )
stylegan2 feature extraction (e442021 )
support for no mask in B domain (6dd98d6 )
sv network as latest even with train_save_by_iter (9016d3f )
temp: norm default (d95d10f )
temporal criterion loss compute (4ddce8b )
temporal D and context pixels compatibility (421f458 )
temporal dataloader, with added path sanitize (6451db2 )
temporal dataloading and loss computing (07706d3 )
temporal discriminator with masks and bboxes (43015da )
temporal end of sequence (e08ffba )
temporal mse with GAN models (79b0d14 )
test functions names (2247f18 )
tests (fabd8b6 )
tests for all f_s net (144d7fd )
tests with no cache (787f27b )
training examples with mask_delta (9b14339 )
training examples with mask_delta (8d08839 )
typo (4274784 )
typo (f1dcff7 )
typo in README.md (f5ef15e )
typos (5b061f2 )
typos and class name (6462374 )
unaligned data_dataset_mode in README (5e7fbd9 )
underflow in custom cut l2norm, replaced by torch built-in (ff42852 )
UNet/UViT layers for cut NCE (8459876 )
unset index_B for unaligned mask offline datasets (8ba5e12 )
update python version for pre-commit (900ab17 )
Update README.md (cb9031d )
Update README.md (7e5fe82 )
use only mask from selected bbox at inference time (dd4fe3b )
use signal handler to kill all the processes (fab93ef )
using cut with F as pure sampling function (887b9a5 )
validation with no label, e.g. simple cut model (53984f4 )
visdom autostart and no display (fbc9942 )
visdom port for server launching (747d18a )
visuals for cyclegan (b7cabaa )
vitclip16 requires 224 input size (8b8bb67 )
wrong D_noise option (eaea415 )
wrong feature network in projected (3318c77 )
wrong indentation (bac6ce2 )
wrong number of model forward signature parameters check (868b55b )
wrong option for data sanitize (479242e )
wrong option name in test and add comment in json option loading (1a7abd6 )
wrong option prefix (711fa7d )
wrong options (4298071 )
wrong options for validation loading and fid computing (a9d2516 )
wrong train_compute_fid option (893c0e2 )
wrong values for bbox ref coordinates (dc2a00f )
2.0.0 (2023-10-06)
Features
add a server endpoint to delete files (30b2143 )
add choices for all options (ed43b82 )
add ddim inference (0196134 )
add DDPM tutorial on the VITON-HD dataset (c932d73 )
add FastAPI server to run training (f517462 )
add lambda for semantic losses (aab53fe )
add LPIPS metric (f1e0526 )
add miou compute to tests (c0033ef )
add new metrics (f3c84cd )
add palette model (b7db294 )
add psnr metric (7135458 )
add sampling options to test (a2958dc )
add SRC and hDCE losses (ddfcc97 )
add test for doc generation (41526f8 )
add test on cycle_gan_semantic_mask (3eeff76 )
add tests for reference image dataloaders (ae6405e )
added D noise to CUT with semantics (31aa4a3 )
added optimizers and options (505cac2 )
allow control of projected discriminator interpolation (dbffec5 )
allow ViT custom resolution at D projector init (82e6e83 )
api: display current commit at startup (6f90be8 )
aug: affine transforms for semantics (170b0f8 )
aug: configurable online mask delta augmentation by x and y axis (dfa6459 )
aug: select bbox category through the path sanitization functionality (a8d3f48 )
auto download segformer weights (083cc5e )
backward while computing MSE criterion loss (1b87906 )
bbox as sam prompt (a39c5bd )
bbox prompt for sam (1fa9cae )
bilinear interpolation of attention heads when dimension does not match, useful for segformer G (eed9494 )
bw model export (8e43efa )
check code format when PR (eeb56cb )
choices for canny random thresholds (9573fc1 )
class weights for semantic segmentation network with cross entropy (4274f1e )
classifier training on domain B (fa343c0 )
commandline saving (6eb503e )
commandline script for joligan server calls (48ae23b )
compute_feats for unet G (9f1109e )
conditioning for palette (b9854ee )
config json for client script (174dce9 )
context for D (b0d3c7b )
contrastive classifier noise (7193e0e )
contrastive loss for D (deb2ec4 )
cut_semantic model (b20a943 )
D accuracy (26ead91 )
data: random noise in images for object insertion (42cf13d )
DDP (68f24da )
deceiving D for GAN training (2e2113f )
depth model as projector (10ffc28 )
depth prediction and depth discriminator (01bc62b )
diff augment (054509c )
diffusion inference with old and new models (9c4c5a9 )
display augmented images (2126253 )
display test images (a1de083 )
doc options auto update (1b08f92 )
doc: add JSON config examples (5332213 )
doc: basic server REST API (a757d17 )
doc: datasets (dfe2343 )
doc: DDPM conditioning training and inference examples (e694a29 )
doc: models (be1fe34 )
doc: refactored README with links to documentation (b5bf121 )
doc: reference image conditioning (70aeb32 )
doc: remove overview (2360527 )
doc: server, client, docker (68a5b96 )
doc: tips (3fea9ca )
doc: training (a4b720d )
doc: update inference models and examples (3c43a7b )
doc: updated FAQ (88b417c )
doc: updated model export (e692f78 )
edge detection techniques (78202ea )
export for unet_mha (b4c3cfd )
extract bbox from img (fb64ef0 )
first recut model (aaa4069 )
first test (4ac8cd9 )
fixed bbox size for online creation and bbox size randomization (5cd6227 )
G weights export during traing (4c045e6 )
generic image augmentation (d2ceb81 )
get_schema uses default instead of choices (not always available) (a779b2a )
global models (3819a40 )
inverted mask for automatic background inpainting (39f9ff2 )
itersize option for cut_model and cut_semantic_mask_model (b3b9e7a )
list all available models in help (06a6259 )
load segformer torchscript weights (672b341 )
loss values saved in json file and file to display it later (d0fa9a5 )
madgrad optimizer (1c410f2 )
metrics for testset (c875f2b )
miou compute for f_s pred (5851566 )
ml: ability to train in wavelet space, similar to 2102.06108 (28077d4 )
ml: added semantic threshold option (03f33a2 )
ml: adding MoNCE contrastive loss to CUT, 2203.09333 (26f2e9d )
ml: batchnorm to groupnorm in projected D and f_s unet to lower multi-gpu sync requirements (8efdf07 )
ml: classifier-free guidance for image-to-image DDPM (a6425ef )
ml: controle over image generation with diffusion in-painting model (0a5ed86 )
ml: DDIM scheduler (443e7d7 )
ml: diffusion super-resolution for palette (66c591f )
ml: diffusion-based augmentation for D, 2206.02262 (5cf1f1b )
ml: Discriminator based on SAM (e38d501 )
ml: dual U-Net blocks with cross reference for diffusion 2306.08276 (ffead4e )
ml: efficient UNet / 2205.11487 (d690ab4 )
ml: exponential moving average for G (ad33796 )
ml: GAN mask generator with sam refined target (0cd1ee9 )
ml: grayscale support for unaligned with semantics (a79548f )
ml: initial sam generated masks for unaligned datasets (5c64440 )
ml: ittr generator control over the number of blocks (ea7390e )
ml: ITTR transformer arch for image2image - 2203.16015 (16e2b5e )
ml: L1 loss for palette diffusion model (d2bb2a5 )
ml: Lion optimizer (4b99204 )
ml: mask conditioning for palette (8e71bb7 )
ml: mask generation across domains for GAN semantics (cf4890b )
ml: optim weigth decay parameter control (34fb2dd )
ml: query-selected attention for contrastive patches, 2203.08483 (6a13e44 )
ml: torchvision model support for semantic classification (341205c )
ml: UNet/UViT resnet block finer control (9bb979c )
ml: UViT for GAN (b1b3607 )
ml: ViT-14L as projected discriminator (dab27d2 )
mobile attention resnet (7b8fd87 )
MobileSam implementation (ea11745 )
MobileSam implementation (e0d3a67 )
more options for image generation with diffusion (cd08411 )
MSE identity loss for cut models (28bfbfd )
multi head attention unet generator (9e4c232 )
multimodal generator with cLR-GAN strategy (c3919b9 )
multiple D support (578f709 )
multiple linearly fitted D via vision aided ensembling, https://arxiv.org/abs/2112.09130 (9fbca44 )
multiscale loss for diff (0af1a80 )
nb of images used for FID computation can be chosen (233634b )
online dataset creation (9883eac )
option for number of inferences (palette model) (7e8bcc9 )
option to choose metrics to compute (ab1074a )
option to choose norm in unet_mha (77d0161 )
option to compute semantic G loss on f_s output (9ef922d )
option to display real image - fake image (b8a134a )
option to go through resnet blocks twice (backward compatibility) (1b3330b )
option to select embedding network for ref conditioning (2c9933d )
option to use a different f_s for domain B (6c55c26 )
options for aspect ratio and augmentation of cond image (7bf9c95 )
options.rst gen script (e13a9d8 )
precommit black formatting (ef0c883 )
pretrained segformer backbone (mit) as feature extractor for projected D (7566c8a )
previous frame as y cond in palette (99de095 )
projected discriminator (92c0aa0 )
quickstart (0aa50e4 )
quickstart update ddpm (98cef11 )
random perspective and bilinear interpolation with diffaug (d2633c8 )
readthedocs initial commit (2a57b6f )
reference image conditioning (8618129 )
rename cur_mask (53566c6 )
reorganizing display of images (bed57bd )
resnet G for diffusion (6d1cddc )
resnet_attn for cut model (7d01972 )
sam for mask refinement and edges (7e55165 )
sanitized paths can be saved and loaded (2f9f42f )
save latest images (1f5fb69 )
saving options json file (a9452ce )
script to compute metrics (f5ab623 )
script to generate diffusion video (574e149 )
script to remove useless models weights (543f280 )
scripts to export generator to JIT and to transform a single image (f3f5c37 )
second D for all models (7ec82be )
segformer can be used for attention masks generation (98a3d6e )
semantic loss on identity (91df963 )
server: add commit number in swagger documentation (3c96bc0 )
server: add synchronous training (5e3291e )
server: group options into categories (9358007 )
set mininum context via crop/bbox ratio (094b213 )
single bbox per crop for inpainting while training (ac2b4d6 )
sphinx rtd theme initial (6b2ac14 )
stylegan network from CUT repo (9e36c55 )
support for amp on forward ops (dffbb36 )
support TensorRT generator models in inference with DeepDetect (980ad49 )
sync loss only when printing (ff8de5e )
temporal criterion (c0dbf78 )
temporal discriminator (154da72 )
temp: reduce image size during test (7a7e4a3 )
tests with different f_s net (8d99c72 )
tf32 support (with Ampere GPUs) (c65622c )
timm models as feature extractor for projected D (498a11c )
use label 0 in domain B for object removal (d3e2399 )
use the trained discriminator to rank generated images (65d6416 )
use torchvision.transforms for differentiable augmentations (dc46733 )
UViT for diffusion (b427fa8 )
working inference tutorials (23c8ab6 )
Bug Fixes
1 iteration = 1 image (0b096f8 )
adam/radam beta1 default value (db0e2dd )
add clamping in tensor2im function (9537eac )
add einops to requirements_github_actions (b30895d )
add exec flag to run.sh (f6350d7 )
add missing architectures to options doc (5c31e1f )
add missing test file (9e520cd )
add mmcv to requirements for doc generation (2c8ef04 )
add mmseg to requirements for doc generation (9d83ffe )
add padding type to jit export (ae3e23c )
add phase to dataloaders arg (3992136 )
add reflectionpad to fix resnet (e702930 )
add tqdm to requirements (6e34f72 )
add type for nb_mask options (7bce61c )
add vision aided to doc auto requirements (5fbb13f )
add wget to github action requirements (ad7b3d7 )
all_classes_as_one option (df01988 )
allow for domain B to have no semantic labels (42604aa )
allow for image reading failures (d7f5d08 )
allow to freeze networks with DDP (d12bd7e )
alternate optimizing G and E (d76ba1b )
APA_img is only added to visual names APA option is true (309592e )
attention masks display (cc5b6b8 )
auto resize of attention masks (e3bf86a )
avoid lambda in RandomImgAug that prevents DDP (a206f91 )
B label clamping override + display colormap (6fc1dd4 )
batchnorm for single gpu (93b164c )
bbox and mask for diffusion generation script (390fe5d )
bbox ref idx from crop_image return in inference code (be0a1b5 )
broken export onnx script (8080036 )
broken test + bbox_ref_id option in diffusion single image generation script (179b3ca )
build docker file (c055385 )
catch image reading errors (0153eb4 )
circumvent isTrain in base options (2eb905e )
clean code (b83282b )
cls net was created with f_s options (70e9009 )
cls_class_weights default to [] (ad719f6 )
computation of out_mask loss when multigpu (31b7294 )
compute_temporal_fake defined twice (83e0d2d )
cond inference (77c0444 )
context pixel for sanitize paths (d18cee9 )
correct urls in quickstart_gan.rst (580353f )
cpu device for single image generation script (bb3c70c )
criterion temporal loss display (48d4c27 )
criterionIdt is not used by cut (3665410 )
crop_image for non square images and large bboxes (afdde75 )
cyclegan semantic mask failure with f_s (0688974 )
D global requires grad were not set true before backward (6a55914 )
D noise in cut base model (c6a2114 )
D_global optimization (284d8ae )
data dept init (8a0707e )
dataaug_D_diffusion option (70628d1 )
dataset created once, one dataloader per gpu (70348ff )
DDIM restoration when batch_size > 1 (ccb445b )
ddpm inference with sam masks + inverted masks (3055910 )
default --cls_config_segformer value (caadf7a )
default nb_mask_input value (632a0f6 )
dependencies to docker build file (76eb760 )
diff aug options (a337990 )
diffusion generation script is already multimodal (d49e66a )
diffusion inference with mask_in (dea27ac )
diffusion video generation fixes to generate API changes (2d74f7a )
diffusion with temporal data loader (c328acb )
display for palette model (2a98cf7 )
doc_gen for palette model (32ef0fa )
doc: badly formatted title (8b26c87 )
docker server source image (b29eb4f )
doc: missing losses (f74c373 )
doc: missing models (a9bddd4 )
doc: remove newlines and fix path (2f0aa5a )
doc: title format (f580d88 )
export & inference scripts for new encoder/decoder architectures (7993544 )
f_s backward when multigpu (f8b6067 )
f_s zero_grad in cut_semantic_mask (0e1c2b3 )
feats went through resnet blocks 2 times and rm old code (7d768b3 )
fid for cut and for relative path (57dbabf )
flush print() in joligan_api.py (0880d85 )
force RGB in online creation (15efa68 )
force torch to stable 1.13.1 (7fffaf0 )
gan_networks import for inference (09f2969 )
generate doc script (01f25af )
generation script for non multimodal models (a2ff20b )
generation scripts help (d1102ff )
get schema for * options (d7e33eb )
get schema should not compute base_gan_model (5dd6662 )
get_feats feature extractor for projected D (212751d )
guidance scale option for diffusion inference (846e4f5 )
help command (727fc15 )
help for diffusion inference script (d85fcc9 )
image generation script with torch model (4b67c05 )
image size control in generation script (01ad67d )
image size is unused in gen_single_image (b7ea7f4 )
img visu for palette models (668b253 )
imgaug option with no mask (f09d484 )
import signal module (e0d1c52 )
improved diffusion inference scripts, including video generation (485fad9 )
in dataloader, warning when class > nclasses (e9231eb )
in place gradient error with f_s (283cb85 )
inference beta_schedule location (ca910d8 )
inference when no mask cond (e19505d )
inference_num for reference img (e444ba1 )
input_B_label loading when available (83490b3 )
int are considered as float if needed and warning if remaining keys in options (ad6d99e )
inverted BtoA direction labels, moving direction to datasets (4ac2845 )
jenkins docker perms (ebad13f )
jit and onnx export scripts post refactor (32e0912 )
joligan api launch training (5ea68a4 )
lambda for GAN losses (020435a )
license year (2424de0 )
loading image with transparency (3bcc7fd )
loss values averaged across gpus and compute last step (f941595 )
macs to flops (69a02db )
make aligned dataset work again (61741cc )
mask and class conditioning in generation script (38631b8 )
mask clamping and display for diff models (4b24fe4 )
mask display when more than one class (fa21b2e )
mask_delta checks (ef095f1 )
mask_delta for inference diffusion (69bf8ef )
mask_delta ratio online check (341dd73 )
missing get_weights for f_s (835ab4f )
missing help on projected D arch (c65f2d2 )
missing input_nc in f_s segformer constructor (309f75f )
missing shape attribute in conditional (9e3d365 )
ml: allow semantic loss weight control (8e552be )
ml: allow torchvision semantic model backbones to work with bw images (d112532 )
ml: control of cls semantic classifier learning rate (be566c6 )
ml: DDIM with reference image (f06e47c )
ml: diffusion schedule with generation script (4c26b09 )
ml: semantic regression loss tensor dims (1ddedc6 )
mobile netG option (4c3db79 )
modify command line for palette model (2287054 )
multigpu for cut model (ad95e67 )
multigpu for cut model failed when batch_size < nb_gpus (8d2fd1a )
multigpu for cut_semantic_mask model (4fe3ee2 )
multimodal GAN requires retain_graph for now (7be1628 )
nb attn variable after refactor (3e33fcc )
nce layers for ittr generator (916b475 )
NCE with segformer architecture (c037912 )
netF is initialized on the right gpu (7856bd9 )
netF weights updating (a75544e )
network loading (dcb25d7 )
networks groups inherit from other models (f988002 )
new options names for unaligned_labeled_mask (6fed55c )
no need for retain_graph to be True on losses (b715638 )
nvidia key rotation (eadd702 )
online creation and selfsupervised (42dd435 )
online creation for temporal dataloader (9cf3d0a )
online creation when cropsize > imgsize (c2fc34e )
online mask loading with multiple bboxes (1d7a6e1 )
online mask padding (90d7444 )
only load testset on gpu 0 (abcb4c4 )
only one gpu visible in each process (4954717 )
only onnx export for segformer G (543ce28 )
onnx export (e9d36a8 )
option to not set device when parsing json config file (092b5f8 )
option type and max int value (d7f60f2 )
out mask images can be computed with more than 2 semantic classes (36d54b9 )
output_display_type default value should be a list (d694d29 )
palette inference without ref image (02d3f76 )
projection interpolation at init (2aecf46 )
proper categories titles in schema + improve option saving (bbc41c6 )
python html lib override (be2b089 )
random image aug class argument (2d8f686 )
random offset missing in crop_image (1fc4bf4 )
real image f_s pred name (d40109c )
remove environment.yml (783cae8 )
remove eval mode for export (9bdae8d )
remove last batch norm to cls for batch size 1 (d7fe727 )
remove use_resize from segformer forward (c71d4de )
requirements for github actions (6227c06 )
requirements for github actions (4fe34ce )
resnet attn class call (17f907f )
resnet default number of blocks with diffusion (3d34368 )
resnet with attention class name (950237d )
resnet_attn model name (eb4216e )
resnet_attn option (08b61ea )
reverse DDIM schedule (17ba96a )
round pixel gap before offset sampling (05e8959 )
rst autodoc (5309cc2 )
rst format and typos (f20f264 )
sam compatible with mask inversion in palette (4294679 )
sam inference as discriminator (5708832 )
sample runs and default options (f08d132 )
save generated bbox only when useful in diffusion inference (4b93cf3 )
save metrics plots and allow resuming (095f040 )
save model every epoch is default (51c5d34 )
save networks img for diffusion (090df33 )
saving paths_sanitized in checkpoints dir (49c5139 )
script for image gen using diffusion (e909525 )
segformer as f_s net (d55d63d )
segformer for semantics with optional partially pre-trained model (3078e51 )
segformer G and segformer feature extractor (85171e1 )
segformer ONNX export and image generation script (c5a700b )
selfsupervised dataloaders don't need domain B anymore (257a056 )
sem compute for cls on identity (6a87858 )
semantic losses were initialized twice (3fb85ab )
server launching in generate api doc (f459842 )
server: wrong import (cb4087e )
smaller batchsize at end of epochs (c49ddd1 )
smaller miou interval to trigger computing (76382d0 )
softmax G semantic was applied twice (5c4c3ec )
start super-resolution restoration from noise (3bce4a9 )
stylegan2 feature extraction (e442021 )
support for no mask in B domain (6dd98d6 )
sv network as latest even with train_save_by_iter (9016d3f )
temp: norm default (d95d10f )
temporal criterion loss compute (4ddce8b )
temporal D and context pixels compatibility (421f458 )
temporal dataloader, with added path sanitize (6451db2 )
temporal dataloading and loss computing (07706d3 )
temporal discriminator with masks and bboxes (43015da )
temporal end of sequence (e08ffba )
temporal mse with GAN models (79b0d14 )
test functions names (2247f18 )
tests (fabd8b6 )
tests for all f_s net (144d7fd )
tests with no cache (787f27b )
training examples with mask_delta (9b14339 )
training examples with mask_delta (8d08839 )
typo (4274784 )
typo (f1dcff7 )
typo in README.md (f5ef15e )
typos (5b061f2 )
typos and class name (6462374 )
unaligned data_dataset_mode in README (5e7fbd9 )
underflow in custom cut l2norm, replaced by torch built-in (ff42852 )
UNet/UViT layers for cut NCE (8459876 )
unset index_B for unaligned mask offline datasets (8ba5e12 )
update python version for pre-commit (900ab17 )
Update README.md (cb9031d )
Update README.md (7e5fe82 )
use only mask from selected bbox at inference time (dd4fe3b )
use signal handler to kill all the processes (fab93ef )
using cut with F as pure sampling function (887b9a5 )
validation with no label, e.g. simple cut model (53984f4 )
visdom autostart and no display (fbc9942 )
visdom port for server launching (747d18a )
visuals for cyclegan (b7cabaa )
vitclip16 requires 224 input size (8b8bb67 )
wrong D_noise option (eaea415 )
wrong feature network in projected (3318c77 )
wrong indentation (bac6ce2 )
wrong number of model forward signature parameters check (868b55b )
wrong option for data sanitize (479242e )
wrong option name in test and add comment in json option loading (1a7abd6 )
wrong option prefix (711fa7d )
wrong options (4298071 )
wrong options for validation loading and fid computing (a9d2516 )
wrong train_compute_fid option (893c0e2 )
wrong values for bbox ref coordinates (dc2a00f )