
Join our Discord

neosr is an open-source framework for training super-resolution models. It provides a comprehensive and reproducible environment for achieving state-of-the-art image restoration results, making it suitable for enthusiasts, professionals and academic machine learning researchers alike. It serves as a versatile platform and aims to bridge the gap between practical application and academic research in the field.

  • Accessible: implements a wide range of the latest advancements in single-image super-resolution networks, losses, optimizers and augmentations. Users can easily explore, adapt and experiment with various configurations for their specific needs, even without coding skills.

  • Efficient: optimized for faster training iterations, quicker convergence and low GPU requirements, making it an efficient choice for both research and practical use cases.

  • Practical: focuses on the real-world use of super-resolution to realistically restore degraded images in various domains, including photos, anime/cartoons, illustrations and more. It's also suitable for critical applications like medical imaging, forensics, geospatial and others (although caution should be taken in those cases).

  • Reproducible: this framework emphasizes the importance of reproducible research. It provides deterministic training environments that can create bit-exact reproducible models (on the same platform), ensuring predictable and reliable results, which are essential for maintaining consistency in academic validation.

  • Simple: features are easy to implement or modify. Code is written in readable Python, no fancy styling. All code is validated and formatted by ruff, mypy and torchfix.

For more information see our wiki.

🤝 support the project

Tip

Consider supporting the project on KoFi ☕ or Patreon

💻 installation

Requires Python 3.12 and CUDA >=12.4. Clone the repository and install via poetry:

git clone https://github.com/neosr-project/neosr
cd neosr
poetry install --sync

See the Installation Instructions for more details.

⏩ quick start

Start training by running:

python train.py -opt options.toml

Where options.toml is a configuration file. Templates can be found in options.

Tip

Please read the wiki Configuration Walkthrough for an explanation of each option.

✨ features

| arch | option |
| --- | --- |
| Real-ESRGAN | `esrgan` |
| SRVGGNetCompact | `compact` |
| SwinIR | `swinir_small`, `swinir_medium` |
| HAT | `hat_s`, `hat_m`, `hat_l` |
| OmniSR | `omnisr` |
| SRFormer | `srformer_light`, `srformer_medium` |
| DAT | `dat_small`, `dat_medium`, `dat_2` |
| DITN | `ditn` |
| DCTLSA | `dctlsa` |
| SPAN | `span` |
| Real-CUGAN | `cugan` |
| CRAFT | `craft` |
| SAFMN | `safmn`, `safmn_l` |
| RGT | `rgt`, `rgt_s` |
| ATD | `atd`, `atd_light` |
| PLKSR | `plksr`, `plksr_tiny` |
| RealPLKSR | `realplksr`, `realplksr_s` |
| DRCT | `drct`, `drct_l`, `drct_s` |
| MSDAN | `msdan` |
| SPANPlus | `spanplus`, `spanplus_sts`, `spanplus_s`, `spanplus_st` |
| HiT-SRF | `hit_srf`, `hit_srf_medium`, `hit_srf_large` |
| HMA | `hma`, `hma_medium`, `hma_large` |
| MAN | `man`, `man_tiny`, `man_light` |
| light-SAFMN++ | `light_safmnpp` |
| MoSR | `mosr`, `mosr_t` |
| GRFormer | `grformer`, `grformer_medium`, `grformer_large` |
| EIMN | `eimn`, `eimn_a`, `eimn_l` |
| LMLT | `lmlt`, `lmlt_tiny`, `lmlt_large` |
| DCT | `dct` |
| KRGN | `krgn` |
| PlainUSR | `plainusr`, `plainusr_ultra`, `plainusr_large` |
| HASN | `hasn` |
| FlexNet | `flexnet`, `metaflexnet` |
| CFSR | `cfsr` |
| Sebica | `sebica`, `sebica_mini` |

Note

For all arch-specific parameters, read the wiki.

under testing

| arch | option |
| --- | --- |
| NinaSR | `ninasr`, `ninasr_b0`, `ninasr_b2` |

| net | option |
| --- | --- |
| U-Net w/ SN | `unet` |
| PatchGAN w/ SN | `patchgan` |
| EA2FPN (bespoke, based on A2-FPN) | `ea2fpn` |
| DUnet | `dunet` |
| MetaGan | `metagan` |

| optimizer | option |
| --- | --- |
| Adam | `Adam` or `adam` |
| AdamW | `AdamW` or `adamw` |
| NAdam | `NAdam` or `nadam` |
| Adan | `Adan` or `adan` |
| AdamW Win2 | `AdamW_Win` or `adamw_win` |
| ECO strategy | `eco`, `eco_iters` |
| AdamW Schedule-Free | `adamw_sf` |
| Adan Schedule-Free | `adan_sf` |
| F-SAM | `fsam`, `FSAM` |
| SOAP | `soap` |

| loss | option |
| --- | --- |
| L1 Loss | `L1Loss`, `l1_loss` |
| L2 Loss | `MSELoss`, `mse_loss` |
| Huber Loss | `HuberLoss`, `huber_loss` |
| CHC (Clipped Huber with Cosine Similarity Loss) | `chc_loss` |
| NCC (Normalized Cross-Correlation) | `ncc_opt`, `ncc_loss` |
| Perceptual Loss | `perceptual_opt`, `vgg_perceptual_loss` |
| GAN | `gan_opt`, `gan_loss` |
| MS-SSIM | `mssim_opt`, `mssim_loss` |
| LDL Loss | `ldl_opt`, `ldl_loss` |
| Focal Frequency | `ff_opt`, `ff_loss` |
| DISTS | `dists_opt`, `dists_loss` |
| Wavelet Guided | `wavelet_guided` |
| Perceptual Patch Loss | `perceptual_opt`, `patchloss`, `ipk` |
| Consistency Loss (Oklab and CIE L*) | `consistency_opt`, `consistency_loss` |
| KL Divergence | `kl_opt`, `kl_loss` |
| MS-SWD | `msswd_opt`, `msswd_loss` |
| FDL | `fdl_opt`, `fdl_loss` |

| augmentation | option |
| --- | --- |
| Rotation | `use_rot` |
| Flip | `use_hflip` |
| MixUp | `mixup` |
| CutMix | `cutmix` |
| ResizeMix | `resizemix` |
| CutBlur | `cutblur` |

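To give a feel for what a mixing augmentation such as MixUp does, here is a minimal, generic sketch in plain Python. It is not neosr's implementation: flat lists of floats stand in for image tensors, the helper names are hypothetical, and `alpha=0.4` is an arbitrary illustrative value. The sketch assumes that, for paired training, the same mixing weight is applied to the low-quality and ground-truth images so each pair stays consistent.

```python
import random


def mixup(img_a, img_b, lam):
    """Blend two equally sized images per pixel: lam * a + (1 - lam) * b."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    return [lam * a + (1.0 - lam) * b for a, b in zip(img_a, img_b)]


def sample_lambda(rng, alpha=0.4):
    """Draw the mixing weight from a Beta(alpha, alpha) distribution."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(lq_a, gt_a, lq_b, gt_b, lam):
    """Mix LQ and GT images with the same weight so the pair stays aligned."""
    return mixup(lq_a, lq_b, lam), mixup(gt_a, gt_b, lam)


rng = random.Random(0)  # fixed seed keeps the sketch repeatable
lam = sample_lambda(rng)
mixed_lq, mixed_gt = mixup_pair([0.0, 1.0], [0.0, 0.5, 1.0, 0.5],
                                [1.0, 0.0], [1.0, 0.5, 0.0, 0.5], lam)
```
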
| model | description | option |
| --- | --- | --- |
| Image | Base model for SISR, supports both Generator and Discriminator | `image` |
| OTF | Builds on top of `image`, adding Real-ESRGAN on-the-fly degradations | `otf` |

| loader | option |
| --- | --- |
| Paired datasets | `paired` |
| Single datasets (for inference, no GT required) | `single` |
| Real-ESRGAN on-the-fly degradation | `otf` |

📸 datasets

As part of neosr, I have released a dataset series called Nomos. The purpose of these datasets is to distill only the best images from the academic and community datasets. A total of 14 datasets were manually reviewed and processed, including: Adobe-MIT-5k, RAISE, LSDIR, LIU4k-v2, KONIQ-10k, Nikon LL RAW, DIV8k, FFHQ, Flickr2k, ModernAnimation1080_v2, Rawsamples, SignatureEdits, Hasselblad raw samples and Unsplash.

  • Nomos-v2 (recommended): contains 6000 images, multipurpose. Data distribution:

| category (Nomos-v2 distribution) | images |
| --- | --- |
| Animal / fur | 439 |
| Interiors | 280 |
| Exteriors / misc | 696 |
| Architecture / geometric | 1470 |
| Drawing / painting / anime | 1076 |
| Humans | 598 |
| Mountain / Rocks | 317 |
| Text | 102 |
| Textures | 439 |
| Vegetation | 574 |

  • nomos_uni (recommended for lightweight networks): contains 2989 images, multipurpose. Intended for networks with fewer than 800k parameters.
  • hfa2k: contains 2568 anime images.

| dataset download | sha256 |
| --- | --- |
| nomosv2 (3GB) | sha256 |
| nomosv2.lmdb (3GB) | sha256 |
| nomosv2_lq_4x (187MB) | sha256 |
| nomosv2_lq_4x.lmdb (187MB) | sha256 |
| nomos_uni (1.3GB) | sha256 |
| nomos_uni_lq_4x | sha256 |
| hfa2k | sha256 |
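
Since each download is published alongside a SHA-256 checksum, a downloaded archive can be checked before use. The following is a minimal sketch using only Python's standard library; the function names and the file name in the usage comment are hypothetical, and the expected digest must be copied from the corresponding sha256 link.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare digests case-insensitively, ignoring surrounding whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (hypothetical file name, digest taken from the published sha256):
# matches_checksum("nomosv2.zip", "<published sha256 digest>")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.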

community datasets

Datasets made by the upscaling community. More info can be found in each author's repository.

  • DF2k-BHI: a curated version of the classic DF2k dataset, made by @Phhofm. Read more about it here.
  • 4xNomosRealWeb Dataset: realistically degraded LQs for the Nomos-v2 dataset (from @Phhofm).
  • FaceUp: Curated version of FFHQ.
  • SSDIR: Curated version of LSDIR.
  • ArtFaces: Curated version of MetFaces.
  • Nature Dataset: Curated version of iNaturalist.
  • digital_art_v2: Digital art dataset from @umzi2.

| dataset | download |
| --- | --- |
| @Phhofm | HuggingFace |
| @Phhofm 4xNomosRealWeb | Release page |
| @Phhofm FaceUp | GDrive (4GB) |
| @Phhofm SSDIR | GDrive (4.5GB) |
| @Phhofm ArtFaces | Release page |
| @Phhofm Nature Dataset | Release page |
| @umzi2 Digital Art (v2) | Release page |

📖 resources

📄 license and acknowledgements

Released under the Apache license. All licenses are listed in license/readme. This code was originally based on BasicSR.

Thanks to victorca25/traiNNer, styler00dollar/Colab-traiNNer and timm for providing helpful insights into some problems.

Thanks to active contributors @Phhofm, @Sirosky, and @umzi2 for helping with tests and bug reporting.