2.7.0

@albertvillanova albertvillanova released this 16 Nov 10:11
· 756 commits to main since this release
edf1902

Dataset Features

  • Multiprocessed dataset builder by @TevenLeScao in #5107
    • Prepare large datasets faster by passing num_proc to load_dataset, which spreads the work across several processes:
    from datasets import load_dataset
    ds = load_dataset("imagenet-1k", num_proc=4)
  • Make torch.Tensor and spacy models cacheable by @mariosasko in #5191
    • Functions passed to map or filter that use tensors or spaCy pipelines can now be cached
  • Drop labels in ImageFolder and AudioFolder if files are at different levels of the directory tree or if there is only one label by @polinaeterna in #5192
  • TextConfig: added "errors" by @NightMachinery in #5155
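
The "errors" option added to TextConfig appears to mirror the errors argument of Python's built-in open(). That mapping is an assumption here and is not stated in these notes. A minimal standard-library sketch of what the common handlers do with an undecodable byte (the file name and contents are made up for illustration):

```python
import os
import tempfile

# One valid UTF-8 line followed by a line containing an invalid byte (0xff).
raw = b"first line\nbad byte: \xff\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)

    def read_lines(errors):
        """Read the sample file as UTF-8 with the given error handler."""
        with open(path, encoding="utf-8", errors=errors) as f:
            return f.read().splitlines()

    # "strict" (Python's default) raises on the invalid byte.
    try:
        read_lines("strict")
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # "replace" substitutes U+FFFD; "ignore" silently drops the byte.
    replaced = read_lines("replace")
    ignored = read_lines("ignore")

print(strict_failed, replaced, ignored)
```

If the assumption holds, choosing "replace" or "ignore" would let a text file with stray bytes load instead of failing with UnicodeDecodeError, at the cost of altering or losing the affected characters.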

Audio setup

Docs

General improvements and bug fixes

New Contributors

Full Changelog: 2.6.1...2.7.0