Load various datasets in Python
This repo contains pydatset, a package for loading (and eventually augmenting) datasets in Python. See each function's docstring for details. Supported datasets include cifar10, Tiny-ImageNet and GTSRB.
Pull requests are welcome!
Requirements:
- numpy
- scipy (used to load Tiny-ImageNet)
- cv2 (used to load GTSRB)
- scikit-image (used for data augmentation)
Install pip, then run:
pip install git+https://github.com/dnlcrl/PyDatSet.git
Download the required dataset (e.g. cifar10) and call the corresponding load(path) function, for example:
$ python
>>> from pydatset import cifar10
>>> data = cifar10.load('path/to/cifar10')
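Because load(path) expects the dataset to already be on disk, it can help to check the path before loading. The sketch below is not part of pydatset: `load_checked` is a hypothetical helper, and it assumes only that the loader takes a single path argument, as in the call above.

```python
import os


def load_checked(load_fn, path):
    """Call load_fn(path) after confirming that path exists.

    load_checked is a hypothetical helper, not part of pydatset.
    load_fn is any callable that takes a path, e.g. cifar10.load.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(
            "Dataset not found at %r; download it first." % path)
    return load_fn(path)
```

With pydatset installed, this would be called as `load_checked(cifar10.load, 'path/to/cifar10')`.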
To apply data augmentation to a batch, chain the augmentation functions, for example:
>>> from pydatset.data_augmentation import (random_contrast, random_flips,
... random_crops, random_rotate,
... random_tint)
>>> batch = random_tint(batch)
>>> batch = random_contrast(batch)
>>> batch = random_flips(batch)
>>> batch = random_rotate(batch, 10)
>>> batch = random_crops(batch, (32, 32), pad=2)
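The chain above can also be written as a list of steps applied in order, which makes it easier to reorder or disable a step. This is a minimal sketch, not part of pydatset: `augment` is a hypothetical helper, and it assumes each step takes a batch and returns the augmented batch, as the calls above do.

```python
def augment(batch, steps):
    """Apply each callable in steps to batch, in order.

    augment is a hypothetical helper, not part of pydatset.
    Each step must accept a batch and return a batch.
    """
    for step in steps:
        batch = step(batch)
    return batch


# Usage with the functions imported above (assumes pydatset is installed).
# Steps that need extra arguments are wrapped in a lambda:
#
# steps = [
#     random_tint,
#     random_contrast,
#     random_flips,
#     lambda b: random_rotate(b, 10),
#     lambda b: random_crops(b, (32, 32), pad=2),
# ]
# batch = augment(batch, steps)
```

Keeping the steps in a list means the same pipeline can be reused for every training batch.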
Check pydatset/README.md for more information about the package contents.