ValueError: Extraction of MNIST\raw\train-images-idx3-ubyte not supported #3554

Closed
Wemmons831 opened this issue Mar 11, 2021 · 6 comments

@Wemmons831

Hey, I am trying to use the MNIST dataset with:
train = datasets.MNIST("", train=True, download=True, transform=transforms.Compose([transforms.ToTensor()]))
test = datasets.MNIST("", train=False, download=True, transform=transforms.Compose([transforms.ToTensor()]))
However, I got a 503 error because the download link is down. I know you guys have no control over this so I downloaded the files from another website. There is probably a better way to do this, but I couldn't find the file path to put them in, so I hosted the files on a Flask website and changed the URL in mnist.py to point to it. This worked and all the files downloaded, but now when I run it I get the following error:
ValueError: Extraction of MNIST\raw\train-images-idx3-ubyte not supported
I tried unzipping the files and placing them in several different folders, but that didn't help. Does anyone know how to fix this?
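
On the question of where manually downloaded files belong: the error message shows the loader looking under MNIST\raw relative to the given root. A minimal sketch, assuming the loader expects the four gzip archives under `<root>/MNIST/raw` with the names used on the original MNIST site (the helper `missing_mnist_archives` is hypothetical and not part of torchvision), that reports which archives are absent:

```python
from pathlib import Path

# Archive names as published on the original MNIST site. That the loader
# looks for exactly these names is an assumption; check your torchvision version.
EXPECTED_ARCHIVES = (
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
)


def missing_mnist_archives(root):
    """Return the expected archive names not found in <root>/MNIST/raw."""
    raw_dir = Path(root) / "MNIST" / "raw"
    return [name for name in EXPECTED_ARCHIVES if not (raw_dir / name).is_file()]
```

Calling `missing_mnist_archives("")` mirrors the empty root used in the snippet above and returns an empty list once all four archives are in place.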

Wemmons831 changed the title from "error with MNIST" to "ValueError: Extraction of MNIST\raw\train-images-idx3-ubyte not supported" on Mar 11, 2021
albanD transferred this issue from pytorch/pytorch on Mar 12, 2021
@CanIyu

CanIyu commented Mar 12, 2021

You can solve this with the commands below.
Note that the download took about 10 minutes.

train_val = datasets.MNIST('./', train=True, download=True, transform=transform)
test = datasets.MNIST('./', train=False, download=True, transform=transform)

@Wemmons831
Author

Wemmons831 commented Mar 12, 2021

@Canlyu I still get the same error. Could it be that I have the wrong files? I am just using the latest ones from the website via the Wayback Machine.

@Wemmons831
Author

It is trying to extract a file that has already been extracted from the gzip archive.
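
One way to confirm this diagnosis is to check whether a file on disk is still gzip-compressed. A minimal sketch, relying only on the fact that every gzip stream starts with the two bytes 0x1f 0x8b (the helper name `is_gzip_file` is hypothetical):

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip_file(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC
```

A file for which this returns False has already been decompressed (or was never gzip data), regardless of its extension.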

@ptrblck
Contributor

ptrblck commented Mar 12, 2021

Cross-posting:
It seems the MNIST server might be down (or is dying often):
https://discuss.pytorch.org/t/mnist-server-down/114433

http://yann.lecun.com/exdb/mnist/ shows Service Unavailable randomly.

@pmeier
Contributor

pmeier commented Mar 12, 2021

I know you guys have no control over this so

We got permission to host the dataset ourselves and provide the download in case the original server breaks again. See #3544 for details.
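
The idea of falling back to a second host when the original server fails can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation from #3544: the function `download_first_available` and the list of URLs passed to it are hypothetical, and only standard-library calls are used.

```python
import shutil
import urllib.error
import urllib.request


def download_first_available(urls, dest, timeout=30):
    """Try each URL in order, save the first successful response to `dest`,
    and return the URL that worked."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response, \
                    open(dest, "wb") as out:
                shutil.copyfileobj(response, out)
            return url
        except (urllib.error.URLError, OSError) as exc:
            # HTTP errors such as 503 arrive here as urllib.error.HTTPError,
            # a subclass of URLError; remember the error and try the next URL.
            last_error = exc
    raise RuntimeError(f"all {len(urls)} URLs failed") from last_error
```

Listing the original server first and a mirror second means the mirror is contacted only when the first request fails.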

ValueError: Extraction of MNIST\raw\train-images-idx3-ubyte not supported

This happens because we currently use datasets.utils.extract_archive to also decompress files. Since you have already extracted the file, it doesn't know what to do with it. I'm working on a proper fix for this in #3443 as we speak.

As a temporary workaround, you can do something like this to compress the files again:

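import gzip

# `files` is assumed to be a list of paths to the already-decompressed raw files.
for file in files:
    # Write a gzip-compressed copy next to each file so it can be extracted again.
    with open(file, "rb") as rfh, gzip.open(f"{file}.gz", "wb") as wfh:
        wfh.write(rfh.read())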
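
A self-contained variant of the same workaround, as a sketch: the function name `recompress_files` is hypothetical, and it streams each file in chunks with shutil.copyfileobj instead of reading it into memory at once.

```python
import gzip
import shutil
from pathlib import Path


def recompress_files(paths):
    """Write a gzip-compressed copy '<path>.gz' next to each given file
    and return the list of paths written."""
    written = []
    for path in map(Path, paths):
        target = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # copy in chunks
        written.append(target)
    return written
```

The original files are left in place, so the decompressed copies can be removed by hand once the dataset loads.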

@Avs163

Avs163 commented May 11, 2021

You can solve this with the commands below.
Note that the download took about 10 minutes.

train_val = datasets.MNIST('./', train=True, download=True, transform=transform)
test = datasets.MNIST('./', train=False, download=True, transform=transform)

Thanks man, this is working fine :)
