Failed to download CelebA dataset using download=True #1920
Comments
The error message
means that the traffic for this file (size and number of downloads) exceeds a limit or quota set by Google Drive. Since we do not host the dataset, this is not an error on our side and there is nothing we can do about it. According to the answer in the above link, this quota is reset every 24 hours, so a possible fix is to try again later and hope that the traffic limit has not been reached yet.
Thanks @pmeier for the help! It looks like there is not much we can do. Please try again in some time and let us know if the problem persists. As such, I'm closing this issue.
It has been nearly a year since this issue was opened, and the error still pops up, @pmeier.
@MohamedAliRashad What do you mean by
? There is no way for us to get around this error, since we are not hosting the dataset. See my previous comment #1920 (comment) for details.
@pmeier
Of course they can, but this is not for us to decide. If you think there is a better hosting solution you need to get in contact with the authors. Note our disclaimer at the bottom of our README:
Can I just point out a workaround that worked for me, rather than trying my luck every 24 hours? The files needed for the CelebA dataset, as defined in the file list in torchvision's CelebA class, are: img_align_celeba.zip, list_attr_celeba.txt, identity_CelebA.txt, list_bbox_celeba.txt, list_landmarks_align_celeba.txt, list_eval_partition.txt. I downloaded them directly from the authors' Google Drive link here and placed them in {root}/celeba, where root is the directory you specify when calling the CelebA class.
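Before calling the CelebA class after a manual download, it can help to confirm that every file from the list above is actually in place. The following is only a sketch using the standard library; the helper name `missing_celeba_files` is hypothetical and not part of torchvision, and the file names are copied from the comment above.

```python
from pathlib import Path

# File names as listed in the workaround above.
REQUIRED_FILES = [
    "img_align_celeba.zip",
    "list_attr_celeba.txt",
    "identity_CelebA.txt",
    "list_bbox_celeba.txt",
    "list_landmarks_align_celeba.txt",
    "list_eval_partition.txt",
]


def missing_celeba_files(root):
    """Return the required file names that are absent from {root}/celeba.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    base = Path(root) / "celeba"
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling `missing_celeba_files("data")` returns an empty list once all six files are in `data/celeba`.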
@MohanadOdema
@MohanadOdema we should be doing exactly the same thing within the download functionality, albeit automatically. I can confirm that I get different links when doing this manually. I'll investigate.
Can we reopen this? I just ran into this issue again. I was so happy to have this super simple solution and so disappointed when I ran into this issue :)
Friendly ping to one of the authors @liuziwei7, just to make you aware:
This was fixed as well as we can in #4109. Starting from the next release, we bail out early if the download fails instead of simply writing the failure message into the file.
Since AFAIK CelebA is the only dataset hosted on Baidu Cloud, and the problem can be solved by waiting and trying again, it currently has no priority. We would accept a PR, though, if you or someone else wants to add the functionality.
This has nothing to do with PyTorch, but with the dataset being hosted on Google Drive. Each file has a daily quota there. If it is met, i.e. the file was downloaded X times that day, Google Drive simply refuses the download if you try again.
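The failure mode described in this thread (an HTML quota page saved where the archive should be) can be detected before the file is opened as a zip. Below is a minimal sketch using only the standard library. The function name `looks_like_quota_page` is hypothetical, the HTML check is a heuristic, and this is not a description of how torchvision implements its check in #4109.

```python
import zipfile


def looks_like_quota_page(path):
    """Heuristically detect an HTML page saved in place of a zip archive.

    Hypothetical helper: returns False for a valid zip file, and True when
    the first bytes of a non-zip file look like an HTML document.
    """
    if zipfile.is_zipfile(path):
        return False
    with open(path, "rb") as f:
        head = f.read(1024).lower()
    return b"<html" in head or b"<!doctype html" in head
```

If this returns True for `img_align_celeba.zip`, the download most likely hit the quota, and deleting the file before retrying later avoids reusing the bad copy.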
Thank you, it works for me.
I'm trying to do this currently to no avail. Do you know if this is still a functional workaround?
Hey @cooperflourens, try manually downloading from the Google Drive link. You need to log in to Google for this. For more information, please see the discussions in #5704 and #6052.
Hey @abhi-glitchhg, thanks for your reply. I downloaded those files and set download=True and it worked. I think my problem before was that I had download set to False. Thank you for your help!
This worked for me too. Thank you!
Thanks for the workaround. Not sure if the code's changed recently, but fwiw, I also had to unzip img_align_celeba.zip into the celeba/ directory to get it working.
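The extra unzip step mentioned above can be done with the standard library. This is a sketch under the assumption that the archive was already placed at {root}/celeba/img_align_celeba.zip as in the earlier workaround; the helper name `extract_celeba_images` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_celeba_images(root):
    """Unzip {root}/celeba/img_align_celeba.zip into {root}/celeba.

    Hypothetical helper for illustration. Raises zipfile.BadZipFile if the
    file is not a real archive (for example, a saved quota page).
    """
    base = Path(root) / "celeba"
    archive = base / "img_align_celeba.zip"
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(base)
    return base
```

For example, `extract_celeba_images("data")` unpacks the images under `data/celeba`.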
🐛 Bug
It fails to download the following files
1. Rather than the zip file, it downloads an HTML file, "Google Drive - Quota exceeded". This returns a BadZipFile error.
2. Similarly, "Google Drive - Quota exceeded". This time it returns RuntimeError('Dataset not found or corrupted.' + ' You can use download=True to download it').
3. Similar to number 2.
To Reproduce
Steps to reproduce the behavior:
from torchvision import datasets, transforms
train_dataset = datasets.CelebA('data', split="train", transform=transforms.ToTensor(), download=True)
Expected behavior
Environment
PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0
OS: Microsoft Windows 10 Home Single Language
GCC version: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0
CMake version: Could not collect
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Versions of relevant libraries:
[pip3] numpy==1.17.0
[pip3] torch==1.2.0
[pip3] torchtext==0.4.0
[pip3] torchvision==0.4.0
[conda] Could not collect
Additional context