Issue with offline mode #4760
Comments
Hi @SaulLu, thanks for reporting. I think offline mode is not supported for datasets containing only data files (without any loading script). I'm having a look into this...
Thanks for your feedback! To give you a little more info, if you don't set the offline mode flag, the script will load the cache. I first noticed this behavior with the
This is an issue we have to fix.
This is related to #3547
Still not fixed?
#5331 will be helpful to fix this, as it updates the cache directory template to be aligned with the other datasets
Any updates?
I'm facing the same problem
This issue has been fixed in You just have to update
I'm on version 2.17.0, and this exact problem is still persisting.
Can you share some code to reproduce your issue? Also make sure your cache was populated with recent versions of
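When sharing a reproduction, it helps to state which version is actually installed in the environment that fails. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of `datasets`:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the version of the library discussed in this thread.
print("datasets version:", installed_version("datasets"))
```

Note that this only reports the installed version; it says nothing about which version originally wrote the files in the cache.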
I'm not sure if this is related @lhoestq but I am experiencing a similar issue when using offline mode:
$ python -c "from datasets import load_dataset; load_dataset('openai_humaneval', split='test')"
$ HF_DATASETS_OFFLINE=1 python -c "from datasets import load_dataset; load_dataset('openai_humaneval', split='test')"
Using the latest cached version of the dataset since openai_humaneval couldn't be found on the Hugging Face Hub (offline mode is enabled).
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/dodrio/scratch/projects/2023_071/alignment-handbook/.venv/lib/python3.10/site-packages/datasets/load.py", line 2556, in load_dataset
    builder_instance = load_dataset_builder(
  File "/dodrio/scratch/projects/2023_071/alignment-handbook/.venv/lib/python3.10/site-packages/datasets/load.py", line 2265, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "/dodrio/scratch/projects/2023_071/alignment-handbook/.venv/lib/python3.10/site-packages/datasets/packaged_modules/cache/cache.py", line 122, in __init__
    config_name, version, hash = _find_hash_in_cache(
  File "/dodrio/scratch/projects/2023_071/alignment-handbook/.venv/lib/python3.10/site-packages/datasets/packaged_modules/cache/cache.py", line 48, in _find_hash_in_cache
    raise ValueError(
ValueError: Couldn't find cache for openai_humaneval for config 'default'
Available configs in the cache: ['openai_humaneval']
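The error message suggests a name mismatch: the offline lookup asks for a config called 'default', while the cache only lists one called 'openai_humaneval'. The sketch below is a simplified, hypothetical model of that kind of lookup, written only to illustrate the mismatch. It is not the code of `_find_hash_in_cache` in `datasets`, and `find_config_in_cache` is an invented name:

```python
def find_config_in_cache(dataset_name, requested_config, cached_configs):
    """Toy lookup: succeed only if the requested config name is in the cache listing."""
    if requested_config in cached_configs:
        return requested_config
    raise ValueError(
        f"Couldn't find cache for {dataset_name} for config '{requested_config}'\n"
        f"Available configs in the cache: {cached_configs}"
    )


# The cache was written under one name, but the offline lookup asks for another.
try:
    find_config_in_cache("openai_humaneval", "default", ["openai_humaneval"])
except ValueError as err:
    print(err)
```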
Thanks for reporting @BramVanroy, I managed to reproduce and I opened a fix here: #6741
Awesome, thanks for the quick fix @lhoestq! Looking forward to updating my dependency version list.
Thanks a lot! I have faced the same problem. Can I use your fix code to directly replace the existing version code? I noticed that this fix has not been merged yet. Will it affect other functionalities?
I just merged the fix, you can install
Describe the bug
I can't retrieve a cached dataset with offline mode enabled
Steps to reproduce the bug
To reproduce my issue, first run a script that caches the dataset:
Then try to reload it in offline mode:
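A minimal sketch of these two steps, assuming the `openai_humaneval` dataset from the comments stands in for the dataset of the original report. The `HF_DATASETS_OFFLINE=1` variable and the `load_dataset` call are taken from this thread; `offline_env` and `run_repro` are hypothetical helper names:

```python
import os
import subprocess
import sys

# Assumption: dataset name borrowed from a comment in this thread,
# not from the original report.
DATASET = "openai_humaneval"
LOAD_CMD = f"from datasets import load_dataset; load_dataset({DATASET!r}, split='test')"


def offline_env(base=None):
    """Return a copy of the environment with datasets offline mode enabled."""
    env = dict(os.environ if base is None else base)
    env["HF_DATASETS_OFFLINE"] = "1"
    return env


def run_repro():
    """Cache the dataset online, then try to reload it in offline mode."""
    # Step 1: online run, which populates the local cache (needs network access).
    subprocess.run([sys.executable, "-c", LOAD_CMD], check=True)
    # Step 2: same call with offline mode on; expected to be served from the cache.
    return subprocess.run([sys.executable, "-c", LOAD_CMD], env=offline_env())


# Usage (requires `datasets` installed and network access for step 1):
#   result = run_repro()
#   print("offline reload exit code:", result.returncode)
```

Running each step in its own subprocess keeps the offline flag from leaking into the caching step, and ensures it is set before `datasets` is imported in the second step.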
Expected results
I would have expected the 2nd snippet to run without errors
Actual results
The 2nd snippet returns:
Environment info
datasets version: 2.4.0
Maybe I'm misunderstanding something in the use of the offline mode (see doc), is that the case?