Describe the bug
When using the latest version of datasets (1.14.0), I cannot use the `bookcorpusopen` dataset. The process always blocks around `9924 examples [00:06, 1439.61 examples/s]` when preparing the dataset. I also noticed that after half an hour the process is automatically killed because of its RAM usage (the machine has 1 TB of RAM...). This did not happen with 1.4.1.

I also tried `rm -rf ~/.cache/huggingface`, but it did not help. Changing the Python version between 3.7, 3.8, and 3.9 did not help either.
Steps to reproduce the bug
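The exact script from the report is not preserved in this extraction; a minimal sketch that triggers the preparation step, assuming the standard `load_dataset` API, would be:

```python
from datasets import load_dataset

# Downloading works, but the "preparing the dataset" step stalls
# around 9924 examples and the process is later killed (OOM)
# with datasets==1.14.0.
d = load_dataset("bookcorpusopen", split="train")
```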
Expected results
The dataset is downloaded and prepared to completion, as with datasets 1.4.1, without the process hanging or being killed.
Actual results
Preparation stalls around `9924 examples [00:06, 1439.61 examples/s]` and, after about half an hour, the process is killed because of its RAM usage.
Environment info

- `datasets` version: 1.14.0

I tried with the latest changes from #3280 on Google Colab and it worked fine :)

We'll do a new release soon; in the meantime you can use the updated version with:
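The command itself was not captured in this extraction; installing `datasets` straight from the GitHub repository is the usual way to pick up unreleased changes, so presumably something along these lines:

```
# assumed command; the original snippet was lost in extraction
pip install git+https://github.com/huggingface/datasets.git
```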