I found a tar.gz that broke ratarmount after 16 hours ;-; #66
Thank you for reporting this! And sorry about your wasted computation power. I was not aware that there is a maximum blob size, even though I ran some quite large benchmarks (100 GB tar.gz files). The limit seems to be around 1 GB for the blob size. I might be able to reproduce this problem myself by choosing a smaller seek point spacing. Conversely, you might be able to work around your issue for now by increasing the gzip seek point distance, which reduces the amount of data to write out. But it might slow down random seeking a little bit. One seek point takes up roughly 32 kiB, and the default spacing is 16 MiB, i.e., the index required for gzip seeking is ~0.2% of your original file. As you hit the 1 GB index limit, this means your file must be larger than 512 GB. According to your output log, your file seems to be about 155 GiB. Is that correct? I guess there is some leeway in my calculations somewhere. If you want to give it another try, then please try:
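The estimate above can be checked with a few lines of arithmetic. Note that the 32 kiB per seek point and the 1 GB blob cap are the rough figures from this discussion, not exact constants:

```python
KiB, MiB, GiB = 1024, 1024**2, 1024**3

seek_point_size = 32 * KiB   # rough size of one gzip seek point
spacing         = 16 * MiB   # default seek point spacing
index_cap       = 1 * GiB    # observed maximum blob size

# Index overhead relative to the archive size (~0.2%)
overhead = seek_point_size / spacing
# Archive size at which the index hits the blob cap (512 GiB)
threshold_gib = index_cap // seek_point_size * spacing / GiB

print(f"{overhead:.2%}")   # 0.20%
print(threshold_gib)       # 512.0
```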
This will increase the gzip seek point spacing to 128 MiB, and ratarmount should work if your tar.gz is smaller than 1 GB / 32 kiB * 128 MiB = 4 TiB. Well, or with the ~4x deviation from my estimates, it should work with files smaller than ~1 TB. If your archive is even larger, or close to that, then please choose an appropriately higher seek point spacing with some leeway, because I'm not 100% sure about the 1 GB limit and the 32 kiB is only a rough estimate. The SQLite database is essential for ratarmount to work even if it isn't written to disk. However, the gzip seek points are not essential when the index is not written to disk, so I might be able to avoid dumping them into the database if that database is never going to be written to disk anyway. Then you would have been able to use the index in memory without hitting this limit. The limit also can't be increased much more, only to the 2 GB limit (the maximum signed 32-bit number). I guess I'll have to split the data into multiple smaller blobs to avoid the limit.
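The splitting idea mentioned at the end could be sketched like this. The table name and chunk size here are hypothetical illustrations, not ratarmount's actual schema:

```python
import sqlite3

CHUNK_SIZE = 256 * 1024**2  # hypothetical chunk size, safely below INT_MAX

def store_index(db: sqlite3.Connection, data: bytes) -> None:
    """Split a large gzip index into multiple rows to stay under the blob cap."""
    db.execute('CREATE TABLE IF NOT EXISTS gzipindexes ( data BLOB )')
    for offset in range(0, len(data), CHUNK_SIZE):
        db.execute('INSERT INTO gzipindexes VALUES (?)',
                   (data[offset:offset + CHUNK_SIZE],))

def load_index(db: sqlite3.Connection) -> bytes:
    """Reassemble the index by concatenating the chunks in insertion order."""
    return b''.join(row[0] for row in
                    db.execute('SELECT data FROM gzipindexes ORDER BY rowid'))
```

A round trip through an in-memory database (`sqlite3.connect(':memory:')`) shows that `load_index` returns exactly the bytes passed to `store_index`, regardless of how many rows the data was split into.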
The archive itself is 1.51TB, so you're slightly off
This is almost certainly the best choice
I must have forgotten a 0 somewhere. Well, sorry about editing your post. I wanted to quote it, not edit it...
I pushed a fix. You can try it out with: pip install git+https://github.com/mxmlnkn/ratarmount.git@fix-gzindex-max-blob-size#egg=ratarmount
I'll be without internet for a few weeks, so hopefully I can try this when I get internet again |
Hopefully fixed in 0.8.1. Please let me know if it also works for you, and if not, feel free to reopen this issue.
fusepy/fusepy #66, #67, #101; fusepy/fusepy #100

First test with ratarmount worked!

- [ ] I only monkey-patched readdir and getattr. The other changed methods should also be adjusted and tested, and maybe we can do better, e.g., by letting the caller decide which interface they want to implement with a member variable as a flag. Or do it via inspection like in fusepy/fusepy#101, but the overhead might be a killer.
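The inspection idea could be sketched as follows. The class and parameter names here are hypothetical illustrations, not fusepy's actual API:

```python
import inspect

class OldStyleOps:
    def readdir(self, path, fh):
        return ['.', '..']

class NewStyleOps:
    def readdir(self, path, fh, offset):
        return ['.', '..']

def supports_offset(operations) -> bool:
    # Decide which readdir variant the implementer provided by inspecting
    # its signature, instead of requiring an explicit member-variable flag.
    return 'offset' in inspect.signature(operations.readdir).parameters

print(supports_offset(OldStyleOps()))  # False
print(supports_offset(NewStyleOps()))  # True
```

The overhead concern could be mitigated by running the inspection once at mount time and caching the result, rather than on every call.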
```
Currently at position 1664643022848 of 1664666965947 (100.00%). Estimated time remaining with current rate: 0 min 1 s, with average rate: 0 min 0 s.
Creating offset dictionary for /home/download/thefile.tar.gz took 57789.62s
Traceback (most recent call last):
  File "/usr/local/bin/ratarmount", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.8/dist-packages/ratarmount.py", line 2604, in cli
    fuseOperationsObject = TarMount(
  File "/usr/local/lib/python3.8/dist-packages/ratarmount.py", line 1937, in __init__
    self.mountSources: List[Union[SQLiteIndexedTar, FolderMountSource]] = [
  File "/usr/local/lib/python3.8/dist-packages/ratarmount.py", line 1938, in <listcomp>
    SQLiteIndexedTar(tarFile, writeIndex=True, **sqliteIndexedTarOptions)
  File "/usr/local/lib/python3.8/dist-packages/ratarmount.py", line 504, in __init__
    self._loadOrStoreCompressionOffsets() # store
  File "/usr/local/lib/python3.8/dist-packages/ratarmount.py", line 1622, in _loadOrStoreCompressionOffsets
    db.execute('INSERT INTO gzipindex VALUES (?)', (file.read(),))
OverflowError: BLOB longer than INT_MAX bytes
```