
Use quickxorhash module if available #5

Closed

Conversation


@wienand wienand commented Apr 11, 2024

If the quickxorhash module is available, use it instead of a subprocess call to the quickxorhash executable.

Install the module with:

    pip install quickxorhash
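For context, a minimal sketch of the optional import this describes, assuming the module is simply preferred whenever it can be imported (the flag name is hypothetical, and the exact fallback invocation of the executable is left out):

    try:
        import quickxorhash as qxh  # prefer the native module when installed
        HAS_QXH_MODULE = True
    except ImportError:
        qxh = None
        HAS_QXH_MODULE = False  # fall back to the external quickxorhash executable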

@wienand wienand closed this Apr 11, 2024
@wienand wienand deleted the quickxorhash-module branch April 11, 2024 11:27
@wienand wienand restored the quickxorhash-module branch April 11, 2024 11:28
@wienand wienand deleted the quickxorhash-module branch April 11, 2024 11:28
@wienand wienand restored the quickxorhash-module branch April 11, 2024 11:28
@wienand wienand reopened this Apr 11, 2024
@incognito1234
Owner

incognito1234 commented Jul 5, 2024

Hi Wienand,

Thanks for this patch. Unfortunately, it does not work with large files. Here is the result of a test with a 5 GB file:

>>> from lib.check_helper import *
>>> q=quickxorhash()
>>> q.quickxorhash("../largefile.dat")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pi/src/odc/lib/check_helper.py", line 25, in quickxorhash
    h.update(open(filename, 'rb').read())
OverflowError: unbounded read returned more bytes than a Python bytes object can hold

@wienand
Author

wienand commented Jul 6, 2024

You could read the file in chunks and call the update method for each chunk, instead of reading the whole file at once:

      h = qxh.quickxorhash()
      h.update(open(filename, 'rb').read())
      return base64.b64encode(h.digest()).decode('utf8')

e.g.

      chunksize = 1024 * 1024 * 5
      h = qxh.quickxorhash()
      with open(filename, 'rb') as datafile:
          while True:
              chunk = datafile.read(chunksize)
              if chunk == b'':
                  break
              h.update(chunk)
      return base64.b64encode(h.digest()).decode('utf8')
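As a side note, the same chunked loop can also be written with iter() and an empty-bytes sentinel; this is only a stylistic variant of the snippet above, with a hypothetical helper name:

    import base64
    import quickxorhash as qxh

    def quickxorhash_file(filename, chunksize=1024 * 1024 * 5):
        # read the file in 5 MiB chunks and feed each chunk to the hash
        h = qxh.quickxorhash()
        with open(filename, 'rb') as datafile:
            for chunk in iter(lambda: datafile.read(chunksize), b''):
                h.update(chunk)
        return base64.b64encode(h.digest()).decode('utf8')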

Let me know if I should update the merge request or create a new one.

Anyway, thanks for the library. I transferred all my photos (~10 GB) from Amazon Photos to OneDrive with it.

@incognito1234
Owner

I added the patch to the latest release.
Thanks
