Hash images and save resources #566
Maybe I should mention why I chose some of my words / clarify a bit:
Regarding flags, as of now, I already thought of these:
Maybe there could also be a "TTL" defined by either the Oxipng version or the entry's age. Maybe the user doesn't want to keep hashes older than two years, or just wants to refresh the results after a given time, since the algorithms may have improved. Everything will depend on the implementation, though; I'm just sharing my thoughts. That said, I have little knowledge of how PNGs work behind the scenes, so I don't know if the technical aspects actually hold up. I'm fairly certain it would work if we hash the entire files; everything else assumes PNGs work the way I think they do.
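To make the TTL idea concrete, here's a rough sketch of what a staleness check could look like (all names here are made up for illustration, not actual oxipng internals): an entry would be discarded if it's older than the TTL or was produced by a different oxipng version.

```rust
use std::time::{Duration, SystemTime};

// Hypothetical cache entry: records when it was created and which
// oxipng version produced it (field names are invented for this sketch).
struct CacheEntry {
    created: SystemTime,
    oxipng_version: String,
}

// An entry is stale if it's older than the TTL or came from another version.
fn is_stale(entry: &CacheEntry, ttl: Duration, current_version: &str) -> bool {
    let too_old = entry
        .created
        .elapsed()
        .map(|age| age > ttl)
        .unwrap_or(true); // clock went backwards: treat as stale
    too_old || entry.oxipng_version != current_version
}
```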
Hi @Maingron. Just so I understand: the main distinction between what you're proposing and the discussion in #549 is that this would effectively handle duplicates of the same image, is that right? It seems extremely similar, so I would prefer if you could add your ideas to the existing issue and I can close this as a duplicate.
@andrews05 I don't think it's too similar. The best example I can think of at the moment: if we have a folder with 10,000 images and the process gets interrupted at image 9,500, we would start again at image 1, while #549 would resume at image 9,501. So basically we index everything for future reference, while #549 just doesn't want to touch anything twice. Another difference is that we would notice if anything about an image changed, while #549 wouldn't, since it has already worked on it and doesn't bother. #549 also only cares about the path, not the actual data.
I'm trying to understand this at a conceptual level rather than an implementation level. Can you describe at a high level what you're actually wanting to achieve? What is your use case? |
@andrews05 Well, let's assume I'm some non-technical user and make up a story: It would be cool if Oxipng worked much faster. Oftentimes I have the same image within multiple projects I manage, and it really hinders my workflow when I have to wait for the same image to be processed in multiple directories. It should be possible for Oxipng to remember that it already processed this image, just in another directory, and automatically copy it. Also, once a week I get some data to manage, almost like a backup, which I always run through Oxipng before doing anything else with it. Usually the data doesn't change much, so it would go much faster if Oxipng remembered the work from last week. I can't just manually copy over the processed images from last week, however, because some of the data may actually have changed. Also, there are just way too many sub-directories.
Right, thanks for the user story! Despite serving slightly different use cases, I still think they're conceptually related and it would be great if we could collect all these ideas together in the same topic. |
Closing in favour of #549 |
What about hashing images before/after compressing and comparing them against a local table of already-compressed images, maybe even just from the same run? This should be a cheap way to occasionally save resources. It would also mean that if we compressed the same image with a heavier algorithm beforehand, we could in the future get the same compression when running a lighter algorithm (just by copying and verifying the older compressed image).
Said table could be stored in the OS's temp folder. If we get a match, we should also verify it's actually the same image and not just a hash collision.
In my opinion, oxipng should also add a flag to disable the cache.
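To illustrate the idea, here's a minimal in-memory sketch of such a table (all names are invented for this example; a real version would persist the table to the temp folder and use a proper content hash). On a hash hit, it compares the full original bytes before reusing the cached output, so a collision can never hand back the wrong image:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hash the raw file contents (DefaultHasher is just for illustration).
fn content_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// Hypothetical cache: hash of original bytes -> (original bytes, optimized bytes).
struct OptCache {
    entries: HashMap<u64, (Vec<u8>, Vec<u8>)>,
}

impl OptCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    // Returns the cached optimized output only if the exact same input
    // was seen before; a bare hash match is not enough (collision guard).
    fn lookup(&self, input: &[u8]) -> Option<&[u8]> {
        self.entries.get(&content_hash(input)).and_then(|(orig, opt)| {
            if orig.as_slice() == input {
                Some(opt.as_slice())
            } else {
                None // same hash, different bytes: treat as a miss
            }
        })
    }

    fn insert(&mut self, input: &[u8], optimized: Vec<u8>) {
        self.entries
            .insert(content_hash(input), (input.to_vec(), optimized));
    }
}
```

With something like this, re-running over an unchanged directory becomes a hash-plus-compare per file instead of a full re-optimization.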
I'm partially referring to #549.
While here already: Thank you for this great tool!