Feature - Expose ZLib crc32() function #7213
Comments
Hello! There is only one reason: it could be implemented in user-land without any pain: https://www.npmjs.org/search?q=crc32 . And you get frequent updates and choice for free! :) Cheers, P.S. Generally this is our policy for most proposed features: if they can be implemented in user-land, avoid putting them in core. |
I need to calculate CRCs on very large files. I have not tested yet, but I would guess that implementing CRC32 in JavaScript would be significantly slower than native C code. I will set up some tests, but I have not yet found any userland CRC modules designed to handle streams, so I have a bit of hacking to do. |
After a partial rewrite of the crc32 package, I was able to do some good testing with file streams. I tested with a large file of approximately 750MB, and the crc32 algorithm in JavaScript took about 10 seconds on my laptop. The rhash crc32 calculation took about 2.5 seconds. This performance delta is completely expected. The big question is: why not expose the zlib.crc32() function, so that file checksums can be computed natively in a high-performance and cross-platform manner? |
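For reference, a streaming CRC-32 in userland JavaScript can be sketched roughly as below. This is not the code from the rewritten crc32 package: `crc32Update` and `crc32OfFile` are hypothetical names, the table-driven loop is the common textbook approach, and the file helper assumes a Node.js version where readable streams support `for await`.

```javascript
const fs = require('fs');

// Lookup table for the common CRC-32 variant (reflected polynomial 0xEDB88320),
// which is also the variant that zlib's crc32() computes.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
  }
  CRC_TABLE[n] = c >>> 0;
}

// Hypothetical helper: fold one chunk (Buffer or Uint8Array) into a running CRC.
// Pass 0 for the first chunk, then feed back the previous return value.
function crc32Update(crc, chunk) {
  let c = (crc ^ 0xFFFFFFFF) >>> 0;
  for (let i = 0; i < chunk.length; i++) {
    c = CRC_TABLE[(c ^ chunk[i]) & 0xFF] ^ (c >>> 8);
  }
  return (c ^ 0xFFFFFFFF) >>> 0;
}

// Hypothetical helper: checksum a file chunk by chunk, so the whole file
// never has to fit in memory.
async function crc32OfFile(path) {
  let crc = 0;
  for await (const chunk of fs.createReadStream(path)) {
    crc = crc32Update(crc, chunk);
  }
  return crc;
}
```

Feeding the data in several chunks gives the same result as one call over the whole buffer, which is what makes this usable on streams.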
Frankly, if zlib offers it, I see no reason to not expose a binding. |
@TooTallNate zlib does not have the fastest implementation of it. @JavaScriptDude could you please try benchmarking https://www.npmjs.org/package/sse4_crc32 or https://www.npmjs.org/package/crc32c ? |
These two packages are addon-containing, i.e.
You can't avoid putting zlib's crc32 implementation in the core. It's already there. The only remaining work is exposing it. If documenting and supporting and testing zlib's crc32 implementation seems tedious, make the exposed implementation unofficial and non-documented (such as |
Then I can only say that pull requests for this feature would be welcome :) |
Cool. I may take a stab at this. Not sure of the standard procedures for your issue tracking but should this ticket be re-opened so others can see it as an open item? |
Done! |
@JavaScriptDude any progress on this? If not, I could take a stab at this myself, since it's relevant to my interests. |
@seishun Go for it! I hope it's not rude for me to point out that the activity calendar on @JavaScriptDude's GitHub profile does not show any activity this March, which is a good reason to assume that the answer (about the progress) is negative (unless something was developed in secret and outside of GitHub). |
@seishun please proceed. I did start some initial design work offline, but I got stuck on the direction and on the learning curve of C, and was then distracted by other higher-priority initiatives. Below are some of my thoughts on this. The challenge is to design it so that it is both fast and useful. To be useful, it should allow generating a CRC from a stream on the JavaScript side. To do this, we would need to use zlib's crc32.c::crc32() and ideally store the running checksum on the C++ side rather than marshaling it back and forth from the JavaScript side. The JavaScript (userland) side would only push buffers of data from a stream through to crc32.c::crc32(). There are also use cases where I would like it to act on a file directly, and it would be interesting to see how much faster a simple wrapper for minizip.c::getFileCrc() would be. That should be the fastest way to go, and it could be a good litmus test for how much extra compute time is taken by the JavaScript and marshaling side of Node.js, exposing opportunities for optimization in the stream-based approach. |
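To make the API shape described above a bit more concrete, here is a rough sketch of a writable stream that keeps the running checksum inside the object, so the caller only pushes buffers into it. Everything here is an assumption for illustration: `Crc32Stream` and `crc32Kernel` are hypothetical names, and the kernel is a slow pure-JavaScript stand-in for the place where a native binding to zlib's crc32() would be called.

```javascript
const { Writable } = require('stream');

// Stand-in for the native call (assumption: a real binding would do this in C).
// Takes and returns the raw internal CRC register, not the final checksum.
function crc32Kernel(state, chunk) {
  let c = state;
  for (let i = 0; i < chunk.length; i++) {
    c ^= chunk[i];
    for (let k = 0; k < 8; k++) {
      c = (c & 1) ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
    }
  }
  return c;
}

// Hypothetical stream: the running state lives in the object and is only
// converted to a checksum when digest() is called.
class Crc32Stream extends Writable {
  constructor(options) {
    super(options);
    this._state = 0xFFFFFFFF; // initial CRC-32 register value
  }

  // Fold one buffer into the running state; usable without the stream API too.
  update(chunk) {
    this._state = crc32Kernel(this._state, chunk);
    return this;
  }

  _write(chunk, encoding, callback) {
    this.update(chunk);
    callback();
  }

  // Final checksum as an unsigned 32-bit integer.
  digest() {
    return (this._state ^ 0xFFFFFFFF) >>> 0;
  }
}
```

A file could then be checksummed with something like `fs.createReadStream(path).pipe(new Crc32Stream()).on('finish', function () { console.log(this.digest().toString(16)); })`. In a native version only the body of `update` would change, which is the part where the marshaling cost mentioned above would show up.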
@seishun Was just following your thread and commits. Thanks for your work on this :) |
For the record, I have added a new issue to open a discussion: Incorporate suite of hash utilities into node #2444 |
I need to do simple cross-platform file validation using Node.js. I am presently building a backup file distribution and synchronization system using Node.js and need a solid and fast way of generating a checksum of the files on the sender and receiver sides.
The files are on servers I own so I don't need crypto hashes for my purpose.
I know that zlib has a crc32 function under the hood and it would be great to have this available in JavaScript.
I know that others have written crc32 implementations in JavaScript for Node but IMO, that is not the correct approach.
Any reason why this should not be implemented?