-
Notifications
You must be signed in to change notification settings - Fork 1
Rabin fingerprinting command line tool. Useful for compression, storing minimal information for multiple revisions of a file, recognizing viruses in streamed and file content, and much more. See the wiki!
License
wurp/rabin-tool
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
$Id$ Sliding Window Based Rabin Fingerprint Computation Library o What's it good for? Rabin coding generates a semi-unique fingerprint for each character in a data stream, based on the content of the N characters before it. This means the fingerprint is stable for the same content, regardless of where that content occurs. You can also break a file into chunks based on the modulo of the fingerprint, so the chunk boundaries are the same (except around the changed parts) even if you change content in the middle of file. By comparing chunk hashes you can send or store updates to a large file without transmitting the whole file. What's more, if the file has repeated chunks, you only need store the content once, along with a list of fingerprints. o Function Given a stream of input, compute the rabin-fingerprint of a N-byte-long sliding window, reading the bytes of the stream one by one o Usage - rabinpoly library provides a very simple interface that 1) defines the sliding window size, 2) reads bytes one by one and returns the rabin-fingerprint of the current sliding window, and 3) resets the sliding window. - Look at the example in the example/ directory. The example code reader.cc takes a file and reports the rabin-fingerprints of overlapping 64Byte-long sliding windows reading the bytes one by one. To compile the example program, read README under the directory. - You can find the detail of Rabin Fingerprint in "Rabin, M. O. Fingerprinting by Random Polynomials. Tech. Rep. TR-15-81. Center for Research in Computing Technology. Harvard University, 1981." o History - Taken from David Mazieres's LBFS implementation and modified as a stand-alone version by Niraj Tolia et al. (For Content Addressable Storage) - Modified to support a configurable sliding window size and packaged by Hyang-Ah Kim - 2005.12.2 version 1.0a Added an example file which was missing in the initial distribution Example of file compression using rabin: make cd src ./mkrabin.sh ./rabin -c "some file" > "some file.rabin" ls -l "some file"* rabin -x -o "some file2" "some file.rabin" diff "some file" "some file2"
About
Rabin fingerprinting command line tool. Useful for compression, storing minimal information for multiple revisions of a file, recognizing viruses in streamed and file content, and much more. See the wiki!
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published