Skip to content

Rabin fingerprinting command line tool. Useful for compression, storing minimal information for multiple revisions of a file, recognizing viruses in streamed and file content, and much more. See the wiki!

License

Notifications You must be signed in to change notification settings

wurp/rabin-tool

Repository files navigation

$Id$

Sliding Window Based Rabin Fingerprint Computation Library

o What's it good for?
  Rabin coding generates a semi-unique fingerprint for each character in a
  data stream, based on the content of the N characters before it.  This
  means the fingerprint is stable for the same content, regardless of where
  that content occurs.
  
  You can also break a file into chunks based on the modulo of the
  fingerprint, so the chunk boundaries are the same (except around the
  changed parts) even if you change content in the middle of file. By
  comparing chunk hashes you can send or store updates to a large file
  without transmitting the whole file. What's more, if the file has
  repeated chunks, you only need store the content once, along with
  a list of fingerprints.
  
o Function
  Given a stream of input, compute the rabin-fingerprint of a N-byte-long 
  sliding window, reading the bytes of the stream one by one

o Usage
  - rabinpoly library provides a very simple interface that 
	1) defines the sliding window size, 
	2) reads bytes one by one and returns the rabin-fingerprint of 
	   the current sliding window, and 
	3) resets the sliding window.

  - Look at the example in the example/ directory. The example code 
    reader.cc takes a file and reports the rabin-fingerprints of overlapping 
    64Byte-long sliding windows reading the bytes one by one. To compile the
    example program, read README under the directory.

  - You can find the detail of Rabin Fingerprint in
   "Rabin, M. O. Fingerprinting by Random Polynomials. Tech. Rep. TR-15-81. 
    Center for Research in Computing Technology. Harvard University, 1981."

o History
  - Taken from David Mazieres's LBFS implementation and modified as a 
    stand-alone version by Niraj Tolia et al. (For Content Addressable Storage) 
  - Modified to support a configurable sliding window size and packaged by 
    Hyang-Ah Kim
  - 2005.12.2 version 1.0a 
    Added an example file which was missing in the initial distribution


Example of file compression using rabin:
make
cd src
./mkrabin.sh
./rabin -c "some file" >  "some file.rabin"
ls -l "some file"*
rabin -x -o "some file2" "some file.rabin"
diff "some file" "some file2"

About

Rabin fingerprinting command line tool. Useful for compression, storing minimal information for multiple revisions of a file, recognizing viruses in streamed and file content, and much more. See the wiki!

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published