This project collects Java implementations of the most prominent consistent hashing algorithms for non-peer-to-peer contexts.
The implemented algorithms are:
- [1997] ring hash by D. Karger et al.
- [1998] rendezvous hash by Thaler and Ravishankar
- [2014] jump hash by Lamping and Veach
- [2015] multi-probe hash by Appleton and O’Reilly
- [2016] maglev hash by D. E. Eisenbud et al.
- [2020] anchor hash by Gal Mendelson et al.
- [2021] dx hash by Chaos Dong and Fang Wang
- [2023] memento hash by M. Coluzzi et al.
- [2024] jumpback hash by Otmar Ertl
- [2024] binomial hash by M. Coluzzi et al.
Each implementation is divided into two classes:
- Each `Engine` class (e.g., `AnchorEngine`) contains an accurate implementation of the algorithm as described in the related paper. These classes do not perform any consistency checks, so as to keep the performance as close as possible to what was claimed in the related papers.
- Each `Hash` class (e.g., `AnchorHash`) is a wrapper around the related `Engine` class that allows every implementation to expose the same interface. These classes also perform all the consistency checks needed to guarantee safe execution. The sketch below illustrates this split.
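As a rough illustration of the split (the classes below are hypothetical toys, not code from this repository), the engine exposes the raw algorithm while the wrapper validates inputs before delegating:

```java
/** Toy engine: raw algorithm only, no input validation (illustration, not the real code). */
class ToyEngine {
    private final int nodes;

    ToyEngine(int nodes) { this.nodes = nodes; }

    /** Maps a hash value to a bucket without any safety checks. */
    int getBucket(long hash) { return (int) Math.floorMod(hash, (long) nodes); }
}

/** Toy wrapper: uniform interface plus the consistency checks the engine skips. */
class ToyHash {
    private final ToyEngine engine;

    ToyHash(int nodes) {
        if (nodes <= 0) throw new IllegalArgumentException("nodes must be positive");
        this.engine = new ToyEngine(nodes);
    }

    int getBucket(String key) {
        if (key == null) throw new NullPointerException("key must not be null");
        return engine.getBucket(key.hashCode());
    }
}
```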
The project includes a benchmarking tool designed explicitly for consistent hashing algorithms. The tool allows benchmarking the following metrics in a fair, algorithm-agnostic way:
- Memory usage: the amount of memory the algorithm uses to store its internal structure.
- Init time: the time the algorithm requires to initialize its internal structure.
- Resize time: the time the algorithm requires to reorganize its internal structure after adding or removing nodes.
- Lookup time: the time the algorithm needs to find the node a given key belongs to.
- Balance: the ability of the algorithm to spread the keys evenly across the cluster nodes (see the sketch after this list).
- Resize balance: the ability of the algorithm to keep its balance after adding or removing nodes.
- Monotonicity: the ability of the algorithm to move the minimum amount of resources when the cluster scales.
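For instance, balance is commonly summarized by comparing the load of the most loaded node with the average load. The stand-alone snippet below is only an illustration of the idea, not the tool's actual measurement code (here a plain random mapping stands in for a consistent hashing lookup):

```java
import java.util.Random;

/** Illustrative only: measures how evenly random keys spread over a fixed set of nodes. */
public class BalanceSketch {
    public static void main(String[] args) {
        int nodes = 10;
        int keys = 1_000_000;
        int[] load = new int[nodes];
        Random random = new Random(42);

        for (int i = 0; i < keys; i++) {
            // Stand-in for a consistent hashing lookup: map a pseudo-random key hash to a node.
            long hash = random.nextLong();
            load[(int) Math.floorMod(hash, (long) nodes)]++;
        }

        int max = 0;
        for (int l : load) max = Math.max(max, l);
        double average = (double) keys / nodes;

        // A perfectly balanced algorithm yields a ratio close to 1.0.
        System.out.printf("max/average load ratio: %.3f%n", max / average);
    }
}
```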
You can build the tool using Apache Maven. The build will generate a jar file called `consistent-hashing-algorithms-1.0.0-jar-with-dependencies.jar`.
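Assuming the standard Maven lifecycle (the exact goals may vary with the project's build configuration), the jar is typically produced with:
$ mvn clean package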
You can then run the jar file, providing a configuration file to customize your execution.
The format of the configuration file is described in detail in the `src/main/resources/configs/template.yaml` file.
If no configuration file is provided, the tool will use the `src/main/resources/configs/default.yaml` file, which contains the default configuration.
If the configuration file is not correctly set up, the tool warns the user and tries to continue the execution, running only the correctly configured benchmarks. If proceeding is not possible, the tool returns an error.
Refer to the `template.yaml` file for a complete explanation of the available configuration options.
Once the configuration file is ready, you can run the benchmarks with the following command:
$ java -jar consistent-hashing-algorithms-1.0.0-jar-with-dependencies.jar <your-config>.yaml
You can add your own consistent hashing algorithm by opening a merge request.
The class implementing your algorithm should be called `YourAlgorithmEngine`: all the classes suffixed with `Engine` implement the consistent hashing algorithms as described in the related papers.
To compare your algorithm against the available ones using the benchmark tool, you must implement three more classes (a rough sketch of how they fit together follows this list). Namely:
- `YourAlgorithmHash`: must implement the `ConsistentHash` interface and perform all the consistency checks (which can be omitted in `YourAlgorithmEngine`).
- `YourAlgorithmEnginePilot`: must implement the `ConsistentHashEnginePilot` interface and perform the operations of adding a node, removing a node, and looking up a key by invoking the related methods of the `YourAlgorithmEngine` class in the most efficient way.
- `YourAlgorithmFactory`: must implement the `ConsistentHashFactory` interface and provide a convenient way to instantiate the algorithm and the other utility classes.
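The following self-contained sketch shows how these classes might fit together. The interfaces and method names below are simplified stand-ins chosen for illustration; they are not the project's actual `ConsistentHashEnginePilot` and `ConsistentHashFactory` contracts, so consult the interfaces in the sources for the real signatures.

```java
// Simplified stand-in interfaces: illustration only, not the project's real contracts.
interface EnginePilot {
    int addNode();
    void removeNode(int node);
    int lookup(long keyHash);
}

interface Factory {
    EnginePilot createEnginePilot(int initialNodes);
}

/** Minimal placeholder engine so the sketch compiles; the real one implements your algorithm. */
class YourAlgorithmEngine {
    private int size;

    YourAlgorithmEngine(int size) { this.size = size; }

    int addBucket()               { return size++; }
    void removeBucket(int bucket) { size--; }
    int getBucket(long hash)      { return (int) Math.floorMod(hash, (long) size); }
}

/** Thin adapter invoking the engine as directly as possible, with no extra checks. */
class YourAlgorithmEnginePilot implements EnginePilot {
    private final YourAlgorithmEngine engine;

    YourAlgorithmEnginePilot(YourAlgorithmEngine engine) { this.engine = engine; }

    @Override public int addNode()             { return engine.addBucket(); }
    @Override public void removeNode(int node) { engine.removeBucket(node); }
    @Override public int lookup(long keyHash)  { return engine.getBucket(keyHash); }
}

/** Convenient entry point for the benchmark tool to instantiate the algorithm. */
class YourAlgorithmFactory implements Factory {
    @Override public EnginePilot createEnginePilot(int initialNodes) {
        return new YourAlgorithmEnginePilot(new YourAlgorithmEngine(initialNodes));
    }
}
```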