Colibri is a simple HTTP service built in Rust that implements an in-memory data structure for rate-limiting services. Rate counts are stored in memory so that Colibri can respond quickly to incoming requests.
Note: Restarting a Colibri node will reset all of its rate-limit counts.
Colibri implements the Token Bucket algorithm for rate-limiting clients. Currently, none of Colibri's data structures are persisted.
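As a rough illustration of the Token Bucket idea, here is a minimal sketch. It is not Colibri's actual implementation: the `TokenBucket` type, its fields, and the continuous-refill policy are all assumptions made for this example. The two constructor arguments mirror the `--rate-limit-max-calls-allowed` and `--rate-limit-interval-seconds` options.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket (names are illustrative, not taken from Colibri).
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,       // maximum number of tokens the bucket can hold
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // tokens added per second of elapsed time
    last_refill: Instant,
}

impl TokenBucket {
    /// Create a full bucket that allows `max_calls` per `interval`.
    pub fn new(max_calls: u32, interval: Duration, now: Instant) -> Self {
        let capacity = f64::from(max_calls);
        TokenBucket {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / interval.as_secs_f64(),
            last_refill: now,
        }
    }

    /// Try to spend one token at time `now`.
    /// Returns `Some(calls_remaining)` if the call is allowed,
    /// or `None` if the client should be rate-limited (HTTP 429).
    pub fn try_acquire(&mut self, now: Instant) -> Option<u32> {
        // Assumption: tokens refill continuously, capped at capacity.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Some(self.tokens.floor() as u32)
        } else {
            None
        }
    }
}
```

Passing `now` in explicitly keeps the sketch easy to test without sleeping. A real service would more likely read the clock inside the method.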
Colibri can be run in single-node mode or in multi-node mode. In single-node mode each Colibri node keeps track of rate limits individually, without using any distributed properties. This strategy could work behind a round-robin load balancer that distributes traffic fairly, but because every node counts independently, a client's effective limit quickly becomes hard to reason about once its requests are interleaved across nodes.
In multi-node mode Colibri functions as a distributed hash table, assigning responsibility for distinct client IDs to individual nodes using consistent hashing. This is experimental; for instance, it's not currently designed to work around network partitions or dynamic cluster resizing.
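To make the consistent-hashing idea concrete, here is a minimal hash-ring sketch. It is an assumption-laden illustration rather than Colibri's code: the `HashRing` type, the use of virtual replicas, and the choice of `DefaultHasher` are all hypothetical. Node IDs stand in for positions in the ordered `--topology` list.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any hashable value to a position on the ring.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical consistent-hash ring mapping ring positions to node IDs.
pub struct HashRing {
    ring: BTreeMap<u64, usize>,
}

impl HashRing {
    /// Place `replicas` virtual points on the ring for each of `node_count` nodes.
    pub fn new(node_count: usize, replicas: usize) -> Self {
        let mut ring = BTreeMap::new();
        for node_id in 0..node_count {
            for replica in 0..replicas {
                ring.insert(hash_of(&(node_id, replica)), node_id);
            }
        }
        HashRing { ring }
    }

    /// Find the node responsible for `client_id`: the first ring point at or
    /// after the client's hash, wrapping around to the start of the ring.
    /// Returns `None` only when the ring is empty.
    pub fn owner(&self, client_id: &str) -> Option<usize> {
        let key = hash_of(&client_id);
        self.ring
            .range(key..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node_id)| *node_id)
    }
}
```

With a scheme like this, every node that builds the ring from the same node list agrees on which node owns a given client ID, so a request arriving at any node can be forwarded to the owner.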
After cloning this repo, you can launch Colibri locally using cargo:
❯ cargo run -- --rate-limit-max-calls-allowed=2 --rate-limit-interval-seconds=10
Compiling colibri v0.2.0 (/Users/erewok/open_source/colibri)
Finished dev [unoptimized + debuginfo] target(s) in 3.61s
Running `target/debug/colibri --rate-limit-max-calls-allowed=2 --rate-limit-interval-seconds=10`
2023-03-22T15:24:12.893173Z INFO colibri: Starting Cache Expiry background task
2023-03-22T15:24:12.893274Z INFO colibri: Starting Colibri on 0.0.0.0:8000
Then, in another terminal, you can send requests and see them rate-limited:
❯ curl -XPOST -i http://localhost:8000/rl/some-client-identifier
HTTP/1.1 200 OK
content-type: application/json
content-length: 58
date: Wed, 22 Mar 2023 15:26:17 GMT
{"client_id":"some-client-identifier","calls_remaining":1}
❯ curl -XPOST -i http://localhost:8000/rl/some-client-identifier
HTTP/1.1 200 OK
content-type: application/json
content-length: 58
date: Wed, 22 Mar 2023 15:26:20 GMT
{"client_id":"some-client-identifier","calls_remaining":0}
❯ curl -XPOST -i http://localhost:8000/rl/some-client-identifier
HTTP/1.1 429 Too Many Requests
content-length: 0
date: Wed, 22 Mar 2023 15:26:45 GMT
You can run Colibri as a single node in the following way:
$ colibri
2023-03-24T23:22:26.690566Z INFO colibri: Starting Cache Expiry background task
2023-03-24T23:22:26.690666Z INFO colibri: Starting Colibri on 0.0.0.0:8000
You can run a few Colibri nodes as part of a cluster by giving each node its own port and node ID, along with the same ordered topology list:
$ cargo run -- --listen-port 8000 --node-id 0 --topology http://localhost:8000 --topology http://localhost:8001 --topology http://localhost:8002 --topology http://localhost:8003
$ cargo run -- --listen-port 8001 --node-id 1 --topology http://localhost:8000 --topology http://localhost:8001 --topology http://localhost:8002 --topology http://localhost:8003
$ cargo run -- --listen-port 8002 --node-id 2 --topology http://localhost:8000 --topology http://localhost:8001 --topology http://localhost:8002 --topology http://localhost:8003
In another terminal, if you make some requests, you can see communication going across the cluster:
❯ curl -i -XPOST http://localhost:8000/rl/a
HTTP/1.1 200 OK
content-type: application/json
content-length: 37
date: Fri, 24 Mar 2023 23:42:19 GMT
{"client_id":"a","calls_remaining":1}
❯ curl -i -XPOST http://localhost:8000/rl/a
HTTP/1.1 200 OK
content-type: application/json
content-length: 37
date: Fri, 24 Mar 2023 23:42:21 GMT
{"client_id":"a","calls_remaining":0}
❯ curl -i -XPOST http://localhost:8000/rl/a
HTTP/1.1 429 Too Many Requests
content-length: 0
date: Fri, 24 Mar 2023 23:42:22 GMT
Back in the terminal of the node we sent the requests to, we can see it requesting data from another node:
2023-03-24T23:42:09.859385Z INFO rate_limit{client_id="a"}: colibri::node: Requesting data from bucket 1
The following configuration options are available for running Colibri:
Usage: colibri [OPTIONS]
Options:
--listen-address <LISTEN_ADDRESS>
IP Address to listen on [env: LISTEN_ADDRESS=] [default: 0.0.0.0]
--listen-port <LISTEN_PORT>
Port to bind Colibri server to [env: LISTEN_PORT=] [default: 8000]
--rate-limit-max-calls-allowed <RATE_LIMIT_MAX_CALLS_ALLOWED>
Max calls allowed per interval [env: RATE_LIMIT_MAX_CALLS_ALLOWED=] [default: 1000]
--rate-limit-interval-seconds <RATE_LIMIT_INTERVAL_SECONDS>
Interval in seconds to check limit [env: RATE_LIMIT_INTERVAL_SECONDS=] [default: 60]
--topology <TOPOLOGY>
In cluster mode, pass other node addresses: order matters! [env: TOPOLOGY=] [default: ]
--hostname <HOSTNAME>
An identifier for this node [env: HOSTNAME=] [default: ]
--node-id <NODE_ID>
An identifier for this node [env: HOSTNAME=] [default: 0]
-h, --help
Print help