feat: add per-organization rate limits #3231
Conversation
Force-pushed from 430b9bc to d0508fb
Looks OK, but I might just call the cache package "redis", considering we don't use it as a cache now, and we'll probably use it for things other than a cache in the future.
Looks good!
Force-pushed from 50043fc to 2a446c3
Force-pushed from 4811d04 to c88fd87
internal/server/redis/redis.go (outdated)

	redisOptions.Username = options.Username
	redisOptions.Password = options.Password

	client = redis.NewClient(redisOptions)
Do you need to use the cluster client here? I'm assuming we would want some kind of redundancy.
It depends on how the Redis cluster is configured. Managed services like GCP's MemoryStore and AWS's ElastiCache can provide redundancy without resorting to Redis Cluster. Other types of redundancy will require more complex configurations, such as Sentinel or a cluster setup.
For the time being, we can get away with implementing just the basic configuration.
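For context, a minimal sketch of the three client topologies being discussed, assuming the go-redis v8 client; the function names, addresses, and credentials are placeholders rather than anything in this PR:

```go
package redis

import "github.com/go-redis/redis/v8"

// newBasicClient talks to a single endpoint. Managed services such as
// MemoryStore or ElastiCache can hide replication and failover behind this
// one address, so no cluster awareness is needed on the client side.
func newBasicClient(addr, username, password string) *redis.Client {
	return redis.NewClient(&redis.Options{
		Addr:     addr,
		Username: username,
		Password: password,
	})
}

// newSentinelClient discovers the current master through Redis Sentinel and
// fails over automatically when the master changes.
func newSentinelClient(masterName string, sentinelAddrs []string) *redis.Client {
	return redis.NewFailoverClient(&redis.FailoverOptions{
		MasterName:    masterName,
		SentinelAddrs: sentinelAddrs,
	})
}

// newClusterClient routes commands across the shards of a Redis Cluster.
func newClusterClient(addrs []string) *redis.ClusterClient {
	return redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: addrs,
	})
}
```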
Force-pushed from 72280d3 to f0ba365
Looks great! A few questions, and a few minor suggestions, but nothing blocking.
It'd be great for us to dig into github.com/go-redis/redis_rate/v9 a bit more as well, and make sure we understand all the implementation trade-offs made in that library, since it seems we are relying on it quite a bit.
I guess our existing API metrics should give us some indication of the number of requests hitting our rate limits, right? It might be interesting to add more fine-grained metrics about exactly which of the rate limits is rejecting the request, but that seems like a follow-up and not part of this (if we want it at all).
The limits we have set seem reasonable. We might consider increasing the org limit to 5000 or 6000 to be extra safe.
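For reference, a minimal sketch of how redis_rate/v9 is typically driven (it implements a GCRA / leaky-bucket style limiter); the key and limit values here are illustrative placeholders, not what this PR ships:

```go
package main

import (
	"context"
	"fmt"

	"github.com/go-redis/redis/v8"
	"github.com/go-redis/redis_rate/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	limiter := redis_rate.NewLimiter(rdb)

	// Allow up to 3000 requests per minute for this (illustrative) org key.
	res, err := limiter.Allow(ctx, "rate:org:example-org-id", redis_rate.PerMinute(3000))
	if err != nil {
		panic(err)
	}
	if res.Allowed == 0 {
		// Over the limit: RetryAfter says how long until a request would pass.
		fmt.Println("rate limited, retry after", res.RetryAfter)
		return
	}
	fmt.Println("allowed, remaining this window:", res.Remaining)
}
```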
"testing" | ||
"time" | ||
|
||
"github.com/alicebob/miniredis/v2" |
What do you see as the advantage of this in-process pseudo-redis compared to running a real instance of Redis in a separate process or in a container (like we do with Postgres)?
I imagine this is slightly faster, but we risk the implementation being a bit different from what we use in production? It also seems like we might avoid a few extra dependencies pulled in by miniredis if we went with the out-of-process real Redis.
Not a blocker, we can always change it later if we run into problems. Mostly I'm interested in the rationale for this approach.
The main advantage is speed and not needing to start a Redis instance in order to run tests. This makes writing tests easy and fast.
I haven't used miniredis previously, but @pdevine has and he seems happy with it. I've also used Python's redislite for unit tests and it's a similar idea. I'm not that concerned with compatibility since the Redis API itself is relatively simple and we're not using any advanced features (yet). We can move to an external Redis if it ever becomes a problem.
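For anyone who hasn't used it, a miniredis-backed test looks roughly like this (a sketch, not this PR's actual test code; the key and value are arbitrary):

```go
package redis_test

import (
	"context"
	"testing"

	"github.com/alicebob/miniredis/v2"
	"github.com/go-redis/redis/v8"
	"gotest.tools/v3/assert"
)

func TestWithMiniredis(t *testing.T) {
	// Start an in-process Redis-compatible server for the duration of the test.
	srv, err := miniredis.Run()
	assert.NilError(t, err)
	defer srv.Close()

	client := redis.NewClient(&redis.Options{Addr: srv.Addr()})

	ctx := context.Background()
	assert.NilError(t, client.Set(ctx, "greeting", "hello", 0).Err())

	val, err := client.Get(ctx, "greeting").Result()
	assert.NilError(t, err)
	assert.Equal(t, val, "hello")
}
```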
"gotest.tools/v3/assert" | ||
) | ||
|
||
func TestRedis(t *testing.T) { |
Awesome test coverage!
internal/server/redis/limit.go (outdated)

func LoginBad(r *Redis, key string, limit int) {
	if r != nil && r.client != nil {
		ctx := context.TODO()
		rate := r.client.Incr(ctx, loginKey(key)).Val()
We are silently ignoring errors here, and in r.client.Pipelined above, and in client.SetArgs below. The return value is a StatusCmd, so the linter doesn't notice that the error is being ignored.
I suspect we should at least log those errors.
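A sketch of what surfacing the error could look like; the standard library logger is a stand-in for whatever logging package the server already uses, and the key is passed in directly instead of going through loginKey:

```go
package redis

import (
	"context"
	"log"

	"github.com/go-redis/redis/v8"
)

// incrCounter increments a counter and logs, rather than silently drops, any
// Redis error before returning the (possibly zero) value.
func incrCounter(ctx context.Context, client *redis.Client, key string) int64 {
	cmd := client.Incr(ctx, key)
	if err := cmd.Err(); err != nil {
		log.Printf("increment rate limit counter %q: %v", key, err)
	}
	return cmd.Val()
}
```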
internal/server/routes.go (outdated)

	// TODO: limit should be a per-organization setting
	if err := redis.RateOK(a.server.redis, org.ID.String(), 3000); err != nil {
Is this TODO still relevant?
Yes, it's still relevant. The rate limit may eventually be different per org, so this will need to load the value from settings.
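If it helps to picture it, the call site might eventually look roughly like this; getOrgRateLimit is a hypothetical settings lookup that does not exist in this PR, and 3000 mirrors the current hard-coded default:

```go
// Hypothetical sketch only: getOrgRateLimit would read a per-organization
// override from settings; absent one, fall back to the current default.
limit := 3000
if custom, ok := getOrgRateLimit(org.ID); ok {
	limit = custom
}
if err := redis.RateOK(a.server.redis, org.ID.String(), limit); err != nil {
	return err
}
```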
Force-pushed from f0ba365 to 43c6683
Force-pushed from 43c6683 to 7d729da
lgtm. One thing I think is common to run into is the need to scale the limits by org size, though we can tackle that later.
Force-pushed from 7d729da to 3f1e7ce
Force-pushed from 3f1e7ce to 9279278
Yes, definitely. That's the intent behind the TODO for per-organization limits.
Summary
Add limits on how many requests can be sent to the Infra server on a per-organization, per-minute basis. The limits are only enabled if Redis is configured.