feat: add per-organization rate limits #3231

Merged
merged 13 commits from mxyng/rate-limit into main on Sep 26, 2022

Conversation

mxyng
Collaborator

@mxyng mxyng commented Sep 16, 2022

Summary

Add limits on how many requests can be sent to the Infra server, on a per-organization, per-minute basis. The limits are only enabled if Redis is configured.
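
For orientation, here is a minimal sketch of the overall shape this takes, assuming github.com/go-redis/redis/v8 and github.com/go-redis/redis_rate/v9 (the library discussed later in this thread). The function name and the 3000 default mirror values that appear later in the thread, but the code is illustrative, not the PR's implementation:

// Sketch only: per-organization, per-minute limit backed by Redis.
package main

import (
	"context"
	"fmt"

	"github.com/go-redis/redis/v8"
	"github.com/go-redis/redis_rate/v9"
)

// orgRateOK allows up to `limit` requests per organization per minute.
// When Redis is not configured (limiter is nil), it allows everything.
func orgRateOK(ctx context.Context, limiter *redis_rate.Limiter, orgID string, limit int) error {
	if limiter == nil {
		return nil // rate limiting disabled without Redis
	}
	res, err := limiter.Allow(ctx, "rate:org:"+orgID, redis_rate.PerMinute(limit))
	if err != nil {
		return err
	}
	if res.Allowed == 0 {
		return fmt.Errorf("rate limit exceeded, retry after %s", res.RetryAfter)
	}
	return nil
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	limiter := redis_rate.NewLimiter(rdb)
	if err := orgRateOK(context.Background(), limiter, "example-org", 3000); err != nil {
		fmt.Println("request rejected:", err)
	}
}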

@mxyng mxyng marked this pull request as draft September 16, 2022 20:50
@mxyng mxyng force-pushed the mxyng/rate-limit branch 3 times, most recently from 430b9bc to d0508fb on September 19, 2022 at 17:45
Contributor

@ssoroka ssoroka left a comment

Looks OK, but I might just call the cache package "redis", considering we don't use it as a cache now, and we'll probably use it for things other than a cache in the future.

internal/server/cache/cache.go (resolved)
Contributor

@dnephin dnephin left a comment

Looks good!

internal/server/cache/cache.go (3 resolved threads)
internal/cmd/server.go (resolved)
@mxyng mxyng force-pushed the mxyng/rate-limit branch 3 times, most recently from 50043fc to 2a446c3 on September 20, 2022 at 19:02
@mxyng mxyng marked this pull request as ready for review September 20, 2022 19:46
@mxyng mxyng force-pushed the mxyng/rate-limit branch 2 times, most recently from 4811d04 to c88fd87 on September 21, 2022 at 00:30
redisOptions.Username = options.Username
redisOptions.Password = options.Password

client = redis.NewClient(redisOptions)
Contributor

Do you need to use the cluster client here? I'm assuming we would want some kind of redundancy.

Collaborator Author

It depends on how the Redis cluster is configured. Managed services like GCP's Memorystore and AWS's ElastiCache can provide redundancy without resorting to Redis Cluster; other types of redundancy require more complex setups, such as Sentinel or a full cluster configuration.

For the time being, we can get away with implementing just the basic configuration.
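
For reference, a rough sketch of how those topologies map onto go-redis constructors (assuming github.com/go-redis/redis/v8; package name and addresses are placeholders):

package redisclient

import "github.com/go-redis/redis/v8"

// newRedisClient is the basic single-node setup this PR targets; managed
// services such as Memorystore or ElastiCache provide redundancy behind a
// single endpoint, so no special client is needed.
func newRedisClient(addr string) *redis.Client {
	return redis.NewClient(&redis.Options{Addr: addr})
}

// Sentinel-based failover would instead use:
//   redis.NewFailoverClient(&redis.FailoverOptions{
//       MasterName:    "mymaster",
//       SentinelAddrs: []string{"sentinel-1:26379", "sentinel-2:26379"},
//   })
//
// Redis Cluster would instead use:
//   redis.NewClusterClient(&redis.ClusterOptions{
//       Addrs: []string{"node-1:6379", "node-2:6379", "node-3:6379"},
//   })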

internal/server/routes.go (resolved)
@mxyng mxyng force-pushed the mxyng/rate-limit branch 9 times, most recently from 72280d3 to f0ba365 on September 23, 2022 at 00:39
Contributor

@dnephin dnephin left a comment

Looks great! A few questions, and a few minor suggestions, but nothing blocking.

It'd be great for us to dig into github.com/go-redis/redis_rate/v9 a bit more as well, and make sure we understand all the implementation trade-offs made in that library, since it seems we are relying on it quite a bit.

I guess our existing API metrics should give us some indication of the number of requests hitting our rate limits, right? It might be interesting to add some more fine-grained metrics about exactly which of the rate limits is rejecting the request, but that seems like it should be a follow-up and not part of this (if we want it).

The limits we have set seem reasonable. We might consider increasing the org limit to 5000 or 6000 to be extra safe.
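
If we do want the follow-up metric, a possible shape (metric name and label are hypothetical, using github.com/prometheus/client_golang/prometheus; not part of this PR):

package metrics

import "github.com/prometheus/client_golang/prometheus"

// rateLimitRejections counts requests rejected by a rate limit, labelled by
// which limit fired (e.g. "organization", "login").
var rateLimitRejections = prometheus.NewCounterVec(prometheus.CounterOpts{
	Name: "rate_limit_rejections_total",
	Help: "Requests rejected by a rate limit, by limit name.",
}, []string{"limit"})

func init() {
	prometheus.MustRegister(rateLimitRejections)
}

// Example call site when the per-organization limit rejects a request:
//   rateLimitRejections.WithLabelValues("organization").Inc()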

internal/errors.go (resolved)
internal/server/errors.go (resolved)
"testing"
"time"

"github.com/alicebob/miniredis/v2"
Contributor

What do you see as the advantage of this in-process pseudo-Redis compared to running a real instance of Redis in a separate process or in a container (like we do with Postgres)?

I imagine this is slightly faster, but we risk the implementation being a bit different from what we use in production. It also seems like we might avoid a few extra dependencies used by miniredis if we went with the out-of-process real Redis.

Not a blocker; we can always change it later if we run into problems. Mostly I'm interested in the rationale for this approach.

Collaborator Author

The main advantage is speed and not needing to start a Redis instance in order to run tests. This makes writing tests easy and fast.

I haven't used miniredis previously, but @pdevine has and he seems happy with it. I've also used Python's redislite for unit tests, and it's a similar idea. I'm not that concerned about compatibility since the Redis API itself is relatively simple and we're not using any advanced features (yet). We can move to an external Redis if it ever becomes a problem.
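
For anyone unfamiliar with the pattern, a minimal sketch of a miniredis-backed test (not the PR's actual test; names are illustrative):

package redis_test

import (
	"context"
	"testing"

	"github.com/alicebob/miniredis/v2"
	"github.com/go-redis/redis/v8"
	"gotest.tools/v3/assert"
)

func TestMiniredisSketch(t *testing.T) {
	// RunT starts an in-process Redis and stops it when the test finishes.
	srv := miniredis.RunT(t)

	client := redis.NewClient(&redis.Options{Addr: srv.Addr()})
	ctx := context.Background()

	assert.NilError(t, client.Set(ctx, "key", "value", 0).Err())

	got, err := client.Get(ctx, "key").Result()
	assert.NilError(t, err)
	assert.Equal(t, got, "value")
}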

"gotest.tools/v3/assert"
)

func TestRedis(t *testing.T) {
Contributor

Awesome test coverage!

internal/server/redis/limit.go (2 resolved threads)
func LoginBad(r *Redis, key string, limit int) {
	if r != nil && r.client != nil {
		ctx := context.TODO()
		rate := r.client.Incr(ctx, loginKey(key)).Val()
Contributor

We are silently ignoring errors here, and in r.client.Pipelined above, and in client.SetArgs below. The return value is a StatusCmd, so the linter doesn't notice that the error is being ignored.

I suspect we should at least log those errors.
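
A sketch of what that could look like for the Incr call above (the log call is a stand-in for whatever logger the server already uses; not the PR's code):

package sketch

import (
	"context"
	"log"

	"github.com/go-redis/redis/v8"
)

// incrWithLogging keeps the IntCmd so its error can be checked and logged
// instead of being silently dropped by calling .Val() directly.
func incrWithLogging(ctx context.Context, client *redis.Client, key string) int64 {
	cmd := client.Incr(ctx, key)
	if err := cmd.Err(); err != nil {
		log.Printf("incrementing %q: %v", key, err) // placeholder for the server's logger
	}
	return cmd.Val()
}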

internal/server/redis/limit_test.go (resolved)
internal/server/handlers.go (resolved)
Comment on lines 227 to 228
// TODO: limit should be a per-organization setting
if err := redis.RateOK(a.server.redis, org.ID.String(), 3000); err != nil {
Contributor

Is this TODO still relevant?

Collaborator Author

Yes, it's still relevant. The rate limit may eventually differ per organization, so this will need to load the value from settings.
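
Something like the following could eventually replace the hard-coded 3000 (the settings type and field here are hypothetical, not part of this PR):

package sketch

// OrganizationSettings is a hypothetical per-organization settings record.
type OrganizationSettings struct {
	RequestLimitPerMinute int // 0 means "use the default"
}

const defaultOrgRequestLimit = 3000

// orgRequestLimit returns the organization's configured limit, falling back
// to the default when none is set.
func orgRequestLimit(settings *OrganizationSettings) int {
	if settings != nil && settings.RequestLimitPerMinute > 0 {
		return settings.RequestLimitPerMinute
	}
	return defaultOrgRequestLimit
}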

Contributor

@ssoroka ssoroka left a comment

lgtm. One thing that's common to run into is the need to scale the limits by org size, though we can tackle that later.

@mxyng
Collaborator Author

mxyng commented Sep 26, 2022

> lgtm. One thing that's common to run into is the need to scale the limits by org size, though we can tackle that later.

Yes, definitely. That's the intent behind the TODO for per-organization limits.

@mxyng mxyng merged commit 866ec67 into main Sep 26, 2022
@mxyng mxyng deleted the mxyng/rate-limit branch September 26, 2022 21:22