RFC(rest): ✨ the abstraction of ratelimit data storage ✨ #8125

vladfrangu · 2022-06-19T18:25:49Z

Discussed in https://github.com/discordjs/discord.js/discussions/8124

^{Originally posted by didinele June 19, 2022}
Preface: this RFC (Request For Comments) is meant to gather feedback and opinions for the feature set and some implementation details for an abstraction of how @discordjs/rest stores ratelimit data.

The goal is to give the end-user a way to control how ratelimit data is stored, enabling things like using Redis to share ratelimits across REST instances (i.e. across different services/processes).

1. The public API

RESTOptions would take a new store?: IRestStore parameter, meaning you'd now be able to do new REST({ store: new YourCustomStore() });

Currently, I imagine stores as simple key-value storage with a few methods:

// name up for discussion
interface IRateLimitStore {
  has(key: string): Awaitable<boolean>;
  get(key: string): Awaitable<RateLimitState>;
  set(key: string, data: RateLimitState): Awaitable<void>;
  delete(key: string): Awaitable<void>;
}

where type Awaitable<T> = T | Promise<T>; and

interface RateLimitState {
  reset: number;
  remaining: number;
  limit: number;
  referenceCount: number;
}

2. Implementation details

First off, what do we use as the key in our little store? The same key we use to store our IHandlers. This will come in handy in a bit.

discord.js/packages/rest/src/lib/RequestManager.ts

Line 308 in 358c3f4

this.handlers.get(`${hash.value}:${routeId.majorParameter}`) ??

From here on out, each IHandler implementation can get its state using await this.manager.rateLimitStore.get(this.id);

And lastly, we need to figure out how to sweep this stuff! The main issue is that doing an interval driven sweep of the store via some lastUsed property wouldn't be very efficient, especially since, in this case, every REST instance using our particular store (e.g. a Redis server) would be redundantly all trying to sweep.

This is where our referenceCount property comes into play. Let's assume we have 2 microservices, each one with its own REST instance, all using a Redis store.

Now, let's assume a route /some/route that has never been hit before. Our first microservice tries to fire up the request, and it eventually needs to create a handler, which we can see being done here:

discord.js/packages/rest/src/lib/RequestManager.ts

Lines 328 to 335 in 358c3f4

    
           private createHandler(hash: string, majorParameter: string) { 
        
           	// Create the async request queue to handle requests 
        
           	const queue = new SequentialHandler(this, hash, majorParameter); 
        
           	// Save the queue based on its id 
        
           	this.handlers.set(queue.id, queue); 
        
           	return queue; 
        
           }

When we create our SequentialHandler, we would call IRateLimitStore#set with queue.id and { ...initialState, referenceCount: 1 }, where the inital state is the current jazz (e.g. limit: -1).

Notice how we set our referenceCount property to 1. Once our second microservice tries to make the same request, we'll check if state already exists using IRateLimitStore#has - which it does - after which we'll simply increment the referenceCount property to 2.

Finally, a few hours later the sweeper will start getting to our handlers:

discord.js/packages/rest/src/lib/RequestManager.ts

Lines 256 to 266 in 358c3f4

    
           this.handlers.sweep((v, k) => { 
        
           	const { inactive } = v; 
        
           	// Collect inactive handlers 
        
           	if (inactive) { 
        
           		sweptHandlers.set(k, v); 
        
           	} 
        
           	this.emit(RESTEvents.Debug, `Handler ${v.id} for ${k} swept due to being inactive`); 
        
           	return inactive; 
        
           });

When this happens, we'll query the state for the handler and decrement the reference count by 1. If it's dropped to 0, it means there's currently no active handlers, and therefore the state can be dropped, leading to a IRateLimitStore#delete call.

The text was updated successfully, but these errors were encountered:

ncs-pl · 2023-01-02T02:09:57Z

The implementation seems good however there are some behavioural problems here.

While reference counting avoids the overhead of interval-based task queues, it requires a lot more of locking, especially in distributed systems (multi-threading or micro-services), this can badly affects the speed of systems.
I think it might be worth benchmarking both solutions to determine which solution is more adapted to the scenario.

As of the sweeping determination, while the idea of garbage-collecting inactive handlers at specified interval is a good solution in runtime-based systems, it has major flaws which need to be considered:

It creates computation spikes at intervals, which will impact the performance of downstream users depending on the environment (the issue is similar to Discord's problem with Go's garbage collector causing spikes, from my understanding).
What is the expected behaviour in case the external store (in your example, a shared Redis cache) in unavailable (network failure, ...). And if the REST service is restarted, does that mean it will remember pending rate limits?
At scale, if multiple instances of REST tries to simultaneously sweep handlers, there is a risk of work duplication resulting on unexpected behaviour and/or useless bandwidth consumption. How to avoid it without locking (which would badly slows the system)?

vladfrangu added feature request packages:rest labels Jun 19, 2022

vladfrangu added this to the rest v1 milestone Jun 19, 2022

iCrawl modified the milestones: rest v1, rest v2 Jul 18, 2022

vladfrangu pinned this issue Aug 7, 2022

Jiralite modified the milestones: rest 2.0.0, rest 3.0.0 Aug 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC(rest): ✨ the abstraction of ratelimit data storage ✨ #8125

RFC(rest): ✨ the abstraction of ratelimit data storage ✨ #8125

vladfrangu commented Jun 19, 2022 •

edited

Loading

1. The public API

2. Implementation details

ncs-pl commented Jan 2, 2023 •

edited

Loading

RFC(rest): ✨ the abstraction of ratelimit data storage ✨ #8125

RFC(rest): ✨ the abstraction of ratelimit data storage ✨ #8125

Comments

vladfrangu commented Jun 19, 2022 • edited Loading

Discussed in https://github.com/discordjs/discord.js/discussions/8124

1. The public API

2. Implementation details

ncs-pl commented Jan 2, 2023 • edited Loading

vladfrangu commented Jun 19, 2022 •

edited

Loading

ncs-pl commented Jan 2, 2023 •

edited

Loading