Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: additional caching for improved handler performance #1953

Closed
wants to merge 4 commits into from

Conversation

mattcosta7
Copy link
Contributor

@mattcosta7 mattcosta7 commented Jan 6, 2024

The intent of this PR is to help discover if there's a good general purpose cache api we can expose from the handlers, and if not we can see where caching improves performance further and how much.

These are mostly changes first looked at in #1905

This is a majority of the low-hanging request cycle optimizations I think we can do with good benefit and little overhead.
Further performance could be gained with a slightly modified storage structure, potentially (instead of solely linear request parsing, we could order them at registration into a more treelike structure, however this seems fairly complex compared to the low-hanging performance we can unlock from the small changes in this pr

This pull request introduces changes primarily aimed at improving the efficiency of URL and cookie parsing in the codebase. The changes involve the creation of a caching mechanism to store parsed URLs and cookies from requests, reducing the need for repeated parsing of the same data. The changes affect several files, with the most significant changes in src/core/handlers/GraphQLHandler.ts, src/core/handlers/HttpHandler.ts, and src/core/handlers/RequestHandler.ts.

URL and Cookie Parsing Improvements:

Request,
Record<string, string>
>()
protected static parseAllRequestCookiesOrGetFromCache(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cookies can be parsed once per request, not required once per handler

}

private static matchRequestUrlCache = new Map<string, Match>()
protected static matchRequestURLOrGetMatchFromCache(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL matching can be computed once per string comparison of URL/match/baseUrl - and reused for subsequent requests across all handlers. There's some memory cost to this map, but even at very large sizes it should be fine

const requestUrl = request.url
let url = urlCache.get(requestUrl)
if (!url) {
url = new URL(requestUrl)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL allocations aren't cheap as they do expensive string parsing to create a URL object. we can cache this per string url, and avoid those object allocations and parses, which may occur multiple times for a single request lookup

Copy link
Member

@kettanaito kettanaito Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see. Thanks for the brief explanation!

Copy link
Contributor Author

@mattcosta7 mattcosta7 Jan 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're also not super expensive, so definitely a micro-optimization we could totally skip

It's definitely the least important change. on the order of 10ms for 10,000 requests of overhead. doubtful that it'll matter to most!

Screen Shot 2024-01-07 at 10 57 55 PM Screen Shot 2024-01-07 at 10 56 13 PM

I was mostly just trying to get a "what's left that is low hanging" branch together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say 10ms save on 10,000 requests is not worth the price we are paying here. Would you agree that we can skip this pull request as of now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was just the url part but I agree we don't need to ship this right now as they're generally small improvements. Very much just a "what other things can be optimized easily" experiment

@mattcosta7 mattcosta7 changed the title More caching for performance fix: additional caching for improved handler performance Jan 6, 2024
@mattcosta7 mattcosta7 marked this pull request as ready for review January 6, 2024 06:14
@mattcosta7 mattcosta7 self-assigned this Jan 6, 2024
@@ -1,7 +1,7 @@
{
"compilerOptions": {
"strict": true,
"target": "es6",
"target": "es2020",
Copy link
Contributor Author

@mattcosta7 mattcosta7 Jan 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a perf fix, because async/await no longer become generators
see #1780

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please move this to a separate pull request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please move this to a separate pull request?

yep - it's part of 1780, so I think we can just let that land (it sets esnext, which i think should be ok?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so!

@@ -164,21 +165,27 @@ export class GraphQLHandler extends RequestHandler<
* If the request doesn't match a specified endpoint, there's no
* need to parse it since there's no case where we would handle this
*/
const match = matchRequestUrl(new URL(args.request.url), this.endpoint)
const match = this.matchRequestURLOrGetMatchFromCache(
Copy link
Member

@kettanaito kettanaito Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const match = this.matchRequestURLOrGetMatchFromCache(
const match = this.matchRequestOrGetFromCache(

urlFromRequestOrCache(args.request),
this.endpoint,
)
const cookies = this.parseAllRequestCookiesOrGetFromCache(args.request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const cookies = this.parseAllRequestCookiesOrGetFromCache(args.request)
const cookies = this.parseRequestCookiesOrGetFromCache(args.request)

@@ -184,6 +186,36 @@ export abstract class RequestHandler<
})
}

private static cookiesForRequestCache = new WeakMap<
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static cookiesForRequestCache = new WeakMap<
private static requestCookiesCache = new WeakMap<

Request,
Record<string, string>
>()
protected parseAllRequestCookiesOrGetFromCache(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
protected parseAllRequestCookiesOrGetFromCache(
protected parseRequestCookiesOrGetFromCache(

return cookies
}

private static matchRequestUrlCache = new Map<string, Match>()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not WeakMap here also? Are we planning on freeing these resources manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WeakMap can't accept string keys, but mostly because this cache is actually more valuable if it persists between requests. Since multiple requests can have the same URL string, this maintains the cache across requests, avoiding the new URL overhead for multiple requests.

It's a slight extra optimization, but these shouldn't be so huge where they become an issue since it should be relatively small

We could later consider something more like an lru here if we need to evict things

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. The choice of Map is clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern is that these changes wouldn't hurt the readability of the code. The benefit they provide is tangible but rather minimal. I'm fine with having them as long as the code doesn't become a jungle. I suggested a cached helper to make this common cached logic reused, I'd like to know what you think about it.

}

private static matchRequestUrlCache = new Map<string, Match>()
protected matchRequestURLOrGetMatchFromCache(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
protected matchRequestURLOrGetMatchFromCache(
protected matchRequestOrGetFromCache(

@@ -92,7 +93,7 @@ function extractMultipartVariables<VariablesType extends GraphQLVariables>(
async function getGraphQLInput(request: Request): Promise<GraphQLInput | null> {
switch (request.method) {
case 'GET': {
const url = new URL(request.url)
const url = urlFromRequestOrCache(request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are URL constructors really that costly? I suspect this optimization isn't worth the cost of readability and debugging.

@@ -4,7 +4,10 @@ export function use(
currentHandlers: Array<RequestHandler>,
...handlers: Array<RequestHandler>
): void {
currentHandlers.unshift(...handlers)
// we don't spread the handlers to avoid maximum stack errors on very large sets of handlers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a test for this use case?

@@ -4,7 +4,10 @@ export function use(
currentHandlers: Array<RequestHandler>,
...handlers: Array<RequestHandler>
): void {
currentHandlers.unshift(...handlers)
// we don't spread the handlers to avoid maximum stack errors on very large sets of handlers
for (let i = handlers.length - 1; i >= 0; i--) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if .reduceRight wouldn't be more readable here?

handlers.reduceRight((_, handler) => {
  currentHandlers.unshift(handler)
})

@@ -0,0 +1,17 @@
const urlCache = new Map<string, URL>()
Copy link
Member

@kettanaito kettanaito Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not WeakMap<Request, URL>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly to keep URL objects between requests, likely a slight over optimization, which keeps subsequent requests slightly more efficient at the expense of memory.

Both should be generally helpful

* parsing the same url multiple times.
*/
export function urlFromRequestOrCache(request: Request): URL {
const requestUrl = request.url
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to be adopting this "look-in-cache-or-create" pattern quite extensively over these two pull requests. I think it may be worth exploring a tiny abstraction to make these actions more readable.

function cached<V, K extends object>(producer: () => V, key: K, cache: WeakMap<K, V>): V {
  if (!cache.has(key)) {
    cache.set(key, producer())
  }
  
  return cache.get(key)!
}

// Usage.
const urlCache = new WeakMap<Request, URL>
export function cachedUrlFromRequest(request: Request) {
  return cached(() => new URL(request.url), request, urlCache)
}

@mattcosta7 mattcosta7 closed this Jan 12, 2024
@kettanaito
Copy link
Member

@mattcosta7, I still appreciate the effort on this. Thank you so much for making MSW better!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants