
[Bug] Perf degradation of AcquireTokenForClient due to default partitioned cache #2826

Closed
pmaytak opened this issue Aug 13, 2021 · 13 comments · Fixed by #2834
Labels
bug P2 performance regression Behavior that worked in a previous release that no longer works in a newer release

@pmaytak
Contributor

pmaytak commented Aug 13, 2021

Is your feature request related to a problem? Please describe.
Starting in MSAL 4.30, there's a default in-memory partitioned cache for confidential client applications. For each cache operation, the data is serialized/deserialized, which causes a performance hit. This seems to be a bigger issue for apps that have single-tenant partitions with many resources per tenant.

Possible solutions

  1. Partition the internal cache.
  • Create a PartitionedInMemoryTokenCacheAccessor that implements ITokenCacheAccessor and is similar to InMemoryTokenCacheAccessor, except that the token dictionaries are partitioned by tenant ID.
  • Add overloaded GetAllX methods that accept a tenant parameter.
  • In TokenCache, for confidential client apps, set the accessor to the partitioned one.
  2. Explore a smaller partition key; currently it's client ID + tenant ID.
  3. When searching the internal cache, find the token by key first, then go through the filters.
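To make idea 1 concrete, here is a minimal sketch of what a tenant-partitioned accessor could look like. The type and member names (`PartitionedInMemoryTokenCacheAccessor`, `SaveAccessToken`, `GetAllAccessTokens`) are illustrative only and do not match MSAL's actual internal API; the point is that `GetAllAccessTokens(tenantId)` enumerates one partition instead of the whole cache, and no serialization is involved.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a tenant-partitioned in-memory accessor.
// Outer key: tenant ID; inner key: the token's full cache key.
public sealed class PartitionedInMemoryTokenCacheAccessor
{
    private readonly ConcurrentDictionary<string, ConcurrentDictionary<string, string>> _accessTokens
        = new ConcurrentDictionary<string, ConcurrentDictionary<string, string>>();

    public void SaveAccessToken(string tenantId, string cacheKey, string token)
    {
        var partition = _accessTokens.GetOrAdd(
            tenantId, _ => new ConcurrentDictionary<string, string>());
        partition[cacheKey] = token;
    }

    // Overload that accepts a tenant parameter, as proposed above:
    // only the matching partition is scanned, not the entire cache.
    public IReadOnlyCollection<string> GetAllAccessTokens(string tenantId)
    {
        return _accessTokens.TryGetValue(tenantId, out var partition)
            ? (IReadOnlyCollection<string>)partition.Values.ToList()
            : Array.Empty<string>();
    }
}
```

A lookup for a single-tenant app with many resources then touches only that tenant's dictionary, which is the scenario the issue calls out as the worst case.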

Also add performance tests to compare before and after the change. (The testing scenarios should include single- and multi-tenant cases with many resources.)

@pmaytak pmaytak added this to the 4.36.0 milestone Aug 13, 2021
@bgavrilMS bgavrilMS added bug P2 regression Behavior that worked in a previous release that no longer works in a newer release and removed enhancement Feature Request labels Aug 16, 2021
@bgavrilMS
Member

Let's treat this as a performance regression. Before MSAL 4.30, AcquireTokenForClient did not perform any JSON operations, but now it does, leading to an increased number of allocations.

@pmaytak pmaytak changed the title [Feature Request] Improve performance of the default partitioned CCA cache [Bug] Improve performance of the default partitioned CCA cache Aug 16, 2021
@bgavrilMS
Member

Have a look at ITokenCacheAccessor; maybe we can have a smarter implementation there for the app token cache only.

@rymeskar

+1; I had a similar discussion with @bgavrilMS and @jennyf19 yesterday about the number of tokens in a single cache line in CCA when targeting multiple downstream dependencies.

One of the quick, non-breaking options discussed for cache implementors is to provide more context in the TokenCacheNotificationArgs (especially the scopes and whatever else is used for in-cache row filtering) so that the Suggested Cache Key can be extended, creating more granular partitions. For example, a second property: a suggested fully-partitioned token key. This still has the downside of relying on JSON serialization :( Just for a single token.
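As a rough illustration of the fully-partitioned-key idea above: neither the method nor the key format below exists in MSAL, but it shows how scopes surfaced through TokenCacheNotificationArgs could be folded into the coarse client-ID/tenant-ID key to reach one-token-per-partition granularity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; names and key format are assumptions, not MSAL API.
public static class CacheKeyHelper
{
    public static string BuildFullyPartitionedKey(
        string suggestedCacheKey, IEnumerable<string> scopes)
    {
        // Sort the scopes so the key is stable regardless of request order.
        var orderedScopes = string.Join(
            " ", scopes.OrderBy(s => s, StringComparer.OrdinalIgnoreCase));
        return $"{suggestedCacheKey}-{orderedScopes}";
    }
}
```

A distributed-cache implementor could then store one serialized token per such key, at the cost of losing refresh-token sharing across scopes within a partition.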

@bgavrilMS bgavrilMS changed the title [Bug] Improve performance of the default partitioned CCA cache [Bug] Perf degradation of AcquireTokenForClient due to default partitioned cache Aug 18, 2021
@sea1jxr

sea1jxr commented Aug 18, 2021

Ideally there would be a way to skip all serialization for an in memory cache.

@bgavrilMS
Member

@pmaytak - I think you should focus on idea 1, i.e. a token cache accessor that is key based, i.e. pre-filtered by tenant id. If this works, we can think about generalizing this idea, maybe the app developer can tell us "partition by tenant id AND scope".

For idea 2 - we cannot change the SuggestedCacheKey as it would break existing token caches. Instead, we can expose more fields in TokenCacheNotificationArgs as @rymeskar suggests.

For perf testing, you can use the WebAPI project to simulate high load, or the benchmarking approach. I think the WebAPI looks at things more holistically and is easier to set up.
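To show why either testing approach should surface this regression, here is a minimal stand-alone micro-benchmark sketch (not the team's actual WebAPI or benchmarking harness) contrasting a cache hit that round-trips through JSON serialization, as the post-4.30 default cache does, with a direct in-memory lookup. The class and payload are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.Json;

public static class CacheBenchmark
{
    private static readonly Dictionary<string, string> Cache =
        new Dictionary<string, string> { ["tenantA"] = "token-payload" };

    // Simulates the serialized path: every lookup round-trips through JSON.
    public static string LookupWithSerialization()
    {
        var blob = JsonSerializer.Serialize(Cache);
        var copy = JsonSerializer.Deserialize<Dictionary<string, string>>(blob);
        return copy["tenantA"];
    }

    // Simulates a partitioned in-memory accessor: no serialization at all.
    public static string LookupDirect() => Cache["tenantA"];

    public static void Main()
    {
        const int iterations = 100_000;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) LookupWithSerialization();
        var serializedMs = sw.ElapsedMilliseconds;

        sw.Restart();
        for (int i = 0; i < iterations; i++) LookupDirect();
        var directMs = sw.ElapsedMilliseconds;

        Console.WriteLine($"serialize path: {serializedMs} ms, direct path: {directMs} ms");
    }
}
```

On any realistic payload the serialized path dominates, both in wall time and in allocations, which is the regression this issue describes.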

@jennyf19
Collaborator

I thought we were doing perf testing consistently and on PRs impacting the cache, so why did we miss this perf issue? Are we missing tests? Wasn't the cache serialized/deserialized before the 4.30 release? Do we have current perf numbers compared with previous versions, to see how much of a perf issue we are talking about?

@pmaytak
Contributor Author

pmaytak commented Sep 1, 2021

Included in MSAL.NET 4.36.0 release.

@rymeskar

It's great to see the rate at which you've managed to improve the caching and resiliency experience over the past weeks and months.

On that front, @pmaytak and @bgavrilMS, I was wondering whether you are planning to add support for more granular partitioning for the non-in-memory-cache scenarios as well?

@jmprieur
Contributor

@rymeskar this is already realized thanks to the SuggestedCacheKey property of TokenCacheNotificationArgs.

@rymeskar

My understanding was that SuggestedCacheKey is coarsely partitioned; that is, it does not partition by scope/resource ID.

@jmprieur
Contributor

Being partitioned by resource ID would not allow for using the refresh token to acquire a token for a different resource.

@rymeskar

That makes sense.
For CCA app-only scenarios, this should be fine. For CCA with OBO, should we give customers an option to pick either better RT reuse or finer granularity? After all, the RT is reusable only for the validity period of the original user assertion.

@jmprieur
Contributor

No, @rymeskar: the RT for an OBO token is reusable for 90 days (provided, today, you supply the user assertion, which might have expired but is used as a key to the cache). @pmaytak is actually improving the experience for OBO tokens used in long-running processes:
#2820
