High availability

For the client credential (app-to-app) flow, first see https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/wiki/Client-credential-flows, which has a topic on high availability.

Use the latest MSAL

Semantic versioning is followed to the letter. Use the latest MSAL to get the latest bug fixes.

You should also check whether to use Microsoft Identity Web, a higher-level library for web apps and web APIs, which does a lot of what is described below for you. See Is MSAL right for me?, which proposes a decision tree to choose the best solution depending on your platform and constraints.

Use the token cache

Default behaviour: MSAL caches the tokens in memory. Each ConfidentialClientApplication instance has its own internal token cache. In-memory cache can be lost, for example, if the object instance is disposed or the whole application is stopped.

Recommendation: All apps should persist their token caches. Web apps and Web APIs should use an L1 / L2 token cache where L2 is a distributed store like Redis to handle scale. Desktop apps should use this token cache serialization strategy.

Note: if you use Microsoft.Identity.Web, you don't need to worry about the cache, as it implements the right cache behavior. If you don't use Microsoft.Identity.Web but are building a web app or web API, consider a hybrid approach.
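The L1 / L2 idea above can be illustrated with a small language-neutral sketch. It is written in Python purely for illustration and is not MSAL.NET code: the class name TwoLevelTokenCache, the 60-second L1 lifetime, and the use of a plain dictionary as a stand-in for a distributed store such as Redis are all assumptions.

```python
import time
from typing import Dict, Optional, Tuple


class TwoLevelTokenCache:
    """Hypothetical L1/L2 cache: a per-process dict in front of a shared store."""

    def __init__(self, l2_store: Dict[str, bytes], l1_ttl_seconds: float = 60.0):
        # L1: fast, local to this process, lost when the process stops.
        self._l1: Dict[str, Tuple[float, bytes]] = {}
        # L2: stand-in for a distributed store shared by all instances.
        self._l2 = l2_store
        self._l1_ttl = l1_ttl_seconds

    def read(self, key: str, now: Optional[float] = None) -> Optional[bytes]:
        now = time.monotonic() if now is None else now
        entry = self._l1.get(key)
        if entry is not None and now - entry[0] < self._l1_ttl:
            return entry[1]  # L1 hit: no round trip to the shared store
        blob = self._l2.get(key)  # L1 miss or stale entry: ask L2
        if blob is not None:
            self._l1[key] = (now, blob)  # repopulate L1 for later reads
        return blob

    def write(self, key: str, blob: bytes, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._l2[key] = blob  # persist first so other instances can see it
        self._l1[key] = (now, blob)
```

Keeping the L1 lifetime short bounds how long one instance can serve an entry that another instance has already replaced in L2.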

Default behaviour: MSAL maintains a secondary ADAL token cache for migration scenarios between ADAL and MSAL. ADAL cache operations are very slow.

Recommendation: Disable the ADAL cache if you are not migrating from ADAL. This gives a large performance improvement - see perf measurements here.

Add WithLegacyCacheCompatibility(false) when constructing your app to disable ADAL caching.

Add monitoring around MSAL operations

MSAL exposes important metrics as part of the AuthenticationResult.AuthenticationResultMetadata object:

Metric	Meaning	When to trigger an alarm?
DurationTotalInMs	Total time spent in MSAL, including network calls and cache	Alarm on overall high latency (>1s). Value depends on token source. From the cache: 1 cache access. From AAD: 2 cache accesses + 1 HTTP call. First ever call (per-process) will take longer due to 1 extra HTTP call.
DurationInCacheInMs	Time spent loading or saving the token cache, which are customized by the app developer (e.g. save to Redis).	Alarm on spikes.
DurationInHttpInMs	Time spent making HTTP calls to AAD.	Alarm on spikes.
TokenSource	Indicates the source of the token. Tokens are retrieved from the cache much faster (e.g. ~100ms vs ~700ms).	Can be used to monitor and alarm on the cache hit ratio.
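As a rough illustration of how these metrics might feed monitoring, here is a language-neutral sketch in Python. It does not call MSAL: the TokenMetrics record only mirrors the fields in the table, the "Cache" source label and both function names are assumptions, and the 1000 ms threshold is taken from the ">1s" guidance above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TokenMetrics:
    """Hypothetical record mirroring the metadata fields in the table."""
    duration_total_ms: int
    duration_in_cache_ms: int
    duration_in_http_ms: int
    token_source: str  # assumed labels: "Cache" or "IdentityProvider"


def latency_alarms(m: TokenMetrics, total_threshold_ms: int = 1000) -> List[str]:
    """Return alarm messages for one token acquisition."""
    alarms = []
    if m.duration_total_ms > total_threshold_ms:
        # Name the dominant component so the alert points at cache or network.
        if m.duration_in_cache_ms >= m.duration_in_http_ms:
            culprit = "cache"
        else:
            culprit = "http"
        alarms.append(f"high total latency ({m.duration_total_ms} ms), mostly {culprit}")
    return alarms


def cache_hit_ratio(samples: Iterable[TokenMetrics]) -> float:
    """Fraction of acquisitions served from the cache (0.0 when there are none)."""
    items = list(samples)
    if not items:
        return 0.0
    hits = sum(1 for m in items if m.token_source == "Cache")
    return hits / len(items)
```

A falling hit ratio is often an earlier warning than latency alone, because it shows the cache is not being reused before the extra HTTP calls become visible.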

Retry Policy

Default behaviour: MSAL will retry failed 5xx requests once.

Recommendation:

Add your own retry logic around AcquireToken* methods, using a library like Polly.
ESTS may reply with a 429 Too Many Requests that contains a Retry-After header. Make sure to obey this value, otherwise you will get throttled. More details about Retry-After
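The two recommendations above can be sketched as a small retry loop. This is a language-neutral Python illustration, not MSAL.NET code: the exception types ThrottledError and TransientServerError, the function acquire_with_retry, and the backoff parameters are hypothetical stand-ins for whatever your token call raises on a 429 or a 5xx response.

```python
import time
from typing import Callable, Optional


class ThrottledError(Exception):
    """Hypothetical error for a 429 response; carries the Retry-After value."""

    def __init__(self, retry_after_seconds: Optional[float]):
        super().__init__("throttled")
        self.retry_after_seconds = retry_after_seconds


class TransientServerError(Exception):
    """Hypothetical error for a 5xx response."""


def acquire_with_retry(
    acquire: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call acquire(), retrying on throttling and transient server errors."""
    for attempt in range(1, max_attempts + 1):
        backoff = base_delay * 2 ** (attempt - 1)  # exponential backoff
        try:
            return acquire()
        except ThrottledError as err:
            if attempt == max_attempts:
                raise
            # Obey the server's Retry-After when present; otherwise back off.
            if err.retry_after_seconds is not None:
                delay = err.retry_after_seconds
            else:
                delay = backoff
        except TransientServerError:
            if attempt == max_attempts:
                raise
            delay = backoff
        sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

The sleep function is injectable so the loop can be exercised in tests without actually waiting.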

One Confidential Client per session

In web app and web API scenarios, we recommend creating a new ConfidentialClientApplication for each session and serializing the cache in the same way - one token cache per session. This scales well and also increases security. The official samples show how to do this.

Note: Microsoft.Identity.Web does this.

HttpClient

Default behaviour: An HttpClient is created for each PublicClientApplication / ConfidentialClientApplication. This does not scale well for web sites and web APIs, where we recommend having a ClientApplication object for each user session.

Recommendation: Provide your own scalable HttpClientFactory. On .NET Core we recommend that you inject the System.Net.Http.IHttpClientFactory. This is described in more detail here.

Pro-Active Token renewal

Goal

Increase application availability by issuing longer lived access tokens and implementing a pro-active renewal strategy.

Status quo

By default, AAD issues access tokens with a 1h expiration. If an AAD outage occurs when a refresh is needed, MSAL will fail. The failure propagates to the calling application and impacts availability.

Pro-active token renewal

To overcome this, MSAL tries to ensure that an app always has fresh tokens. AAD outages rarely take more than a few hours, so if MSAL can guarantee that a token always has at least a few hours of availability left, the application will not be impacted by the AAD outage.
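One way to read this strategy is as a three-way decision on the cached token's age. The Python sketch below is a language-neutral illustration under stated assumptions, not MSAL's actual implementation: CachedToken, get_token, and the use of ConnectionError to model an outage are hypothetical, and refresh_in / expires_in are treated as seconds counted from the time of issue.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CachedToken:
    """Hypothetical cached token; times are in seconds."""
    value: str
    issued_at: float
    refresh_in: float  # age after which a refresh should be attempted
    expires_in: float  # age after which the token is no longer valid


def get_token(
    cached: CachedToken,
    now: float,
    fetch_new: Callable[[float], CachedToken],
) -> CachedToken:
    """Return a usable token, refreshing early and tolerating outages."""
    age = now - cached.issued_at
    if age < cached.refresh_in:
        return cached  # still fresh: no network call at all
    try:
        return fetch_new(now)  # past refresh_in: try to renew proactively
    except ConnectionError:
        if age < cached.expires_in:
            return cached  # outage, but the old token is still valid
        raise  # outage and the token has expired: surface the failure
```

With a 4-hour token and a refresh point at 2 hours, an outage that starts right after the refresh point leaves about 2 hours before callers see any failure.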

Use MSAL.NET and configure a token lifetime of more than 1h

Then observe the refresh_in field in the response from ESTS:

