Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide API to refresh token after authc lifecycle #108346

Closed
mshustov opened this issue Aug 12, 2021 · 18 comments · Fixed by #120677
Closed

Provide API to refresh token after authc lifecycle #108346

mshustov opened this issue Aug 12, 2021 · 18 comments · Fixed by #120677
Assignees
Labels
enhancement New value added to drive a business result Feature:elasticsearch Feature:http impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. loe:medium Medium Level of Effort NeededFor:Security Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@mshustov
Copy link
Contributor

mshustov commented Aug 12, 2021

Security has to deal with a growing number of cases when authc credentials expired and request to Elasticsearch failed with 401 after onAuth hook. Since only Core has access to all the calls to the Elasticsearch service, it should provide an API for Security plugin that would allow it to handle this periodic authentication issue.

Blocks: #104893 (see the issue description for the current and proposed flow diagrams)

The required functionality is listed in #104893 (comment)

What functionality would Security plugin need from Core to handle 401 errors?
- Security plugin would need an extension point it can use to "subscribe" to 401 errors in the scoped Elasticsearch client.
- Security plugin should get access to the original user request that Elasticsearch client was scoped to. We need to provide this request to the Core's CookieSessionStorageFactory we receive at the setup stage to retrieve the session associated with the user's request.
- Security plugin should get access to the authentication error itself, it would help us to understand if it makes sense to try to re-authneticate request.

On top of it, we might need to refactor AuthHeadersStorage and Elasticsearch service to support auth credentials to be updated after onAuth stage.

@mshustov mshustov added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc triage_needed enhancement New value added to drive a business result labels Aug 12, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@afharo
Copy link
Member

afharo commented Aug 16, 2021

Q: Is this an HTTP Service on401 hook or an Elasticsearch client's one?

If Elasticsearch client-only, shouldn't this re-authentication mechanism be implemented on the client-level @elastic/elasticsearch-js?

@mshustov
Copy link
Contributor Author

: Is this an HTTP Service on401 hook or an Elasticsearch client's one?

Yes, in the Elasticsearch service. Maybe we should decouple auth from http service as much as possible. Having this authc logic scattered across http and elasticsearch services may make it difficult to support it in the future.

If Elasticsearch client-only, shouldn't this re-authentication mechanism be implemented on the client-level @elastic/elasticsearch-js?

Could you elaborate on the benefits of this approach? Right now the es client doesn't know about the authc mechanism but operates with provided credentials.

@afharo
Copy link
Member

afharo commented Aug 16, 2021

Could you elaborate on the benefits of this approach? Right now the es client doesn't know about the authc mechanism but operates with provided credentials.

After syncing offline, I noticed I was missing some context. Dismiss my comment 😇

@pgayvallet
Copy link
Contributor

Security plugin would need an extension point it can use to "subscribe" to 401 errors in the scoped Elasticsearch client.

Just for my understanding, what does security plan to do with this? Is it just for logging / audit purpose?

@azasypkin
Copy link
Member

Just for my understanding, what does security plan to do with this? Is it just for logging / audit purpose?

In case we detect that this 401 is because of expired access token, Security will try to refresh it. If refresh succeeds we'll return updated auth headers (and cookie) to the Core. Once core has updated auth headers it can technically re-try request transparently for the consumer?

@pgayvallet
Copy link
Contributor

Once core has updated auth headers it can technically re-try request transparently for the consumer?

@mshustov what this part of the initial scope?

@mshustov
Copy link
Contributor Author

@pgayvallet yes, in the issue title: we might need to refactor AuthHeadersStorage and Elasticsearch service to support auth credentials to be updated after onAuth stage.

@lukeelmers lukeelmers added impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. loe:medium Medium Level of Effort and removed triage_needed labels Oct 29, 2021
@pgayvallet
Copy link
Contributor

Once core has updated auth headers it can technically re-try request transparently for the consumer?

I tried to take a look at how we could implement this behavior. I need to remind that actually, the scoped clients we provide to our consumers are, even if cloaked under our own interface, vanilla Client instances.

Atm, the only two options I see to have this try/catch401/retry behavior would be:

  1. wrap the Client with a proxy (which is something we have been tried to avoid since we've started using the new client)
  2. perform this retry logic in our custom transport

1. feels like a less hacky solution, but that's also (way?) more work and maintenance than 2.

@delvedor do you have any other suggestions on how/where we could perform this request retry logic?

Also, note that if we do need to update AuthHeadersStorage to allow replacing headers after the return from the hook exposed to security, we also need to update the existing scoped client instance(s), and I'm not sure how exactly we can achieve that.

ATM, the headers are passed during the instantiation of the Client in the asScoped call:

asScoped(request: ScopeableRequest) {
const scopedHeaders = this.getScopedHeaders(request);
const scopedClient = this.rootScopedClient.child({
headers: scopedHeaders,
});
return new ScopedClusterClient(this.asInternalUser, scopedClient);

@delvedor correct me if I'm wrong, but there's not way to update a Client's configuration after it has been instantiated, or is there a way?

Maybe keeping a reference of scopedHeaders to update them while keeping the same reference would do the trick, even if it sounds very hacky?

Also, ideally, we would want reactive updates of the other clients using the same request, which will be even harder to achieve.

e.g

const client1 = clusterClient.asScoped(request);
const client2 = clusterClient.asScoped(request);

// causing the 401 scenario because of expired access token
const responseReturnedAfterRetry =  await client1.search({});

// client2 should ideally already be updated with the new headers and avoid to retrigger the 401
await client2.search({});

As a side note, I think that the @elastic/kibana-security implementation would also need to be smart regarding this refresh logic, and be able to identify that a request is currently in this retry stage, e.g

const client1 = clusterClient.asScoped(request);
const client2 = clusterClient.asScoped(request);

// the refresh logic and related ES call would ideally only be performed once 
await Promise.all([client1.search({}), client2.search({})]);

@azasypkin
Copy link
Member

As a side note, I think that the @elastic/kibana-security implementation would also need to be smart regarding this refresh logic, and be able to identify that a request is currently in this retry stage, e.g

// the refresh logic and related ES call would ideally only be performed once
await Promise.all([client1.search({}), client2.search({})]);

While refreshing the same access token multiple times isn't ideal, it's not critical either since Elasticsearch will be returning exactly the same access/refresh token pair during 60s refresh window. In fact, we do this quite often at the authentication stage already, when Kibana UI makes multiple requests at the same time that trigger access token refresh.

Having said that, I agree that there will be enough corner cases Security plugin will need to carefully handle.

@delvedor
Copy link
Member

@delvedor do you have any other suggestions on how/where we could perform this request retry logic?

A custom transport is the best option.

@delvedor correct me if I'm wrong, but there's not way to update a Client's configuration after it has been instantiated, or is there a way?

Custom headers are stored in the transport instance, you can update them by using the headers symbol:

import * as http from 'http'
import { Transport } from '@elastic/elasticsearch'
import { kHeaders } from '@elastic/elasticsearch/lib/symbols'

class KibanaTransport extends Transport {
  updateHeaders (newHeaders: http.IncomingHttpHeaders): void {
    this[kHeaders] = { ...this[kHeaders], ...newHeaders }
  }
}

// or

const newHeaders: http.IncomingHttpHeaders = { ... }
clientInstance.transport[kHeaders] = { ...clientInstance.transport[kHeaders], ...newHeaders }

The code above should work, but you are tampering with internals and it might break in the future.
I don't think the client should provide this feature out of the box, but I can add a test to ensure that this "hack" won't break in the future.

The code above will work for the v8 client, in the v7 client you can access the headers directly with .headers without a symbol, but the update logic is the same.

@pgayvallet
Copy link
Contributor

pgayvallet commented Dec 6, 2021

I don't think the client should provide this feature out of the box

Overall I agree. It wouldn't make much sense to add an API specifically to update only the headers imho. We could imagine an API to update the client's config more globally, but the complexity for some of the options would make it very complicated, and it would be hard to justify it compared to asking consumers to just recreate a new instance of the Client with the updated config.

The issue we have with our specific usage in Kibana is that we need a way for the instances of the client we expose to our API consumer to refresh these header values internally, as we can't predict when we'll need to update them (and we can't just ask consumers to create a new instance of the client in case of such 401, the whole intent of the issue is to manage that internally).

So I guess our only technical options for this are to either:

  • directly changing the headers by updating transport[kHeaders] (with either approach described in Provide API to refresh token after authc lifecycle #108346 (comment))
  • introducing a higher-level wrapper or proxy on top of the Client class to adapt all API calls (e.g client.search({}, { headers }))
    • note: we could also just update the headers of the underlying client via transport[kHeaders] from this wrapper

Not fan of either of those, but if the client ensures that this symbol hack will work by adding unit tests, it's probably easier to go this way rather than reopening the client wrapper/proxy discussion?

Now, regarding the changes required in AuthHeadersStorage and the ES services / ClusterClient to support dynamically handle those header updates:

ATM AuthHeadersStorage.get returns the value of the header directly, because we never though of a situation where we need to reactively listen to changes of this value.

public set = (request: KibanaRequest | Request, headers: AuthHeaders) => {
this.authHeadersCache.set(ensureRawRequest(request), headers);
};
public get: GetAuthHeaders = (request) => {
return this.authHeadersCache.get(ensureRawRequest(request));
};

I guess our only option here is to have it return an observable instead, something like

export type GetAuthHeaders = (request: KibanaRequest) => Observable<AuthHeaders> | undefined;

/** @internal */
export class AuthHeadersStorage {
  private authHeadersCache = new WeakMap<Request, BehaviorSubject<AuthHeaders>>();

  public set = (request: KibanaRequest | Request, headers: AuthHeaders) => {
    const rawRequest = ensureRawRequest(request);
    if(!this.authHeadersCache.has(rawRequest)) {
      this.authHeadersCache.set(rawRequest, new BehaviorSubject(headers));
    } else {
      this.authHeadersCache.get(rawRequest)!.next(headers);
    }
  };
  public get: GetAuthHeaders = (request) => {
    return this.authHeadersCache.get(ensureRawRequest(request));
  };
}

Then we'll want to adapt ClusterClient and/or ScopedClusterClient to subscribe to this observable and updates the headers accordingly on emissions.

My concerns:

  1. ATM the child client we're using for ClusterClient.asScoped is sharing the parent's transport

asScoped(request: ScopeableRequest) {
const scopedHeaders = this.getScopedHeaders(request);
const scopedClient = this.rootScopedClient.child({
headers: scopedHeaders,
});
return new ScopedClusterClient(this.asInternalUser, scopedClient);
}

@delvedor correct me if I'm wrong, but if we go with this transport[kHeaders] = approach, we gonna need to have each child client have its own instance of our custom KibanaTransport, as the headers are bound to the Transport instance, right? Is this going to be an issue with shared connection pools or not?

Btw, are two clients instanciated via

    const scopedClient = this.rootScopedClient.child({
      headers: scopedHeaders,
    });

sharing the same Transport instance, or is a new instance of the Transport created for each of them?

  1. I'm not sure WeakMap will be good enough now that the values are observables that will have subscriptions

WeakMap was working because the value was a plain object. If we change AuthHeadersStorage to have observables instead, that we're going to subscribe to from either ClusterClient or ScopedClusterClient, I think this may keep references on the WeakMap, which would block the GC from collecting the entries that are no longer referenced by key? I'm not sure of this assumption, but I suspect using observable that low in the chain could lead to leaks, and force us to manually unsubscribe from the observables when the request is terminated. We'll need to investigate this, potentially.

  1. overall, this is a lot of complexity increase (just) for this feature

Not saying this should be a reason not to try to do it, but just mentioning.

@pgayvallet pgayvallet self-assigned this Dec 7, 2021
@pgayvallet
Copy link
Contributor

FWIW, I'm going to start a POC to identify potential other blockers.

@pgayvallet
Copy link
Contributor

@delvedor, TransportOptions and all symbols are not exported from @elastic/transport, meaning that they are not importable from @elastic/elasticsearch via the

// node_modules/@elastic/elasticsearch/index.d.ts
export * from '@elastic/transport'

Accessing the kHeaders symbol and the TransportOptions type currently requires kibana to add an explicit dependency against @elastic/transport (that we currently don't have).

Note that this is also error prone, because if the version we have as a direct dependency and the version used by @elastic/elasticsearch is not the same, the symbols also aren't.

E.g by adding "@elastic/transport": "^8.0.0-beta.2", with "@elastic/elasticsearch": "npm:@elastic/elasticsearch-canary@^8.0.0-canary.35", which uses "@elastic/transport" "^0.0.15":

Screenshot 2021-12-07 at 14 38 53

I had to align the explicit version of the transport with the implicit one. Do you think it would be alright to expose those symbols (and the TransportOptions type) from the index of the transport module to have it importable in Kibana directly via @elastic/elasticsearch?

@pgayvallet
Copy link
Contributor

pgayvallet commented Dec 7, 2021

Now for @elastic/kibana-security:

I'm a bit lost with the scope of this issue. I initially thought the responsibility of the hook would be to return the updated authentication headers, and that core's responsibility with them would only be to:

  1. Update the associated request headers (entry in the authRequestHeaders record of the http service)
  2. Update the headers of the scoped client that initiated the request, and retry the request with the updated headers

However when looking at #104893 (comment), I'm kinda scared. Are we supposed to allow the hook to return values for the whole authentication state (state, authResponseHeaders and authRequestHeaders) in the same way it's done in the auth/login hook from HTTP here?

this.server.auth.scheme('login', () => ({
authenticate: adoptToHapiAuthFormat(
fn,
this.log,
(req, { state, requestHeaders, responseHeaders }) => {
this.authState.set(req, state);
if (responseHeaders) {
this.authResponseHeaders.set(req, responseHeaders);
}
if (requestHeaders) {
this.authRequestHeaders.set(req, requestHeaders);
// we mutate headers only for the backward compatibility with the legacy platform.
// where some plugin read directly from headers to identify whether a user is authenticated.
Object.assign(req.headers, requestHeaders);
}
}
),

Because doing so would require to totally de-couple authc from http, which is some significant additional (well, preliminary) work.

Also the current authc API

export type AuthenticationHandler = (
request: KibanaRequest,
response: LifecycleResponseFactory,
toolkit: AuthToolkit
) => AuthResult | IKibanaResponse | Promise<AuthResult | IKibanaResponse>;

Allows things (such as redirecting the user, or returning a KibanaResponse to return to the client) that can only be done if the hook is called during the HTTP lifecycles (wouldn't make any sense to have the authc handler perform/return a redirect if invoked because an ES call returned a 401).

We'll probably need a better specification of what exactly you had in mind regarding the hook API and what core has to do with its response.

Also this part of the original specification bothers me:

  1. Only when request to Elasticsearch is attributed with Authorization header provided by the AuthenticationHandler (the one registered via registerAuth hook). If consumers override client's Authorization header with FakeRequest and alike we shouldn't interfere.
  2. Only when request isn't made by the Security plugin itself (otherwise we can get into infinite recursion 🙂 ). This will be probably solved automatically by the solution for the previous point since we always override Authorization header in security plugin. But if not, maybe we'd need a special TransportRequestOptions option?
  1. We can't treat the security plugin as a higher-level citizen, so point 2. is not really an option
  2. We don't really have a way to make sure that the headers provided to a scoped ES client are the ones coming from the Authorization header.. Do you have a suggestion on how to make this distinction? Would ensuring that the scoped's client request is a real KibanaRequest be good enough?

@pgayvallet
Copy link
Contributor

I created a POC in #120677 to discuss both about the API design/signatures, and the proposed technical implementation. @delvedor @elastic/kibana-security @elastic/kibana-core if you mind taking a look

@delvedor
Copy link
Member

delvedor commented Dec 9, 2021

sharing the same Transport instance, or is a new instance of the Transport created for each of them?

@pgayvallet every child client has its own Transport instance.

correct me if I'm wrong, but if we go with this transport[kHeaders] = approach, we gonna need to have each child client have its own instance of our custom KibanaTransport, as the headers are bound to the Transport instance, right? Is this going to be an issue with shared connection pools or not?

It will not cause any issue with the connection pool.

TransportOptions and all symbols are not exported from @elastic/transport, meaning that they are not importable from @elastic/elasticsearch via the

@pgayvallet you are right, symbols are not exported as they should not be part of the public API. While TransportOptions should be re-exported, I'm a bit on the fence about exposing the symbols, since they are not part of the public API.

@pgayvallet
Copy link
Contributor

The design POC proposed in #120677 has been validated as being sufficient for the needs of @elastic/kibana-security, and the approach did not receive any blocking feedback, so I guess this settles the design phase.

The implementation phase has been prioritized into our 'next sprint' queue, so we should start it in the next few weeks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New value added to drive a business result Feature:elasticsearch Feature:http impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. loe:medium Medium Level of Effort NeededFor:Security Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants