Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rest api to reset deprecation indexing cache #78392

Merged
merged 9 commits into from
Oct 1, 2021

Conversation

pgomulka
Copy link
Contributor

@pgomulka pgomulka commented Sep 28, 2021

Deprecation indexing is using RateLimitingFilter to throttle duplicated
deprecations.
This commit adds REST API and transport actions to reset the LRU cache in this filter.

This will not affect the log4j filter used for throttling file appenders.
usage is very simple. Just call DELETE on _logging/deprecation_cache

curl --request DELETE \
  --url http://localhost:9200/_logging/deprecation_cache \
  --header 'Authorization: Basic ZWxhc3RpYzpwYXNzd29yZA=='

closes #78134

Deprecation indexing is using RateLimitingFilter to throttle duplicated
deprecations. This commit allows to reset the LRU cache in this filter.

This will not affect the log4j filter used for thottling file appenders.

closes elastic#78134
@pgomulka pgomulka added :Core/Infra/Logging Log management and logging utilities v8.0.0 v7.16.0 labels Sep 28, 2021
@pgomulka pgomulka self-assigned this Sep 28, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Sep 28, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

/**
Resets deprecation indexing rate limiting cache on each node.
*/
public class DeprecationCacheResetAction extends ActionType<ActionResponse.Empty> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure I extended the right classes in DeprecationCacheResetAction and TransportDeprecationCacheResetAction
My goal was to keep it as simple as possible.
Is there a generic "empty request" somewhere in our codebase already?
Also do we have a way to make sure that a response (even empty) is returned to a rest client only when operations on every node succeeded?

import org.elasticsearch.transport.TransportService;

public class TransportDeprecationCacheResetAction
extends HandledTransportAction<DeprecationCacheResetAction.Request, ActionResponse.Empty> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be a TransportNodesAction. HandledTransportAction would only be handled by a single node, but TransportNodesAction fans out to all nodes (you'll need to create an appropriate single node request/response as well).

@rjernst
Copy link
Member

rjernst commented Sep 29, 2021

Since resetting the cache is not deprecated, I don’t think it should be logged as such. You could log as debug?

@pgomulka
Copy link
Contributor Author

pgomulka commented Sep 29, 2021

we can log as debug. But this would mean that the log line will be in server log files only. It would not appear in deprecation data stream.
I agree that logging as warn deprecation might be confusing (I wanted to avoid this by introducing a new category).
but then again, maybe I am overdoing this and we don't need that information in deprecation data stream at all.

@cjcenizal do you know how the Upgrade Assitant UI is planning to use this API and then present fresh deprecations? Is the "marker" indicating a reset being performed needed at all?

@cjcenizal
Copy link
Contributor

@pgomulka We expect the following workflow:

  1. Users will enable deprecation logging
  2. As applications send requests using deprecated API features, they'll see a log counter increase
  3. They'll review the logs in an analysis UI, such as Discover, to identify the requests at fault
  4. They'll fix and redeploy an application (or all of them)
  5. They'll reset the counter, and wait for the count to go back up
  6. If it goes up, they'll continue the process at step 3, and if it doesn't go up then they'll have confidence their fixes were effective

Does this make sense?

@cjcenizal
Copy link
Contributor

BTW you can test this out yourself by spinning up a fresh 7.16 deployment on Cloud.

@cjcenizal
Copy link
Contributor

@pgomulka Oh sorry I missed your question.

how the Upgrade Assitant UI is planning to use this API

I expect that when the user resets the counter (step 5), we'll send a request to this API to also reset the cache.

@pugnascotia
Copy link
Contributor

The word "counter" makes me worry here - we use a rate limiting mechanism so that we don't keep logging the same message over and over, which means counting the occurrences of each message could be wildly inaccurate.

@cjcenizal
Copy link
Contributor

@pugnascotia I think it all depends on how we present it to the user. For example, we could present it as "a count of unique logs". But I think you and anyone else should try out the UX and see how it feels.

@pgomulka
Copy link
Contributor Author

pgomulka commented Sep 29, 2021

so there is a counter. I suspect it counts how many documents are there in deprecation log data stream
Screenshot 2021-09-29 at 15 39 04
As @pugnascotia mentioned reseting the indexing cache means that deprecations that already were indexed can be indexed again (duplicated).
In step 5:

They'll reset the counter, and wait for the count to go back up

I feel that by reseting the counter you mean removing the documents from deprecation log index? Data streams are append only, so I think it is not possible. The count on the UI would not go down, unless you remember what was the previous count.

With a marker we could query for the last reset cache. Take the timestamp. And then query for documents with timestamp greater then the fetched latest reset cache

@cjcenizal
Copy link
Contributor

This is the deprecation log counter. It's just below the UI in your screenshot.

image

I feel that by reseting the counter you mean removing the documents from deprecation log index?

No. The way we've implemented this counter is we store a timestamp when the user clicks "Reset counter" and count all new docs that have been indexed since that timestamp.

@pgomulka
Copy link
Contributor Author

No. The way we've implemented this counter is we store a timestamp when the user clicks "Reset counter" and count all new docs that have been indexed since that timestamp.

so that solves it all :) I will remove the redundant log indicating a reset
thanks @cjcenizal !

@cjcenizal
Copy link
Contributor

Thanks @pgomulka!

@pgomulka pgomulka merged commit d2de1af into elastic:master Oct 1, 2021
pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Oct 1, 2021
Deprecation indexing is using RateLimitingFilter to throttle duplicated
deprecations.
This commit adds REST API and transport actions to reset the LRU cache in this filter.

This will not affect the log4j filter used for thottling file appenders.

closes elastic#78134
pgomulka added a commit that referenced this pull request Oct 1, 2021
Deprecation indexing is using RateLimitingFilter to throttle duplicated
deprecations.
This commit adds REST API and transport actions to reset the LRU cache in this filter.

This will not affect the log4j filter used for thottling file appenders.

closes #78134
@cjcenizal
Copy link
Contributor

@pgomulka I'd like to open an issue in the Kibana repo and link to some docs. Do we have docs for this API? Or could you add a couple examples to the PR description and I'll just cross-link here?

@pgomulka
Copy link
Contributor Author

pgomulka commented Oct 1, 2021

@cjcenizal good question. I am not sure if this should be treated more like internal API or we should publicly document this. I don't mind either approach.
I will add a usage to the PR description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Logging Log management and logging utilities >deprecation Team:Core/Infra Meta label for core/infra team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose ability to invalidate deprecation log cache
7 participants