Support to exclude (big) query results from .watcher-history #36719

Closed
ypid-geberit opened this issue Dec 17, 2018 · 5 comments

Comments

@ypid-geberit

Describe the feature:

When running my aggregated_issues_in_logs watch over a large number of documents, the watch finds and aggregates many documents, which are then indexed into a new index in ES. The issue is that the whole watch run, with all results, is written to .watcher-history-* as one document. This results in very big documents (result.input.payload and result.transform.payload). The practical problem is that the watch history list in Kibana (Management -> Elasticsearch -> Watcher -> Watches -> log_issues) is unable to show more than one watch execution.

It would be helpful if some "filter_path"/"exclude_filter_path" could be specified in the watch definition to control what ends up in .watcher-history-*.
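For the syntax, I am thinking of something along the lines of the response filtering that already exists on the REST layer. To be clear, the request below only filters an API response and is just meant to illustrate the kind of path expressions I have in mind; a watch-level setting like this does not exist yet:

# filter_path (with a leading "-" for exclusion) already works for REST responses,
# but it has no influence on what Watcher writes to .watcher-history-*:
curl 'localhost:9200/.watcher-history-*/_search?filter_path=-hits.hits._source.result.input.payload,-hits.hits._source.result.transform.payload'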

@ypid-geberit ypid-geberit changed the title Support to exclude query (big) results from .watcher-history Support to exclude (big) query results from .watcher-history Dec 17, 2018
@albertzaharovits
Contributor

@ypid-geberit I think this fits better in the Kibana realm. I am thinking of a way of interpreting the contents of the watch metadata, where you would store the path filtering. This sounds kludgy, but maybe they have a better idea. Can you please open a feature request there: https://github.com/elastic/kibana? I am going to close this.

@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@ypid-geberit
Author

@albertzaharovits I still think this should be implemented in Elasticsearch. I also had the idea of using a metadata field to specify the path filtering, for example:

---

# yamllint disable rule:line-length rule:comments-indentation

metadata:
  comment: 'Test watch'
  watcher_history_exclude_filter_path: 'result.input.payload,result.transform.payload'


throttle_period: '0s'

trigger:
  schedule:

input:
  search:

condition:

transform:

actions:

(Yes, YAML is awesome, also for watch definitions. Ref: elastic/examples#239)

But this has the issue that metadata would then be evaluated as a setting, so I would instead propose:

---

# yamllint disable rule:line-length rule:comments-indentation

metadata:
  comment: 'Test watch'

watcher_history_exclude_filter_path: 'result.input.payload,result.transform.payload'

trigger:
  schedule:

input:
  search:

actions:

What do you have in mind to implement this in Kibana? I only have one idea to mitigate this in Kibana, which is "Index Patterns" -> "Source Filters" (result.*.payload,result.*.body,result.actions,input.search.request). Is that what you mean?

The reason I suggested implementing this in the watch definition is that it is specific to the watch in my case. For some watches (development and staging) I find it useful to have everything in the history; for other watches (production), which write their output to other indices in ES anyway, I don’t want the output duplicated in .watcher-history-* (we are talking about hundreds of MiB of .watcher-history-* per day in our environment).

@albertzaharovits
Contributor

albertzaharovits commented Dec 20, 2018

Hi @ypid-geberit

I understand now, you mean not recording specific watch history fields for specific watches. This indeed sounds like a Watcher feature. I was led astray when you mentioned the visibility problem in Kibana as a motivator.

If the problem is the size of the .watcher-history-* indices, may I recommend using update by query?
This way the fields can be recorded even in "production" mode, and when they are deemed more of a burden than useful they can be removed.

I understand that it's easier to not record specific fields in the first place, rather than having to maintain a "curation" mechanism, but from your experience, do you see the added value of not recording some watch history "in production", given the many ways a watch can fail?

@ypid-geberit
Author

ypid-geberit commented Jul 25, 2019

I tried update by query now, but for some reason it crashed an Elasticsearch 6.2.4 cluster three times. The second run was limited to one day's worth of watcher history. The cluster is otherwise stable.

Script because Curator does not support this yet:

#!/bin/bash

# get_es_cacert, get_es_creds and get_es_url are small helpers from our environment
# that print the CA certificate path, the "user:password" pair and the cluster URL.
PATH="/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/lib/mit/bin"

(
    date --rfc-3339=seconds
    for day_offset in $(seq 32); do
        date_timestamp="$(date --date "now - ${day_offset} day" '+%Y.%m.%d')"
        echo "$(date --rfc-3339=seconds): Running for $date_timestamp"
        curl --silent --cacert "$(get_es_cacert)" -u "$(get_es_creds)" "$(get_es_url)/.watcher-history-7-${date_timestamp}/_update_by_query" -H 'Content-Type: application/yaml' --data-binary @/etc/curator/watcher-history_update_by_query.json
    done
) >> /var/log/curator/script.log

The referenced /etc/curator/watcher-history_update_by_query.json:
{
  "script": {
    "source": "ctx._source.result.input.remove('payload'); if (ctx._source.result.containsKey('transform')) { ctx._source.result.transform.remove('payload') } ctx._source.metadata.watch_history_cleaned = true;",
    "lang": "painless"
  },
  "query": {
    "query_string": {
      "query": "_exists_:(result.input.payload OR result.transform.payload) -_exists_:(metadata.watch_history_cleaned)"
    }
  }
}

If you don't see an issue with this update_by_query itself, then the crash is probably due to the old ES version I tested on, and I will retest on a newer one when possible. Does it look good to you?

An _exists_ query for result.input.payload did not work for some reason. I left it in the query even though it has no effect (boolean OR).
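For reference, the same condition expressed with explicit exists clauses in a bool query; this is only a sketch I have not verified against this cluster, and if the payload fields are simply not indexed in the .watcher-history-* mapping (which may be why _exists_ had no effect), this variant will not match either:

{
  "query": {
    "bool": {
      "should": [
        { "exists": { "field": "result.input.payload" } },
        { "exists": { "field": "result.transform.payload" } }
      ],
      "minimum_should_match": 1,
      "must_not": [
        { "exists": { "field": "metadata.watch_history_cleaned" } }
      ]
    }
  }
}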

About the crash:

[2019-07-22T15:54:46,922][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [gxmneh61] fatal error in thread [elasticsearch[gxmneh61][search][T#1]], exiting
java.lang.OutOfMemoryError: Java heap space
	at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(CompressingStoredFieldsReader.java:209) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
	at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:590) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
	at org.apache.lucene.index.CodecReader.document(CodecReader.java:83) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
	at org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:341) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
	at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:388) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.search.fetch.FetchPhase.createSearchHit(FetchPhase.java:199) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:156) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:499) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.action.search.SearchTransportService$11.messageReceived(SearchTransportService.java:440) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.action.search.SearchTransportService$11.messageReceived(SearchTransportService.java:437) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:258) ~[?:?]
[..., Java heap dump filled up the /var partition so no further logs could be written]

The crash is not 100 % reproducible, so a few days of history have been fully updated (payload removed), and this fixes the issue with Kibana taking forever to load the watch history. If someone can confirm that this update_by_query is reliable and that the issue is only in our environment, then I guess that is a solution. The nice aspect of this solution is of course not having to extend Elasticsearch, and using an API/curation mechanism that already exists.
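One thing that might reduce the heap pressure during the fetch phase (where the OutOfMemoryError above happened) is running the update by query in smaller, throttled batches. scroll_size and requests_per_second are standard _update_by_query parameters; the values below are guesses and I have not verified that they avoid this particular crash:

curl --silent --cacert "$(get_es_cacert)" -u "$(get_es_creds)" \
    "$(get_es_url)/.watcher-history-7-${date_timestamp}/_update_by_query?scroll_size=100&requests_per_second=50" \
    -H 'Content-Type: application/yaml' \
    --data-binary @/etc/curator/watcher-history_update_by_query.json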
