Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add default_field option to fields.yml #14341

Merged
merged 5 commits into from
Oct 31, 2019

Conversation

andrewkroh
Copy link
Member

The number of fields in the Elasticsearch index template's settings.index.query.default_field option has grown over time, and is now greater than 1024 in Filebeat (Elastic licensed version). This causes queries to Elasticsearch to fail when a list of fields is not specified because there is a default limit of 1024 in Elasticsearch.

This adds a new setting to fields.yml called default_field whose value can be true/false (defaults to true). When true the text/keyword fields are added to the default_field list (as was the behavior before this change). And when set to false the field is omitted from the default_field list.

This adds a test for every beat to check if the default_field list contains more than 1000 fields. The limit is a little less than 1024 because fields.* is in the default_field list already and at query time that wildcard will be expanded and count toward the limit.

Fixes #14262

@andrewkroh andrewkroh added bug review Filebeat Filebeat libbeat needs_backport PR is waiting to be backported to other branches. labels Oct 30, 2019
@andrewkroh andrewkroh requested review from a team as code owners October 30, 2019 19:59
@andrewkroh andrewkroh force-pushed the bugfix/default-field-exclusion branch from dcd77f4 to 28a1e5b Compare October 30, 2019 20:01
@andrewkroh
Copy link
Member Author

I'm expecting a test failure for Filebeat because it has over 1024 fields. Then I'll push a fix to make it go green.

@andrewkroh andrewkroh requested a review from a team as a code owner October 30, 2019 20:46
@andrewkroh andrewkroh force-pushed the bugfix/default-field-exclusion branch from 438b8d1 to aef5c0e Compare October 30, 2019 20:49
The number of fields in the Elasticsearch index template's `settings.index.query.default_field` option has grown over time, and is now greater than 1024 in Filebeat (Elastic licensed version). This causes queries to Elasticsearch to fail when a list of fields is not specified because there is a default limit of 1024 in Elasticsearch.

This adds a new setting to fields.yml called `default_field` whose value can be true/false (defaults to true). When true the text/keyword fields are added to the `default_field` list (as was the behavior before this change). And when set to false the field is omitted from the default_field list.

This adds a test for every beat to check if the default_field list contains more than 1000 fields. The limit is a little less than 1024 because `fields.*` is in the default_field list already and at query time that wildcard will be expanded and count toward the limit.

Fixes elastic#14262
@andrewkroh andrewkroh force-pushed the bugfix/default-field-exclusion branch from aef5c0e to 33c3573 Compare October 30, 2019 21:31
@andrewkroh andrewkroh requested review from ph and ruflin October 30, 2019 21:44
@andrewkroh
Copy link
Member Author

I trimmed out the latest additions from the Zeek module and now x-pack/filebeat is showing 939 and queries in Kibana are working.

$ go test -v .
=== RUN   TestSystem
--- PASS: TestSystem (0.00s)
=== RUN   TestTemplate
=== RUN   TestTemplate/default_field_length
--- PASS: TestTemplate (0.10s)
    --- PASS: TestTemplate/default_field_length (0.10s)
        template.go:83: Filebeat template has 939 fields in default_field.
PASS

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Glad to have the main_test.go now for all Beats.

Can't approve the zeek changes, would be good to have someone from the team doing this.

auditbeat/main_test.go Show resolved Hide resolved
libbeat/tests/system/template/template.go Show resolved Hide resolved
if !ok {
t.Fatalf("settings.index.query.default_field value has an unexpected type: %T", v)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about also executing a elasticsearch query here in addition to catch potential other errors. Say for example ES decides to lower it to 512?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would need to be added somewhere else IMO. I put the test here so that it could be executed quickly as a unit test and not have an external dependency on ES.

This could be put into system test. Each beat should probably call setup to install its template and ILM policy, index a document, then perform a query.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this would fit well here: https://github.com/elastic/beats/blob/master/filebeat/tests/system/test_base.py#L30 Perhaps someone in the team from @urso could follow up with this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewkroh @ruflin I've created #14352 to make a change in libbeat to be more proactive to detect that error.

Copy link
Contributor

@ph ph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, @ruflin conerning zeek changes its @elastic/observability correct?

@ruflin
Copy link
Member

ruflin commented Oct 31, 2019

@ph I think that falls under SIEM.

switch field.Type {
case "", "keyword", "text":
addToDefaultFields(&field)
if field.DefaultField == nil || *field.DefaultField {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that the nil check above means a DefaultField==nil behaves as an implicit default to true. It's my understanding that the check in line 59 prevents this from ever being nil, but still looks dangerous to me. I would turn the nil check into an error.

This really doesn't matter much unless we expect that some day the default for 'default_field' will be false, so feel free to ignore.

@andrewkroh andrewkroh merged commit 9f21b96 into elastic:master Oct 31, 2019
andrewkroh added a commit to andrewkroh/beats that referenced this pull request Nov 22, 2019
* Add default_field option to fields.yml

The number of fields in the Elasticsearch index template's `settings.index.query.default_field` option has grown over time, and is now greater than 1024 in Filebeat (Elastic licensed version). This causes queries to Elasticsearch to fail when a list of fields is not specified because there is a default limit of 1024 in Elasticsearch.

This adds a new setting to fields.yml called `default_field` whose value can be true/false (defaults to true). When true the text/keyword fields are added to the `default_field` list (as was the behavior before this change). And when set to false the field is omitted from the default_field list.

This adds a test for every beat to check if the default_field list contains more than 1000 fields. The limit is a little less than 1024 because `fields.*` is in the default_field list already and at query time that wildcard will be expanded and count toward the limit.

Fixes elastic#14262

* Exclude new zeek datasets from default_field list

(cherry picked from commit 9f21b96)
@andrewkroh andrewkroh added v7.6.0 and removed needs_backport PR is waiting to be backported to other branches. labels Nov 22, 2019
jorgemarey pushed a commit to jorgemarey/beats that referenced this pull request Jun 8, 2020
* Add default_field option to fields.yml

The number of fields in the Elasticsearch index template's `settings.index.query.default_field` option has grown over time, and is now greater than 1024 in Filebeat (Elastic licensed version). This causes queries to Elasticsearch to fail when a list of fields is not specified because there is a default limit of 1024 in Elasticsearch.

This adds a new setting to fields.yml called `default_field` whose value can be true/false (defaults to true). When true the text/keyword fields are added to the `default_field` list (as was the behavior before this change). And when set to false the field is omitted from the default_field list.

This adds a test for every beat to check if the default_field list contains more than 1000 fields. The limit is a little less than 1024 because `fields.*` is in the default_field list already and at query time that wildcard will be expanded and count toward the limit.

Fixes elastic#14262

* Exclude new zeek datasets from default_field list
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too many fields in template's default_fields
5 participants