Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Filebeat nginx.access to ECS #9081

Merged
merged 16 commits into from
Nov 27, 2018
Merged

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Nov 14, 2018

Caveat

  • This module assumes that the nginx log format is a $http_x_forwarded_for, which is either one IP or more. This means any log line that would contain a unix socket (one possibility, with nginx' default $remote_addr) is already not parsed successfully by this module. This field mapping to ECS does not introduce support for unix sockets.

Renames

  • nginx.access.remote_ip_list => network.forwarded_ip
  • nginx.access.remote_ip => source.ip
  • nginx.access.user_name => user.name
  • nginx.access.response_code => http.response.status_code
  • nginx.access.referrer => http.request.referrer
  • nginx.access.method => http.request.method
  • nginx.access.url => url.original
  • nginx.access.http_version => http.version
  • nginx.access.agent => user_agent.original
  • nginx.access.user_agent.* => user_agent.*
  • nginx.access.geoip.* => source.geo.*

TODO

  • Key 'http.request.referrer' found in event is not documented!
  • Update ECS-migration.yml file
  • Changelog
  • Output user_agent to ECS field names
  • Simplify the code, based on realization that doesn't support unix sockets in the remote address
  • Create field aliases
  • Fix test failure Error loading Elasticsearch template: could not load template. Elasticsearch returned: couldn't load template: couldn't load json. Error: 400 Bad Request: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Invalid [path] value [http.referrer] for field alias [nginx.access.referrer]: an alias must refer to an existing field in the mappings."}]

@ruflin ruflin mentioned this pull request Nov 14, 2018
@webmat webmat requested a review from ruflin November 14, 2018 20:24
@webmat webmat self-assigned this Nov 14, 2018
@webmat webmat added in progress Pull request is currently in progress. module Filebeat Filebeat ecs labels Nov 14, 2018
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add 2 things to the todo list:

  • Update Changelog
  • Update ECS-migration.yml file

filebeat/module/nginx/access/ingest/default.json Outdated Show resolved Hide resolved
filebeat/module/nginx/access/ingest/default.json Outdated Show resolved Hide resolved
filebeat/module/nginx/access/test/test.log-expected.json Outdated Show resolved Hide resolved
@webmat webmat removed the in progress Pull request is currently in progress. label Nov 20, 2018
@webmat
Copy link
Contributor Author

webmat commented Nov 20, 2018

@ruflin Ready for review

@webmat webmat added the review label Nov 20, 2018
@webmat webmat changed the title [WIP] Convert Filebeat nginx.access to ECS Convert Filebeat nginx.access to ECS Nov 20, 2018
@ruflin
Copy link
Member

ruflin commented Nov 21, 2018

I looked into the user_agent problem. The issue is related to the mapping. The user_agent processor tries to create the field "user_agent.os": "Mac OS X 10.12", . But user_agent.os is actually defined as an object which I think is correct. What we need to do is after the user_agent processor run, we need to modify / rename the os field. This is probably the case for all other places where we use the user_agent processor.

@webmat
Copy link
Contributor Author

webmat commented Nov 21, 2018

Ok, thanks for figuring this out. I'll go around your feedback on all other PRs. Then I think I'll fix user_agent in each of them before merging. I'll also create the aliases in each module's field defs in each PR.

@webmat webmat force-pushed the ecs-nginx-access branch 2 times, most recently from 35bc4ed to c971d88 Compare November 23, 2018 16:38
@webmat
Copy link
Contributor Author

webmat commented Nov 23, 2018

@ruflin Ready for final review.

New spin on nginx' IP/socket duality. Since this module groks for an IP list, it means that it doesn't currently support log events with a Unix socket as the source. Fixing this is out of scope for this ECS translation, so I've added a note to #9208 to that effect.

We have exactly one custom field left: nginx.access.body_sent.bytes.

- name: remote_ip
type: keyword
description: >
Client IP address. The first public IP address from the `remote_ip_list` array. If no public IP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if at some stage we regret removing these descriptions. Ok for now, it's also in git.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can alias fields have a description? If so, I don't mind restoring the definitions.

There's a bunch of work needing to happen there anyway for doc generation, right? At this time, these aliases get the module's field name replaced with the ECS field name instead.

So if you want to keep the old field descriptions there, we'll need to have a doc generator that

  • lists the old name
  • mentions that it's now an alias to the new field, with a mention of the new field (even a link to its definition)
  • if a description is provided, provide the description of the deprecated field

@ruflin
Copy link
Member

ruflin commented Nov 23, 2018

It could be that CI failure is related. Let's see what Jenkins thinks.

@webmat
Copy link
Contributor Author

webmat commented Nov 23, 2018

Re-kicked both failing Travis jobs. They both died after a 10m timeout.

@webmat
Copy link
Contributor Author

webmat commented Nov 23, 2018

Jenkins failures only on Windows, for these 3 jobs (test report):

  • test_crawler.Test.test_file_disappear_appear
  • test_harvester.Test.test_close_removed
  • test_registrar.Test.test_clean_removed_with_clean_inactive

This has nothing to do with the nginx module. Probably flaky tests?

@ruflin
Copy link
Member

ruflin commented Nov 23, 2018

@webmat Yes the 3 you mention above are flaky tests and I remember @ph wanted to skip them, so potentially already skipped in master (need to catch on PR's from today).

For the failing travis jobs: The part I worry is that it's both in Filebeat. Often when module tests time out it means one is really slow or hanging. But if Jenkins goes green, all should be good. Probably worth restarting the travis ones.

@webmat
Copy link
Contributor Author

webmat commented Nov 23, 2018

One of the Travis jobs is pretty far along in running the tests, this time, while the other one hangs really early: https://travis-ci.org/elastic/beats/jobs/458876579

@ph
Copy link
Contributor

ph commented Nov 23, 2018

@ruflin @jsoriano beat me to it in #9217 not merged yet. I added the issue to the our sync next week.

Mathieu Martin added 12 commits November 26, 2018 13:11
- nginx.access.referrer => http.request.referrer
- nginx.access.method => http.request.method
- nginx.access.url => url.original
- nginx.access.http_version => http.version
- Copy remote_ip to `source.ip` if it's an IP
- Copy remote_ip to `source.domain` otherwise
- Perform geoIP on `source.ip` when present
- Output geo information to `source.geo`
  - nginx.access.geoip.* => source.geo.*
- `http.request.referrer`
- `url.original`

This is strange. Not sure how I pushed this PR with these failures unsolved...
Adding a dumb `set` for source.ip, otherwise in the painless context, `source` itself doesn't exist yet. And adding this dictionary in painless sounded more painful than using `set`.
@webmat
Copy link
Contributor Author

webmat commented Nov 26, 2018

@ruflin Figured out why stuff was always hanging: was incorrectly aliasing to http.referrer, rather than http.request.referrer.

Looks like all tests will be green here as well. Will merge tomorrow unless something comes up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants