Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-introduce a simplified version of user_agent. #240

Merged
merged 4 commits into from
Dec 7, 2018

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Dec 5, 2018

This is an implementation of my proposal in #235.

You'll notice the version breakdown fields are gone for now. I get the feeling that people either don't care about version breakdowns, or they want to see all version fields they see broken down (not just the UA version). So I'm holding off on adding them for now, unless someone feels strongly about this. This will let us discuss ideas about generalizing the concept of breaking down arbitrary version fields in a separate discussion.

In line with the discussion in #235, for now this is a simple field set, but may grow to cover more concepts:

  • agent type (desktop/mobile browser, email client, feed reader, robot)
  • client architecture
  • rendering engine details
  • producer of the user agent (Chrome, Mozilla)
  • screen dimensions

This kind of reverts #172 and closes #235.

@webmat webmat self-assigned this Dec 5, 2018
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on not adding all the versions but have them in one string as I think Elasticsearch with the prefix query can do most necessary queries already.

So this output of the user_agent is identical to what the user_agent processor does besides the version part? https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-user-agent.html

description: >
Version of the user agent.

- name: device.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this correlate with #238 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It correlates in absolutely no way. Here user_agent.device.name would contain things like "iPad", "iPhone"...

@webmat
Copy link
Contributor Author

webmat commented Dec 6, 2018

@ruflin No, it does not match what IN's user_agent parser gives us. Here's the format it outputs, and the ECS format, side by side:

IN's user_agent result:

"user_agent" : {
  "name" : "Mobile Safari",
  "major" : "12",
  "minor" : "0",
  "device" : "iPhone",
  "os": "iOS 12.1",
  "os_minor" : "1",
  "os_major" : "12",
  "os_name" : "iOS",
}

If we include ECS' reusable os field in, the ECS user_agent from this proposal looks like this:

"user_agent" : {
  "name" : "Mobile Safari",
  "version": "12.0",
  "device": {
    "name": "iPhone",
    "version": "???"
  },
  "os": {
    "name": "iOS",
    "version": "12.1"
  },
  "original": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0 Mobile/15E148 Safari/604.1"
}

Takeaways:

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal LGTM.

@ruflin
Copy link
Member

ruflin commented Dec 7, 2018

@webmat As a follow up we should open an Issue in the Github repo so the processor gets adjusted to ECS format and potentially relies by default on user_agent.original for the processing. For backward compatiblity the processor could have an ecs option.

Copy link
Contributor

@MikePaquette MikePaquette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

type: group
fields:

- name: original
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmat I still think we should turn indexing off here: #138

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind turning it off, actually. But didn't we get a request in Beats for precisely the reverse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course people can override that, so not a big issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that request was for having it as a text field? Having is a keyword would still not make ti searchable. I wonder now if not index if a user can still add a .text multi-field which is indexed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's possible, no. But if that person wants indexing, they can totally re-enable keyword indexing on the canonical field, and add a multi-field for text indexing.

If ECS makes it non-indexed, it means that no application that will depend on the official ECS spec can ever perform searches in that field. So whatever users do with it (re-enabling indexing) will not conflict. Because ECS will say it should not be indexed, so the applications can't depend on this being a keyword or a text field.

What I want to avoid at all cost is to make this canonical field text. If we do text here, it must be a multi-field.

@webmat
Copy link
Contributor Author

webmat commented Dec 7, 2018

user_agent.original is now marked as non indexed. This gives us absolute freedom to change it in the future, if we decide to do so. Whereas going from indexed to not is a breaking change.

@webmat webmat merged commit 23ce673 into elastic:master Dec 7, 2018
@webmat webmat deleted the user-agent-returns branch December 7, 2018 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bringing back and cleaning up user_agent
3 participants