extract device type from user agent info #69322
Pinging @elastic/es-core-features (Team:Core/Features)
@shahzad31, is this related to #65057? I think we'd be happy to have the ability to extract device types, but we'd want it to be usable across multiple use cases and stable so that device types wouldn't change as the implementation evolved.
Yes, it's related to #65057, and the goal is definitely to have a stable implementation. It might miss some cases, which it will mark as "Others", or miss newly added UA strings, just like any other UA parser. But the current implementation makes sure it aligns with the UA parser we are using to extract other info, so results won't change over time for the current parser.
This relates to elastic/uptime#296
Performed a simple performance comparison of this branch vs. master.
I don't see any performance difference: in both branches the recorded time varies between 550ms and 700ms. I ran this comparison about 20 times, ~10 times on each branch. To run the test I used the IntelliJ IDEA IDE.
Thanks, @shahzad31. I think that test would be significantly affected by the test setup and teardown operations. I ran several tests of my own that did more to isolate processor execution time and there was a pretty consistent performance cost of ~10% for extracting the device type. I think that's small enough that we don't need an option to disable it. I'll start on the code review for this shortly.
@shahzad31, I made an initial review pass and have some suggestions noted below. Once those are addressed, I'll do at least one more pass on it.
...les/ingest-user-agent/src/main/java/org/elasticsearch/ingest/useragent/DeviceTypeParser.java
modules/ingest-user-agent/src/main/java/org/elasticsearch/ingest/useragent/UserAgentParser.java
...est-user-agent/src/test/java/org/elasticsearch/ingest/useragent/UserAgentProcessorTests.java
...ngest-user-agent/src/main/java/org/elasticsearch/ingest/useragent/IngestUserAgentPlugin.java
Fyi, you can see the reason for the |
@shahzad31, thanks for your work and iteration here. It looks good now so I've merged it in and will get it backported for the next 7.x release.
elastic/elasticsearch#69322 added support for extracting device types to the user_agent ingest processors. Update approvals to match.
* tests/system: adapt to new API response
  In elastic/kibana#95146 the response structure for listing APM agent central config changed. Update system tests to match.
* tests/system: add user_agent.device.type
  elastic/elasticsearch#69322 added support for extracting device types to the user_agent ingest processors. Update approvals to match.
* tests/system: adapt to new API response
  In elastic/kibana#95146 the response structure for listing APM agent central config changed. Update system tests to match.
* tests/system: add user_agent.device.type
  elastic/elasticsearch#69322 added support for extracting device types to the user_agent ingest processors. Update approvals to match.
  (cherry picked from commit 0e09aa6)
Add field definition for new user agent field added in elastic/elasticsearch#69322. Regenerate Filebeat test files with this new field. Should fix Filebeat builds.
Add field definition for new user agent field added in elastic/elasticsearch#69322 (cherry picked from commit 6454736)
* tests/system: fix system tests (#5037)
* tests/system: adapt to new API response
  In elastic/kibana#95146 the response structure for listing APM agent central config changed. Update system tests to match.
* tests/system: add user_agent.device.type
  elastic/elasticsearch#69322 added support for extracting device types to the user_agent ingest processors. Update approvals to match.
  (cherry picked from commit 0e09aa6)
* user_agent.device.type isn't in 7.x yet
* make update
* systemtest: revert approvals changes
Co-authored-by: Andrew Wilkins <axw@elastic.co>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Adds device type extraction to the user agent processor.
Matching Algorithm
The process is pretty simple: based on the OS and browser extracted via the UA parser lib, this PR defines a few simple patterns, and the device type is matched against them.
For example, one pattern used to match desktop devices is this:
- regex: '^(Windows$|Windows NT$|Mac OS X|Linux$|Chrome OS|Fedora$|Ubuntu$)'
So if the extracted OS name is one of these, there is a high chance that the device is a desktop. The same goes for mobile OSes. Along with this, it tries to match browser names as well and correlates both results.
For bots, it looks for any of the following words, at any position:
- regex: 'Bot|bot|spider|Spider|Crawler|crawler|AppEngine-Google'
The same goes for tablets, etc.
Example:
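The matching order described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `DeviceTypeParser`: the class name, the `classify` method, the `MOBILE_OS` pattern, and the returned labels are all assumptions made for the example. Only the desktop and bot regexes are quoted from this description.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual DeviceTypeParser from this PR.
final class DeviceTypeSketch {
    // Desktop OS pattern quoted from the PR description.
    private static final Pattern DESKTOP_OS =
        Pattern.compile("^(Windows$|Windows NT$|Mac OS X|Linux$|Chrome OS|Fedora$|Ubuntu$)");
    // Bot pattern quoted from the PR description; matched anywhere in the name.
    private static final Pattern BOT =
        Pattern.compile("Bot|bot|spider|Spider|Crawler|crawler|AppEngine-Google");
    // Assumption: a stand-in mobile pattern, invented for this example only.
    private static final Pattern MOBILE_OS = Pattern.compile("^(Android$|iOS$)");

    // Labels ("Robot", "Desktop", "Phone", "Other") are illustrative placeholders.
    static String classify(String osName, String browserName) {
        String os = osName == null ? "" : osName;
        String browser = browserName == null ? "" : browserName;
        // Bots first: a crawler may report a desktop OS.
        if (BOT.matcher(os).find() || BOT.matcher(browser).find()) {
            return "Robot";
        }
        if (DESKTOP_OS.matcher(os).find()) {
            return "Desktop";
        }
        if (MOBILE_OS.matcher(os).find()) {
            return "Phone";
        }
        // Anything unmatched falls into the catch-all bucket.
        return "Other";
    }

    public static void main(String[] args) {
        System.out.println(classify("Mac OS X", "Safari"));
        System.out.println(classify("Other", "Googlebot"));
        System.out.println(classify("Windows Phone", "IE Mobile"));
    }
}
```

Checking the bot pattern before the OS patterns is a design choice of this sketch, so that a crawler reporting a desktop OS is not counted as a desktop.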
In dev tools:
result:
Real user data analysis
Ran an analysis on data from elastic.co collected by the rum-agent, which is deployed on the observability clusters.
There were 60019 unique user agent strings in the data. I extracted those strings and pushed them into ES using this PR's user agent ingest pipeline,
and this PR was able to match more than 99% of them successfully. Here is the analysis in Lens:
Note: This doesn't represent traffic; it represents the ratio of extracted categories among unique UA strings.
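To make the distinction in the note concrete, here is a small sketch of how a per-category ratio over unique UA strings could be computed. It is not the Lens analysis itself; the class name, method name, and the classifier passed in are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch; names are invented for this example.
final class UniqueUaShare {
    // Returns the share of each category among the distinct UA strings.
    static Map<String, Double> shareByCategory(List<String> userAgents,
                                               Function<String, String> classifier) {
        // Deduplicate first, so a UA string seen many times counts once.
        Set<String> unique = new LinkedHashSet<>(userAgents);
        Map<String, Integer> counts = new TreeMap<>();
        for (String ua : unique) {
            counts.merge(classifier.apply(ua), 1, Integer::sum);
        }
        Map<String, Double> shares = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            shares.put(e.getKey(), e.getValue() / (double) unique.size());
        }
        return shares;
    }

    public static void main(String[] args) {
        // The same browser string appears three times but is counted once.
        List<String> seen = Arrays.asList("ua-a", "ua-a", "ua-a", "some-bot");
        System.out.println(shareByCategory(seen, ua -> ua.contains("bot") ? "Robot" : "Other"));
    }
}
```

Because duplicates are dropped before counting, a UA string responsible for most of the traffic weighs the same as one seen a single time.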
Testing
Tested by building it via Kibana:
yarn es source
and tested via dev tools, as described above in the example and screenshot.