packages/windows/data_stream/powershell_operations: don't split tokens on hyphen #1931
Conversation
Force-pushed from a59d507 to 78c1cf6.
@@ -105,6 +105,10 @@
      example: "50d2dbda-7361-4926-a94d-d9eadfdb43fa"
  - name: script_block_text
    type: text
    analyzer:
I guess we don't support analyzers defined in fields files:
kibana_1 | {"type":"log","@timestamp":"2021-10-18T00:00:08+00:00","tags":["error","plugins","fleet"],"pid":1228,"message":"Error: Error installing windows 1.2.4: illegal_argument_exception: [illegal_argument_exception] Reason: composable template [logs-windows.powershell_operational] template after composition with component templates [logs-windows.powershell_operational@custom, .fleet_component_template-1] is invalid\n at ensureInstalledPackage (/usr/share/kibana/x-pack/plugins/fleet/server/services/epm/packages/install.js:193:11)\n at runMicrotasks (<anonymous>)\n at processTicksAndRejections (internal/process/task_queues.js:95:5)\n at async Promise.all (index 0)\n at PackagePolicyService.create (/usr/share/kibana/x-pack/plugins/fleet/server/services/package_policy.js:133:33)\n at createPackagePolicyHandler (/usr/share/kibana/x-pack/plugins/fleet/server/routes/package_policy/handlers.js:109:27)\n at Router.handle (/usr/share/kibana/src/core/server/http/router/router.js:163:30)\n at handler (/usr/share/kibana/src/core/server/http/router/router.js:124:50)\n at exports.Manager.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)\n at Object.internals.handler (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)\n at exports.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)\n at Request._lifecycle (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:370:32)\n at Request._execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:279:9)"}
Did you try using the ingest pipeline to approach this problem?
I haven't, but I think a keyword analyzer here, combined with lowercasing and splitting on this pattern in the ingest pipeline, should work. Nope.
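For reference, a rough sketch of what that ingest-pipeline alternative could look like (the pipeline name, target fields, and ignore_missing settings are illustrative and not taken from this PR; the field path assumes the integration's powershell.file.script_block_text field):

PUT _ingest/pipeline/powershell-script-block-tokens
{
  "description": "Illustrative only: lowercase the script block text and split it on non-word characters other than hyphen",
  "processors": [
    {
      "lowercase": {
        "field": "powershell.file.script_block_text",
        "target_field": "powershell.file.script_block_text_lower",
        "ignore_missing": true
      }
    },
    {
      "split": {
        "field": "powershell.file.script_block_text_lower",
        "target_field": "powershell.file.script_block_tokens",
        "separator": "[\\W&&[^-]]+",
        "ignore_missing": true
      }
    }
  ]
}

This only materializes a token array in the document; it does not change how the text field itself is analyzed for full-text search, which is presumably why the analyzer route was pursued instead.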
@ruflin Do you think we need to support analyzers or is there any workaround available?
Analyzers do appear to work, given that we allow settings to be declared in the data stream manifest.yml (source: https://github.com/elastic/package-spec/blob/a0687c0dc7a9da3fc540bdc7c5df2d7d84ae6713/versions/1/data_stream/manifest.spec.yml#L174-L176).
I tested this and it passed the system test.
diff --git a/packages/windows/data_stream/powershell_operational/fields/fields.yml b/packages/windows/data_stream/powershell_operational/fields/fields.yml
index 2049ba44..ae35dff3 100644
--- a/packages/windows/data_stream/powershell_operational/fields/fields.yml
+++ b/packages/windows/data_stream/powershell_operational/fields/fields.yml
@@ -105,10 +105,7 @@
example: "50d2dbda-7361-4926-a94d-d9eadfdb43fa"
- name: script_block_text
type: text
- analyzer:
- powershell:
- type: pattern
- pattern: "[\\W&&[^-]]+"
+ analyzer: powershell_script_analyzer
description: >
Text of the executed script block.
diff --git a/packages/windows/data_stream/powershell_operational/manifest.yml b/packages/windows/data_stream/powershell_operational/manifest.yml
index 08b887b3..8eca400c 100644
--- a/packages/windows/data_stream/powershell_operational/manifest.yml
+++ b/packages/windows/data_stream/powershell_operational/manifest.yml
@@ -1,5 +1,13 @@
type: logs
title: Windows Powershell/Operational logs
+elasticsearch:
+ index_template:
+ settings:
+ analysis:
+ analyzer:
+ powershell_script_analyzer:
+ type: pattern
+ pattern: '[\W&&[^-]]+'
streams:
- input: winlog
template_path: winlog.yml.hbs
The logs-windows.powershell_operational@settings component template is created as expected with the analysis settings from the manifest.
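A sketch of roughly what that component template contains, assuming the manifest's index_template.settings block is carried over as-is (the exact response was not captured in this thread):

GET _component_template/logs-windows.powershell_operational@settings

{
  "component_templates": [
    {
      "name": "logs-windows.powershell_operational@settings",
      "component_template": {
        "template": {
          "settings": {
            "index": {
              "analysis": {
                "analyzer": {
                  "powershell_script_analyzer": {
                    "type": "pattern",
                    "pattern": "[\\W&&[^-]]+"
                  }
                }
              }
            }
          }
        }
      }
    }
  ]
}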
Spec issue created: elastic/package-spec#238
Force-pushed from 78c1cf6 to 3bcf0f7.
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
I think we would want the same analyzer applied to these as well. Right?
Force-pushed from 3bcf0f7 to d4fe240.
The other question is whether the search analyzer should also be provided.
I think it does need a search analyzer. Compare the tokens produced by the custom pattern analyzer with what the standard analyzer produces for Invoke-WebRequest:
{
"tokens" : [
{
"token" : "invoke-webrequest",
"start_offset" : 1,
"end_offset" : 18,
"type" : "word",
"position" : 0
},
{
"token" : "-uri",
"start_offset" : 19,
"end_offset" : 23,
"type" : "word",
"position" : 1
},
{
"token" : "https",
"start_offset" : 25,
"end_offset" : 30,
"type" : "word",
"position" : 2
},
{
"token" : "aka",
"start_offset" : 33,
"end_offset" : 36,
"type" : "word",
"position" : 3
},
{
"token" : "ms",
"start_offset" : 37,
"end_offset" : 39,
"type" : "word",
"position" : 4
},
{
"token" : "pscore6-docs",
"start_offset" : 40,
"end_offset" : 52,
"type" : "word",
"position" : 5
},
{
"token" : "links",
"start_offset" : 55,
"end_offset" : 60,
"type" : "word",
"position" : 6
},
{
"token" : "href",
"start_offset" : 61,
"end_offset" : 65,
"type" : "word",
"position" : 7
}
]
}
{
"tokens" : [
{
"token" : "invoke",
"start_offset" : 0,
"end_offset" : 6,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "webrequest",
"start_offset" : 7,
"end_offset" : 17,
"type" : "<ALPHANUM>",
"position" : 1
}
]
}
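The two token listings above look like _analyze output; requests along these lines reproduce the comparison (the sample text is illustrative rather than the exact input used above, and the inline pattern tokenizer plus lowercase filter approximates the pattern analyzer defined in the manifest):

POST _analyze
{
  "tokenizer": {
    "type": "pattern",
    "pattern": "[\\W&&[^-]]+"
  },
  "filter": [ "lowercase" ],
  "text": "Invoke-WebRequest -Uri https://aka.ms/pscore6-docs"
}

POST _analyze
{
  "analyzer": "standard",
  "text": "Invoke-WebRequest"
}

The first request keeps invoke-webrequest, -uri, and pscore6-docs together as single tokens; the second splits the same cmdlet name into invoke and webrequest, which is the mismatch the search-analyzer question is about.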
Force-pushed from d4fe240 to 2d4d50d.
Force-pushed from 2d4d50d to 64ed53a.
…s on hyphen Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
Force-pushed from 64ed53a to d8cd9c1.
…s on hyphen (elastic#1931) Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
What does this PR do?
The change replaces the simple tokenizer with a custom pattern analyzer that splits on word boundaries that do not include the hyphen, so hyphenated cmdlet names such as Invoke-WebRequest are kept as single tokens.
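As a quick illustration of the user-facing effect (the index pattern and field path are assumed from the integration, and the same analyzer is applied at search time by default when no separate search_analyzer is configured), a match query for a hyphenated cmdlet name now has to match it as a whole token:

GET logs-windows.powershell_operational-*/_search
{
  "query": {
    "match": {
      "powershell.file.script_block_text": "Invoke-WebRequest"
    }
  }
}

With the previous default analysis the query text was split into invoke and webrequest, so the query could also match script blocks that merely contained those fragments separately.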
Checklist

- Updated the changelog.yml file.
- Updated the manifest.yml file to point to the latest Elastic stack release (e.g. ^7.13.0).

Author's Checklist
How to test this PR locally
Related issues
Screenshots