[Security Solution] Historical rules packages PoC #145851
…ects (#148141)

**Resolves: #147695, #148174**
**Related to: #145851, #137420**

## Summary

This PR improves the stability of the Fleet packages installation process with many saved objects.

1. Changed mappings of the `installed_kibana` and `package_assets` fields from `nested` to `object` with `enabled: false`. Values of those fields were retrieved from `_source`, and no queries or aggregations were performed against them. So the mappings were unused, while during the installation of packages containing more than 10,000 saved objects, an error was thrown due to the nested field limitations:

   ```
   Error installing security_detection_engine 8.4.1: The number of nested documents has exceeded the allowed limit of [10000]. This limit can be set by changing the [index.mapping.nested_objects.limit] index level setting.
   ```

2. Improved the deletion of previous package assets by switching from sending multiple `savedObjectsClient.delete` requests in parallel to a single `savedObjectsClient.bulkDelete` request. Multiple parallel requests were causing the Elasticsearch cluster to stop responding for some time; see [this ticket](#147695) for more info.

   **Before**
   ![Screenshot 2022-12-28 at 11 09 35](https://user-images.githubusercontent.com/1938181/209816219-ade6dd0a-0d56-4acc-929e-b88571f0fe81.png)

   **After**
   ![Screenshot 2022-12-28 at 13 56 44](https://user-images.githubusercontent.com/1938181/209816209-16c69922-4ae2-4589-9aa4-5a28050037f4.png)
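The switch from many parallel `delete` calls to one `bulkDelete` call could look roughly like the sketch below. It is not the Fleet implementation: `DeleteClient`, `AssetRef`, `deleteAssetsInParallel`, `deleteAssetsInBulk` and `createCountingClient` are hypothetical names, and the client interface is a simplified stand-in that models only the two methods named in the commit message.

```typescript
// Hypothetical, simplified stand-in for the saved objects client.
// Only the two methods named in the commit message are modelled.
interface AssetRef {
  type: string;
  id: string;
}

interface DeleteClient {
  delete(type: string, id: string): Promise<unknown>;
  bulkDelete(objects: AssetRef[]): Promise<unknown>;
}

// Before: one request per asset, all in flight at the same time.
async function deleteAssetsInParallel(client: DeleteClient, assets: AssetRef[]): Promise<void> {
  await Promise.all(assets.map(({ type, id }) => client.delete(type, id)));
}

// After: a single request that carries every asset reference.
async function deleteAssetsInBulk(client: DeleteClient, assets: AssetRef[]): Promise<void> {
  if (assets.length === 0) return; // nothing to delete, skip the round trip
  await client.bulkDelete(assets);
}

// Fake client that only counts calls, to make the difference visible.
function createCountingClient() {
  const counts = { delete: 0, bulkDelete: 0 };
  const client: DeleteClient = {
    async delete() {
      counts.delete += 1;
    },
    async bulkDelete() {
      counts.bulkDelete += 1;
    },
  };
  return { client, counts };
}
```

With the counting client, deleting N assets registers N `delete` calls in the parallel variant and one `bulkDelete` call in the bulk variant, which is the reduction in request count the commit message describes.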
Closing this PR as both data structures were thoroughly tested.
Summary
Related to: #137420
I've run a number of tests measuring the performance of the two versions of historical rules packages:
With a relatively small total number of historical versions (< 10000 total versions, or 10 versions per rule for 1000 rules), the composite structure outperforms the flat one when installing the rules package:
The difference is visible but not significant at those numbers. However, things become ugly for the `flat` structure when we increase the total number of versions to 15-20k.
Maximum number of items in a nested field
The first problem that becomes visible is related to the maximum number of items in a nested field. It has already been discussed here and could be easily overcome by adding `enabled: false` to the mappings for the `installed_kibana` field:
kibana/x-pack/plugins/fleet/server/saved_objects/index.ts
Lines 261 to 267 in ab8dd04
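As a rough illustration of why the flat package hits this limit and why `enabled: false` avoids it, here is a minimal sketch. It does not reproduce the referenced lines of `index.ts`: the `FieldMapping` type, the `properties` shown, and the `exceedsNestedLimit` helper are hypothetical, and the sketch assumes that each installed rule version adds one entry to `installed_kibana`. The limit of 10000 is the value quoted in the installation error.

```typescript
// Hypothetical, simplified mapping shapes; not the actual contents of
// x-pack/plugins/fleet/server/saved_objects/index.ts.
type FieldMapping =
  | { type: 'nested'; properties: Record<string, { type: string }> }
  | { type: 'object'; enabled: false };

// Nested mapping: every array entry is indexed as a separate nested document
// and counts against index.mapping.nested_objects.limit.
const installedKibanaNested: FieldMapping = {
  type: 'nested',
  properties: { id: { type: 'keyword' }, type: { type: 'keyword' } },
};

// Object mapping with enabled: false: the value is kept in _source only and
// is not parsed or indexed, so no nested documents are created.
const installedKibanaDisabled: FieldMapping = { type: 'object', enabled: false };

// Limit quoted in the installation error message.
const NESTED_OBJECTS_LIMIT = 10000;

// Assumption: one installed_kibana entry per rule version.
function exceedsNestedLimit(
  mapping: FieldMapping,
  rules: number,
  versionsPerRule: number
): boolean {
  return mapping.type === 'nested' && rules * versionsPerRule > NESTED_OBJECTS_LIMIT;
}
```

Under these assumptions, 1000 rules with 10 versions each stay at the limit, while 1000 rules with 20 versions each exceed it for the nested mapping only.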
Refresh ran out of slots and forced a refresh
After fixing the above error, the rules package becomes installable, but its installation starts to fail randomly, making Kibana unresponsive for some time. The problem seems to come from the Elasticsearch level. Console logs show dozens of warnings similar to this:
During that time, all requests to Kibana fail with
After some time, however, the Elasticsearch cluster recovers by itself.
Created a ticket for the Fleet team: #147695
All shards failed
Another common error associated with the flat package installation is the following shard failure. It occurs randomly and is not always easy to reproduce:
After this failure, Elasticsearch doesn't recover by itself, and Kibana responds with:
Response timeout
Sometimes the flat package installation simply fails with a timeout.
Testing instructions
For reference: https://www.elastic.co/guide/en/integrations-developer/current/build-a-new-integration.html
curl -XPOST http://elastic:changeme@localhost:5601/kbn/internal/detection_engine/rules/prebuilt/_install_test_assets -d '{"num_versions_per_rule":10}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
cd fleet-packages/detection-rules-flat && elastic-package build --skip-validation
cd fleet-packages/detection-rules-composite && elastic-package build --skip-validation
elastic-package stack up --services package-registry
docker cp <container id>:/etc/ssl/package-registry/ca-cert.pem fleet-packages
kibana.dev.yml: xpack.fleet.registryUrl: https://localhost:8080
NODE_EXTRA_CA_CERTS=./fleet-packages/ca-cert.pem yarn start
Open http://localhost:5601/kbn/app/integrations/browse. You should find two detection rules packages there: Prebuilt detection rules (composite) and Prebuilt detection rules (flat)
curl http://elastic:changeme@localhost:5601/kbn/api/fleet/epm/packages/security_rules_flat/8.3.2 -d '{"force":true}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
curl http://elastic:changeme@localhost:5601/kbn/api/fleet/epm/packages/security_rules_composite/8.3.2 -d '{"force":true}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
Conclusion
According to this PoC, the composite rule structure looks more stable. However, as outlined in another PoC, the flat structure provides more benefits when it comes to business logic implementation and overall looks more future-proof. My suggestion would be to fix the current performance issues that are associated with the flat structure and use it as a foundation for the rule customization work.