[eventLog] prevent log writing when initialization fails #71339
resolves elastic#68309
Previously, if initialization of the Elasticsearch resources failed, the event logger would still try to write events. This is somewhat of a catastrophic failure: the logger typically writes to the alias name, but since no alias exists, a new index is created with the name of the alias, making it impossible to initialize successfully later until that index is deleted. The core initialization calls already returned success indicators, so this PR just responds to those and prevents the logger from writing to the index if initialization failed.
This looks good overall, but I do wonder if we should be throwing rather than making it a NOOP.
@@ -17,7 +17,7 @@ const createContextMock = () => {
     logger: loggingSystemMock.createLogger(),
     esNames: namesMock.create(),
     initialize: jest.fn(),
-    waitTillReady: jest.fn(),
+    waitTillReady: jest.fn(async () => Promise.resolve(true)),
micro nit.
Isn't this the same as... ?
-    waitTillReady: jest.fn(async () => Promise.resolve(true)),
+    waitTillReady: jest.fn(async () => true),
If so, it just seems a little easier to understand and maintain.
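A quick way to check the equivalence outside of Jest, as a minimal sketch: an `async` function always returns a promise, and returning an already-resolved promise from it just adopts that promise's value, so both forms resolve to `true`. The names `longForm`, `shortForm`, and `compareForms` are made up for illustration.

```typescript
// Hypothetical stand-ins for the two mock implementations being compared.
const longForm = async (): Promise<boolean> => Promise.resolve(true);
const shortForm = async (): Promise<boolean> => true;

// Await both forms and return their resolved values side by side.
async function compareForms(): Promise<boolean[]> {
  return Promise.all([longForm(), shortForm()]);
}

compareForms().then((results) => console.log(results));
```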
I should try putting that back to your suggestion - I had some issues trying to get it to fire, so tried "the long way" so I could debug it. thx!
const success = await esContext.waitTillReady();
if (!success) {
  esContext.logger.debug(`event log did not initialize correctly, event not written`);
  return;
}
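For readers without the surrounding file, here is a self-contained sketch of the same guard. The `SketchContext` and `SketchLogger` interfaces, the `writeIfReady` function, and the `index` callback are assumptions made for illustration; they are not the plugin's real types. The only behaviour taken from the diff is that `waitTillReady()` resolves to a success flag and that a failed initialization logs a debug message and skips the write.

```typescript
// Minimal stand-ins; these are NOT the real event_log types.
interface SketchLogger {
  debug(message: string): void;
}

interface SketchContext {
  logger: SketchLogger;
  // Resolves to true only if the Elasticsearch resources were set up.
  waitTillReady(): Promise<boolean>;
}

// Returns true if the document was handed to `index`, false if it was skipped.
async function writeIfReady(
  esContext: SketchContext,
  doc: unknown,
  index: (doc: unknown) => Promise<void>
): Promise<boolean> {
  const success = await esContext.waitTillReady();
  if (!success) {
    esContext.logger.debug('event log did not initialize correctly, event not written');
    return false;
  }
  await index(doc);
  return true;
}

// Usage: a context whose initialization failed never reaches the index call.
const debugMessages: string[] = [];
const indexedDocs: unknown[] = [];
const failedContext: SketchContext = {
  logger: { debug: (message) => { debugMessages.push(message); } },
  waitTillReady: async () => false,
};
const recordDoc = async (doc: unknown): Promise<void> => {
  indexedDocs.push(doc);
};

writeIfReady(failedContext, { message: 'test event' }, recordDoc).then((written) => {
  console.log(written, indexedDocs.length, debugMessages.length);
});
```

Because nothing is sent to the alias name in the failed case, no stray index can be created under that name.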
This makes me wonder - we will now be relying on the Event Log not just as a log, but as an operational data source for our product (alert instances over time, their values etc.).
Perhaps failing to index a log event should be a failing operation rather than a NOOP? 🤔
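To make the alternative concrete, a throwing variant might look roughly like the sketch below. This is not what the PR does; `EventLogNotInitializedError` and `writeOrThrow` are hypothetical names, and the readiness check is passed in as a plain function rather than taken from the real context object.

```typescript
// Hypothetical error type, not part of the event_log plugin.
class EventLogNotInitializedError extends Error {
  constructor() {
    super('event log did not initialize correctly, event not written');
    this.name = 'EventLogNotInitializedError';
  }
}

// Same readiness check as the NOOP guard, but the caller sees the failure.
async function writeOrThrow(
  waitTillReady: () => Promise<boolean>,
  write: () => Promise<void>
): Promise<void> {
  if (!(await waitTillReady())) {
    throw new EventLogNotInitializedError();
  }
  await write();
}

// Usage: the caller now has to decide how to handle a failed event write.
writeOrThrow(async () => false, async () => {}).catch((err: Error) => {
  console.log(`caller must handle: ${err.name}`);
});
```

The trade-off is that every caller of the event log would then need its own error handling, whereas the NOOP keeps logging failures from affecting the operation being logged.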
LGTM
from @gmmorris #71339 (review)
Not sure of the exact context here, but guessing you're referring to the top-level [...] Currently the [...]
also from Gidi #71339 (comment)
Fair question, especially if/when we become more dependent on event log data as a source of truth. But it would be a change in the basic assumptions the event log was built with. Worth an issue to discuss, I think. We might even want to force the event log initialization during start and have Kibana fail to start if the EL initialization fails, I guess as the most severe form of this. I guess I've been trying to keep in mind that EL provides a lot of useful information, but we need to be careful about being too dependent on it, especially since the user is in control of the lifetime of the data, by editing the ILM policy. It's also possible to soft-disable the indexing of events completely, via a config value in EL.
💚 Build Succeeded