[8.7] Description for host.name updated (FQDN issue) #2122
Conversation
Is there a way to go ahead and document that we'd like this recommendation to become required in ECS 9.0 as a breaking change? Or where are we tracking future breaking changes to ECS?
@joshdover @mitodrummer have you checked in with anyone from Observability to make sure they're aware of this, maybe @ruflin?
Can we have a more detailed description in the PR of what this change exactly means? Is it a breaking change? Linking to a discussion is useful to provide additional details, but let's make sure all the necessary details are in the PR itself.
In general I like the idea of moving to FQDN. As part of this change, we should come up with a recommended migration path instead of just putting the change out there and hoping everyone will figure it out.
It's been discussed to some extent in the referenced discussion, but agree we should make this clear here. The issue today is that this data is likely already incorrect and confusing for users if there are multiple hosts with the same hostname, but on different domains. In such a case you could have 2 or more different hosts reporting on the same time series, which will basically make the reported data meaningless. That said, the new time series problem is a real one and changing this will disrupt a user's ability to analyze data from a specific host over a time period that includes the switch. In terms of how we could mitigate that, I see a few options:
My understanding is that most O11y data usually has fairly short retention periods (up to 90 days), so this will be a temporary problem. Security data is more likely to be retained for very long periods of time, and my understanding from the SIEM input here is that this is already a problem that customers would prefer to see solved sooner rather than later. In practice, we also haven't started using TSDB time series mappings in our integrations yet, and this change would likely ship in the same version we start doing so (8.7). This may be lucky timing which could help mitigate the concern. Overall, I prefer we move forward with this proposal but include something like (3) for users that need to do further lookbacks.
I don't think this applies to logs, and there are also quite a few metrics use cases out there that have much longer retention. Comparing 1 and 3 above: 3 requires centralised logic to do the change. It works well where Fleet is in place, but for self-managed setups this will be tricky. I keep coming back to 1 as the most promising option. My ideal scenario:
Maybe we can go with "existing installations change on minor upgrade if the config flag is not set". That means a potential breaking change for users, but they could at least opt out.
@joshdover and @ruflin What else is needed for approval on this change to the ECS description?
@norrietaylor Seems like we agree. But it all feels a bit theoretical, and it would be great if someone could play around with this change and see what effects it has.
I think the implementation of detecting new vs. existing installs would be non-trivial, and I wonder if we could get away with making it completely opt-in from the Fleet UI until the next major version. I imagine this could be an advanced setting on the Agent Policy or the global Fleet Settings tab.
@joshdover @ruflin sorry to open this can again. I just want to get clarity on how we would achieve what is being suggested above for all the different products and installation methods we have: (A) Recognizing a New Install
(B) Toggle switch or feature flag
I want to avoid the complexity that the logic above may introduce, but equally I really don't want to break existing customers. That's why we originally recommended a new ECS field. As Josh said, if the user is relying on host.name, queries are already broken for them. So I am wondering if we bite the bullet now, avoid the complexity, introduce a breaking change, and communicate the hell out of it to the user base, OR alternatively wait until 9.0 to introduce this. @MikePaquette would love to get your opinion on this. From the original discussions, the preference seemed to be to use the same field.
I don't think automatically determining what the user wants will be that simple, since they probably want the same behavior for all agents, including ones that were enrolled or configured recently. They may even expect the same behavior if they create a new cluster (e.g., a test cluster and a prod cluster). If our heuristic gets any of this wrong, we will have a non-obvious breaking change.
We would need a new config flag somewhere in *beat.yml and elastic-agent.yml. For Beats, the general settings would probably make sense. For Agent, probably somewhere under
Endpoint would need to be passed this config value as well.
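To make the config-flag idea above concrete, a minimal sketch of what such an opt-in flag might look like in elastic-agent.yml — the key names here are illustrative, not a confirmed setting from this thread:

```yaml
# elastic-agent.yml — hypothetical opt-in flag (key names illustrative)
agent:
  features:
    fqdn:
      # When true, host.name is populated with the lowercase FQDN
      # instead of the short hostname. Defaults to false so existing
      # installations are unaffected until the user opts in.
      enabled: true
```

A flag like this would need to be forwarded to Beats and Endpoint so all data producers on the host report the same `host.name` value.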
@nimarezainia what do you think about the proposal above? We need to align on this as some development work has already begun. To summarize again, the proposal would be:
If we have alignment on this, we need to schedule more tasks across the various projects to enable this. |
I don't think we should reuse the existing field. I don't have all of the context, but from the comments in this PR it seems clear to me that what we want from this field is something different entirely. We want a universal string representation of the originating host of the data, which is not what the existing field is.
Practically, I think introducing a new field for what I believe is the intended usage of this field (from a query/read perspective: split/count/agg data by unique host) is going to require a lot of work across integrations, alerting, Fleet, and user-written queries. I don't think this work would even have an end, as I expect the habit of reaching for host.name to continue. If allowing users to populate their own custom name is an intended use case as well, then I'd propose we:
This is essentially the same proposal as before, but we won't plan to make any breaking changes in the future. The only use case this proposal wouldn't cover is users who want a customizable host.name.
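For context on the read-side usage mentioned above (splitting/counting/aggregating data by unique host), queries typically aggregate on `host.name` — which is why duplicate short hostnames across domains corrupt the results. A minimal sketch of such a terms aggregation body, built as a plain Python dict (the `size` values and helper name are illustrative, not from this thread):

```python
def hosts_agg(size: int = 100) -> dict:
    """Build an Elasticsearch search body that counts documents
    per unique host by aggregating on host.name."""
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "per_host": {
                "terms": {"field": "host.name", "size": size}
            }
        },
    }

body = hosts_agg()
print(body["aggs"]["per_host"]["terms"]["field"])  # -> host.name
```

With short hostnames, two hosts named `web-01` on different domains collapse into one bucket here; with the lowercase FQDN they stay distinct.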
I agree with you @joshdover, but would like to hear from @epixa
@joshdover I have no objection to what you just proposed
I think we are all aligned on the implementation proposal now, and it still matches the content of this PR itself. @elastic/ecs we still need approval on this PR to be able to merge it into the spec.
FYI - I posted my 2 cents here: elastic/kibana#150239 (comment) The damage has been done by making host.name not the FQDN by default with the Elastic Agent 8.7.0 change above. (Speaking from a Windows environment)
Updates the description of the host.name field to encourage the use of the lowercase FQDN as its value.
Taken from the linked issue where the proposal for the change was submitted:
All Elastic ECS producers should populate the host.name field with the lowercased FQDN from here forward.
The ECS definition for host.name should be updated to recommend the use of lowercase FQDN. e.g., "Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host."
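The "lowercase FQDN" recommendation can be sketched in Python; `socket.getfqdn()` is one way a producer might derive the value (the function name is illustrative — this is not how any Elastic producer is confirmed to implement it):

```python
import socket

def ecs_host_name() -> str:
    """Return the lowercase FQDN for use as the ECS host.name value.

    Note: socket.getfqdn() falls back to the plain hostname when no
    domain can be resolved, so the result is not guaranteed to be
    fully qualified on every host.
    """
    return socket.getfqdn().lower()

# Lowercasing normalizes mixed-case names so the same host always
# produces the same host.name value:
print("WEB-01.Example.COM".lower())  # -> web-01.example.com
```

The lowercasing matters because DNS names are case-insensitive, while Elasticsearch keyword aggregations are case-sensitive: `WEB-01.example.com` and `web-01.example.com` would otherwise land in separate buckets.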
@ebeahan please advise if you'd like this change to go through the RFC process.
The description used was taken from this discussion: elastic/beats#1070 (comment)