[META] A Path to ECS-by-default in Logstash Plugins #11623

Closed
yaauie opened this issue Feb 24, 2020 · 4 comments · Fixed by #13391

Comments

@yaauie
Member

yaauie commented Feb 24, 2020

This META issue is a stub, and will be modified to link to sub-issues as they are filed


As the Elastic Stack converges on a common schema (ECS: Elastic Common Schema), it is becoming increasingly useful for the tools in the stack to produce events that align with the schema “so that [users] can better analyze, visualize, and correlate the data represented in their events”.

Although Logstash is a tool for defining arbitrary data transformation pipelines, and the resulting events are not always constrained to the Elastic Stack, many Logstash plugins implicitly place data in fields that either ignore or conflict with ECS. We aim to change these plugins to be ECS-compliant by default when running under the next major version of Logstash, while continuing to support explicitly-given configuration (even when it conflicts with ECS).

We recognize that significant effort has been invested in users’ existing pipelines, that changes in defaults of one plugin can change how a plugin used later in the same pipeline needs to be configured, and that organically-grown pipelines can have enormous complexity. The aim of this effort is to provide a path for migration that minimizes the risk of surprise when upgrading Logstash, while empowering users to adopt ECS (or to avoid doing so) at their own pace independently of their Logstash upgrade plans.

This effort will have two primary facets:

  • A framework within Logstash core to empower plugin maintainers to implement their own ECS-compatibility modes, allowing for users to explicitly opt-in and opt-out via plugin instance, pipeline, and global configuration; AND
  • An effort to provide ECS-compatibility modes using the above framework for all supported and bundled plugins in the 7.x timeframe.

A Framework for ECS-Compatibility Modes

While many plugins will each need to implement their own ECS-compatibility modes that avoid implicitly populating fields that clash with ECS, Logstash core will provide a framework for doing so that:

  • enables each plugin to know the effective ECS-compatibility mode at initialization; AND
  • enables users to opt in to or out of an ECS-compatibility mode at the plugin-instance, pipeline, and global levels.

This approach is designed to ease the transition from off-by-default to on-by-default in an upcoming major release of Logstash.

A new API will be introduced in Logstash 7.x, in which an ecs_compatibility config option becomes available to all plugins. Its default value is powered by a new pipeline-level Logstash setting, pipeline.ecs_compatibility, and it accepts either the literal disabled or an integer ECS major version number. An adapter will be provided to allow plugin maintainers to use the value of a plugin instance’s ecs_compatibility even when their plugin is run on older Logstash releases that do not provide the pipeline-level setting. The new API does not modify the Event API that plugins use or otherwise constrain the usage of fields, and the implementation details of ECS-Compatibility mode are up to each plugin's maintainer(s).
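
As a rough illustration of the accepted values, the sketch below normalizes a user-supplied setting into a symbol. The helper name `normalize_ecs_compatibility` is hypothetical and is not part of Logstash; accepting a `v`-prefixed form such as `v8` alongside a bare integer is an assumption made here for convenience.

```ruby
# Hypothetical helper (not a Logstash API): normalize an ecs_compatibility value.
# Accepts the literal "disabled", a bare ECS major version (8 or "8"),
# or, as an assumption of this sketch, a "v"-prefixed form ("v8").
def normalize_ecs_compatibility(value)
  case value.to_s.strip
  when "disabled"
    :disabled
  when /\Av?([1-9]\d*)\z/
    # The successful regexp match in `when` populates Regexp.last_match.
    :"v#{Regexp.last_match(1)}"
  else
    raise ArgumentError,
          "expected `disabled` or an ECS major version, got #{value.inspect}"
  end
end
```

Anything other than `disabled` or a positive major version is rejected up front, so a plugin only ever sees a small, predictable set of symbols.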

Once ecs_compatibility mode has been added to a plugin, users will be able to explicitly state their desired behaviour for the individual plugin instance, and when that plugin is running on a Logstash release that includes the pipeline.ecs_compatibility setting, they will be able to control the default behaviour for all plugins in a specific pipeline or globally. By specifying a value at any of these levels, users can lock-in a specific behaviour and eliminate the risk of unexpected behaviour change when upgrading to Logstash 8.x.
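
The precedence described here can be sketched as a simple fallback chain. This is only an illustration under stated assumptions: the method name, the keyword arguments, and the `DEFAULT_ECS_COMPATIBILITY` constant are hypothetical, and `nil` stands in for "no value was given at this level".

```ruby
# Assumed 7.x default, per the timeline in this issue.
DEFAULT_ECS_COMPATIBILITY = :disabled

# Hypothetical resolver: the most specific explicit value wins.
# plugin   - value given on the plugin instance, or nil
# pipeline - value given for the pipeline, or nil
# global   - value given globally, or nil
def effective_ecs_compatibility(plugin: nil, pipeline: nil, global: nil)
  plugin || pipeline || global || DEFAULT_ECS_COMPATIBILITY
end
```

Because an explicit value at any level short-circuits the chain, a later change to `DEFAULT_ECS_COMPATIBILITY` would only affect users who specified nothing at all.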

Implementing ECS-Compatibility Mode in Plugins

Each plugin that implicitly uses fields that do not align with ECS SHOULD implement a new ECS-compatibility mode alongside its existing implementation, and its documentation SHOULD be updated to cover both sets of behavior. Plugins will be able to determine at registration which mode they are supposed to run in, using the framework defined separately in this document.

To determine the scope of changes needed to implement an ECS-Compatibility mode, each Logstash plugin will need to be categorized based on its implicit behaviour:

  • Exclusively uses fields that align with ECS (no ECS-Compatibility mode needed)
  • Uses one or more fields that conflict with ECS (ECS-Compatibility mode required)
  • Uses one or more fields that are undefined in the latest ECS and therefore at risk of future conflict (ECS-Compatibility mode recommended, aiming to minimize this risk)
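
The three categories above could be approximated by a check like the one below. This is a sketch only: the two field lists are tiny illustrative subsets rather than the real ECS definitions, the `categorize` helper is hypothetical, and a "conflict" is assumed to mean that the plugin writes a scalar value to a name that ECS defines as a field set (for example `host`).

```ruby
# Illustrative subsets only, NOT the real ECS field definitions.
ECS_DEFINED_FIELDS = ["host.name", "source.ip", "message"].freeze
# Top-level names that ECS uses as field sets (objects); this sketch assumes
# that writing a scalar to one of these names conflicts with ECS.
ECS_FIELD_SET_ROOTS = ["host", "source"].freeze

# Hypothetical helper: categorize a plugin by the fields it populates implicitly.
def categorize(implicit_fields)
  return :required if implicit_fields.any? { |f| ECS_FIELD_SET_ROOTS.include?(f) }
  return :recommended if implicit_fields.any? { |f| !ECS_DEFINED_FIELDS.include?(f) }
  :not_needed
end
```

A conflict takes priority over an undefined field, so a plugin that does both still lands in the "required" category.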

Specifying precisely how each plugin will operate while in ECS-Compatibility mode is beyond the scope of this issue, but the following guidance should be observed:

  • In order to support the breaking-change semantics of this mode being on-by-default in a future major release, all plugins implementing this mode MUST have their changes available prior to 7.last feature freeze; AND
  • When in ECS-Compatibility mode, plugins that encounter explicitly-given configuration that would result in ECS-conflict may either respect or reject the configuration, but MUST NOT override explicit directives to coerce ECS-Compatibility.
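
The second rule can be illustrated with a plugin that chooses where to write a value. The sketch takes the "respect" option: the helper `target_field` and the two default field names are hypothetical, chosen only to show that an explicit target is never replaced by an ECS-friendly one.

```ruby
# Hypothetical helper: decide which field a plugin writes a host name to.
# explicit_target   - field named in the user's configuration, or nil
# ecs_compatibility - effective mode, e.g. :disabled or :v1
def target_field(explicit_target, ecs_compatibility)
  # An explicit directive is respected as-is, even if it conflicts with ECS.
  return explicit_target unless explicit_target.nil?

  # Only the implicit default depends on the mode (field names are illustrative).
  ecs_compatibility == :disabled ? "host" : "[host][name]"
end
```

A plugin could instead reject the conflicting configuration at registration; what it must not do is silently rewrite the user's choice.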

A separate meta-issue will outline Elastic's effort to add ECS-Compatibility modes to the plugins that are supported and bundled with Logstash releases (elastic/logstash#11635).

Timeline for Development

  • 7.9:
  • 7.x:
    • Introduce option pipeline.ecs_compatibility to Logstash, defaulting to disabled. At pipeline startup, warn that any value besides disabled that is supplied to pipeline.ecs_compatibility is an unsupported beta feature, since not all plugins supporting this feature will be included until ~7.last
    • Continue introducing updated plugins with ECS-Compatibility modes
  • 7.last:
    • Include updated releases of all supported plugins with ECS-Compatibility changes
    • Specific-version values supplied to pipeline.ecs_compatibility are considered Generally Available
    • When a plugin that supports ecs_compatibility is initialized, and no value is provided for the plugin instance, the pipeline, or globally, emit a warning to the deprecation logger about the upcoming change in default value
  • 8.0:
    • Default value of pipeline.ecs_compatibility is v2 (ECS v2 set to ship with Stack 8.0)
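
The 7.last deprecation behaviour in this timeline might look roughly like the sketch below. It is not the Logstash implementation: the method name is hypothetical, a plain array stands in for the deprecation logger, and the warning text is invented for illustration.

```ruby
# Hypothetical sketch of the 7.last behaviour: warn only when nothing was
# specified at any level, then fall back to the 7.x default.
# explicit_value  - value resolved from plugin, pipeline, or global config, or nil
# deprecation_log - an Array standing in for the deprecation logger
def ecs_compatibility_or_warn(explicit_value, deprecation_log)
  return explicit_value unless explicit_value.nil?

  deprecation_log << "ecs_compatibility is not explicitly set; " \
                     "the default will change in the next major release"
  :disabled
end
```

Users who have specified any value, including `disabled`, never see the warning, which matches the goal of letting them lock in behaviour before upgrading.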

Related issues

@jsvd
Member

jsvd commented Apr 23, 2020

@yaauie another topic to keep in mind for a later stage integration would be to evaluate the logging of logstash and all its plugins in ECS (with or without the help of https://github.com/elastic/ecs-logging-java)

@yaauie
Member Author

yaauie commented Jun 25, 2020

It's becoming apparent that ECS plans to ship breaking changes alongside Stack 8.0 (elastic/ecs#839), and that plugins may stand to benefit from knowing which major version they should target (e.g., the ES Output installs templates, and knowing which ECS-major to target may help in mixed/transition stacks). Thankfully, the ECS Compatibility Support gem is not yet 1.0, so our API is not firmly locked in place.

I believe we can replace the boolean ecs_compatibility proposed above with an ecs_version option to the same effect, flowing through to a similar pipeline-level setting if not specified on the plugin instance.

@yaauie yaauie closed this as completed Oct 5, 2020
@yaauie
Member Author

yaauie commented Oct 5, 2020

Derp, reopening because I didn't mean to close this one 🤦

@yaauie yaauie reopened this Oct 5, 2020
yaauie added a commit that referenced this issue Oct 6, 2020
Implements a plugin `ecs_compatibility` option, whose default value is powered
by the pipeline-level setting `pipeline.ecs_compatibility`, in line with the
proposal in #11623:

In order to increase the confidence a user has when upgrading Logstash, this
implementation uses the deprecation logger to warn when `ecs_compatibility` is
used without an explicit directive.

For now, as we continue to add ECS Compatibility Modes, opting into a
specific ECS Compatibility mode at a pipeline level is considered a BETA
feature. All plugins using the [ECS Compatibility Support][] adapter will
use the setting correctly, but pipelines configured in this way do not
guarantee consistent behaviour across minor versions of Logstash or the
plugins it bundles (e.g., upgraded plugins that have newly-implemented an ECS
Compatibility mode will use the pipeline-level setting as a default, causing
them to potentially behave differently after the upgrade).

This change-set also includes a significant amount of work within the
`PluginFactory`, which allows us to ensure that pipeline-level settings are
available to a Logstash plugin _before_ its `initialize` is executed,
including the maintaining of context for codecs that are routinely cloned.

* JEE: instantiate codecs only once
* PluginFactory: use passed FilterDelegator class
* PluginFactory: require engine name in init
* NOOP: remove useless secondary plugin factory interface
* PluginFactory: simplify, compute java args only when necessary
* PluginFactory: accept explicit id when vertex unavailable
* PluginFactory: make source optional, args required
* PluginFactory: threadsafe refactor of id duplicate tracking
* PluginFactory: make id extraction/generation more abstract/understandable
* PluginFactory: extract or generate ID when source not available
* PluginFactory: inject ExecutionContext before initializing plugins
* Codec: propagate execution_context and metric to clones
* Plugin: intercept string-specified codecs and propagate execution_context
* Plugin: implement `ecs_compatibility` for all plugins
* Plugin: deprecate use of `Config::Mixin::DSL::validate_value(String, :codec)`
yaauie added a commit to yaauie/logstash that referenced this issue Oct 6, 2020
yaauie added a commit to yaauie/logstash that referenced this issue Nov 4, 2021
yaauie added a commit that referenced this issue Nov 17, 2021
* ecs: report pipeline's ECS-compatibility with INFO at startup

Because the pipeline-level setting `pipeline.ecs_compatibility` affects the
default behaviour of nearly every plugin in the pipeline, an INFO-level log
message will provide useful hints, especially to our users who upgrade to
Logstash 8 without first reading the breaking changes docs.

For example, when we have two pipelines `old` and `new` whose `pipeline.ecs_compatibility` is `disabled` and `v8` respectively, we would get the following log messages:

> ~~~
> [2021-11-04T18:43:21,810][INFO ][logstash.javapipeline    ] Pipeline `old` is configured with `pipeline.ecs_compatibility: disabled` setting. All plugins in this pipeline will default to `ecs_compatibility => disabled` unless explicitly configured otherwise.
> [2021-11-04T18:43:21,817][INFO ][logstash.javapipeline    ] Pipeline `new` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
> ~~~

* ecs: make v8 the default for 8.0

* ecs: `pipeline.ecs_compatibility` defaults to `v8`

Related: #11623

* doc: temporarily remove deep link from breaking changes doc to fix build
yaauie added a commit to yaauie/logstash that referenced this issue Nov 17, 2021

(cherry picked from commit c3e498a)
kares pushed a commit that referenced this issue Nov 18, 2021

(cherry picked from commit c3e498a)