Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECS Compatibility #12305

Merged
merged 15 commits into from
Oct 6, 2020
Merged

ECS Compatibility #12305

merged 15 commits into from
Oct 6, 2020

Conversation

yaauie
Copy link
Member

@yaauie yaauie commented Oct 5, 2020

Implements a plugin ecs_compatibility option, whose default value is powered by the pipeline-level setting pipeline.ecs_compatibility, in line with the proposal in #11623

To do so, a significant amount of work within the PluginFactory needed to be done to ensure that pipeline-level settings were available to a Logstash plugin before its initialize was executed, including the maintaining of context for codecs that are routinely cloned.

This PR includes and supersedes the work of #12259

Once approved, I plan to backport to 7.x, then to follow up with a subsequent PR targeting master that tackles related breaking changes targeted at 8.0.

@yaauie
Copy link
Member Author

yaauie commented Oct 5, 2020

jenkins test this again please

This prepares for an upcoming change that will expose the PluginFactory to
running plugins.
 - Default value supplied by `pipeline.ecs_compatibility`
 - Warns when default used (default will change in next major)
 - Warns when value other than `disabled` is used (BETA: by opting in, users
   can accidentally consume breaking changes in a minor release of Logstash,
   such as one that includes an upgrade to a plugin that introduces ECS
   compatibility logic)
@@ -92,6 +92,8 @@ def flush(&block)

public
def clone
return self.class.new(params)
LogStash::Plugins::Contextualizer.initialize_plugin(execution_context, self.class, params).tap do |klone|
klone.metric = @metric if klone.instance_variable_get(:@metric).nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 seems new, help me understand why do we now copy the @metric over?

Copy link
Member Author

@yaauie yaauie Oct 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This resolves a bug in which codecs that are cloned undercount their metrics, since only the original is "connected" to the metrics store and the copies all end up with a null implementation. Inputs like TCP and File clone their codecs per connection/file, which is what allows codecs like multiline to safely retain state across invocations.

In reality, most of the metrics that are recorded are handled by the delegator that wraps the codec, so this only really addresses codecs that go out of their way to record additional metrics.

break @ecs_compatibility unless @ecs_compatibility.nil?

pipeline = execution_context.pipeline
pipeline_settings = pipeline && pipeline.settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, this taking care of manual plugin instantiation (nil pipeline) e.g. in tests LS::Inputs::Twitter.new params, right?

codec_klass = LogStash::Plugin.lookup("codec", value)
codec_instance = LogStash::Plugins::Contextualizer.initialize_plugin(execution_context, codec_klass)

params[name.to_s] = LogStash::Codecs::Delegator.new(codec_instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and these instantiated on demand previously?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a weird one.

When a pipeline definition included a codec that parsed by the LIR as a plugin definition (e.g., codec => rubydebug {}), the codec was instantiated by the plugin factory in the normal way, but when the LIR parsed a codec as a string (quoted or bareword. e.g., codec => rubydebug), the instantiation was done by the input/output in Config::Mixin::validate_value(value, validator) which as a class method does not have access to the instance's execution context.

It looks like I forgot to include the removal of the string handling there, which should now never happen because we intercept it here.

Copy link
Contributor

@kares kares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, changes look clean - commits are easy to follow and specs are 🌓 class!

Takes a while to get used to LogStash::Plugins::Contextualizer#initialize_plugin compared to having a factory method on a "usual" place, it's been clarified that the change is blocked by Java unit tests.

The inline comments are non blocking and are mostly me trying to better understand the code-base.

…:codec)`

Options with `:validate => :codec` provided with String values are intercepted
and instantiated by `Config::Mixin#config_init(Hash)` _before_ being validated
with `Config::Mixin::DSL::validate_value(Object, :codec)`.
@yaauie yaauie merged commit 08d1ef0 into elastic:master Oct 6, 2020
yaauie added a commit to yaauie/logstash that referenced this pull request Oct 6, 2020
Implements a plugin `ecs_compatibility` option, whose default value is powered
by the pipeline-level setting `pipeline.ecs_compatibility`, in line with the
proposal in elastic#11623:

In order to increase the confidence a user has when upgrading Logstash, this
implementation uses the deprecation logger to warn when `ecs_compatibility` is
used without an explicit directive.

For now, as we continue to add ECS Compatibility Modes, an opting into a
specific ECS Compatibility mode at a pipeline level is considered a BETA
feature. All plugins using the [ECS Compatibility Support][] adapter will
use the setting correctly, but pipelines configured in this way do not
guarantee consistent behaviour across minor versions of Logstash or the
plugins it bundles (e.g., upgraded plugins that have newly-implemented an ECS
Compatibility mode will use the pipeline-level setting as a default, causing
them to potentially behave differently after the upgrade).

This change-set also includes a significant amount of work within the
`PluginFactory`, which allows us to ensure that pipeline-level settings are
available to a Logstash plugin _before_ its `initialize` is executed,
including the maintaining of context for codecs that are routinely cloned.

* JEE: instantiate codecs only once
* PluginFactory: use passed FilterDelegator class
* PluginFactory: require engine name in init
* NOOP: remove useless secondary plugin factory interface
* PluginFactory: simplify, compute java args only when necessary
* PluginFactory: accept explicit id when vertex unavailable
* PluginFactory: make source optional, args required
* PluginFactory: threadsafe refactor of id duplicate tracking
* PluginFactory: make id extraction/geration more abstract/understandable
* PluginFactory: extract or generate ID when source not available
* PluginFactory: inject ExecutionContext before initializing plugins
* Codec: propagate execution_context and metric to clones
* Plugin: intercept string-specified codecs and propagate execution_context
* Plugin: implement `ecs_compatibility` for all plugins
* Plugin: deprecate use of `Config::Mixin::DSL::validate_value(String, :codec)`
@yaauie yaauie mentioned this pull request Oct 6, 2020
5 tasks
yaauie added a commit to yaauie/logstash that referenced this pull request Oct 6, 2020
Implements a plugin `ecs_compatibility` option, whose default value is powered
by the pipeline-level setting `pipeline.ecs_compatibility`, in line with the
proposal in elastic#11623:

In order to increase the confidence a user has when upgrading Logstash, this
implementation uses the deprecation logger to warn when `ecs_compatibility` is
used without an explicit directive.

For now, as we continue to add ECS Compatibility Modes, an opting into a
specific ECS Compatibility mode at a pipeline level is considered a BETA
feature. All plugins using the [ECS Compatibility Support][] adapter will
use the setting correctly, but pipelines configured in this way do not
guarantee consistent behaviour across minor versions of Logstash or the
plugins it bundles (e.g., upgraded plugins that have newly-implemented an ECS
Compatibility mode will use the pipeline-level setting as a default, causing
them to potentially behave differently after the upgrade).

This change-set also includes a significant amount of work within the
`PluginFactory`, which allows us to ensure that pipeline-level settings are
available to a Logstash plugin _before_ its `initialize` is executed,
including the maintaining of context for codecs that are routinely cloned.

* JEE: instantiate codecs only once
* PluginFactory: use passed FilterDelegator class
* PluginFactory: require engine name in init
* NOOP: remove useless secondary plugin factory interface
* PluginFactory: simplify, compute java args only when necessary
* PluginFactory: accept explicit id when vertex unavailable
* PluginFactory: make source optional, args required
* PluginFactory: threadsafe refactor of id duplicate tracking
* PluginFactory: make id extraction/geration more abstract/understandable
* PluginFactory: extract or generate ID when source not available
* PluginFactory: inject ExecutionContext before initializing plugins
* Codec: propagate execution_context and metric to clones
* Plugin: intercept string-specified codecs and propagate execution_context
* Plugin: implement `ecs_compatibility` for all plugins
* Plugin: deprecate use of `Config::Mixin::DSL::validate_value(String, :codec)`
@kares kares mentioned this pull request Jul 7, 2021
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants