Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow routing for integrations that are not input packages #6255

Closed
gsantoro opened this issue May 18, 2023 · 15 comments · Fixed by #6340
Closed

Allow routing for integrations that are not input packages #6255

gsantoro opened this issue May 18, 2023 · 15 comments · Fixed by #6340
Assignees

Comments

@gsantoro
Copy link
Contributor

gsantoro commented May 18, 2023

We want to add the following settings to enable rerouting to a different dataset or namespace

elasticsearch.dynamic_dataset: true
elasticsearch.dynamic_namespace: true

In this case, we want to add those settings to the following list of datastream:

  • kubernetes.container_logs
  • system.syslog

Those settings are already allowed at datastream level in the package-spec.

Data streams that are already very specific, such as nginx.access are not candidates for now.

Similar work has already been done for input packages at #5989.

@gsantoro gsantoro self-assigned this May 18, 2023
@gsantoro
Copy link
Contributor Author

A list of other possible datastream to consider:

  • activemq.log
  • auditd.log
  • aws.cloudwatch_logs
  • aws.ec2_logs
  • kafka.log
  • docker.container_logs

and integrations that maybe should be input packages instead. If we convert any of these to input package instead, they would inherit those changes by default

  • aws_logs.generic
  • azure_blob_storage.generic
  • google_cloud_storage.generic
  • gcp_pubsub.generic
  • http_endpoint.generic
  • httpjson.generic
  • kafka_log.generic
  • tcp.generic
  • udp.generic

@gsantoro
Copy link
Contributor Author

hello @felixbarny and @ruflin can we agree on the "final" list of datastreams to apply the above flags to?

So far my suggested list of packages to edit includes:

  • kubernetes.container_logs
  • system.syslog
  • activemq.log
  • auditd.log
  • aws.cloudwatch_logs
  • aws.ec2_logs
  • kafka.log
  • docker.container_logs

Do we want to modify all the above in a single PR?

Should we consider the below packages as well? These are packages that should probably be input packages but they are not yet.

  • aws_logs.generic
  • azure_blob_storage.generic
  • google_cloud_storage.generic
  • gcp_pubsub.generic
  • http_endpoint.generic
  • httpjson.generic
  • kafka_log.generic
  • tcp.generic
  • udp.generic

@felixbarny
Copy link
Member

Do we want to modify all the above in a single PR?

I think so, yes. It seems to be too much overhead to create individual PRs for that that each just add two additions to the manifest.

Should we consider the below packages as well?

Yes!

@ruflin
Copy link
Collaborator

ruflin commented May 25, 2023

I have a slightly different view on the above. I don't think we should extend existing packages if these are not already input packages. For kubernetes, lets rather create an input package for logs-kubernetes-* and add the routing rules and permissions. If we change existing packages, adding routing rules there is a breaking change.

@felixbarny
Copy link
Member

We're not talking about adding routing rules here. This issue is just about adding flags to allow routing so that user can add their routing rules if they want. If we don't do that, they'll hit API key permission issues.

@ruflin
Copy link
Collaborator

ruflin commented May 25, 2023

Had a quick chat about this with @felixbarny to clarify things on my end. We agree to progress with the above proposal . We need to have a follow up discussion on having a new input package for example for syslog or logs-kubernetes-* which will contain default routing rule but we should not mix it into this conversation.

One concern I had is that it we add the flags to kubernetes.container_logs that even if the dataset is not used, the API key generated will be for logs-*-*. But the assumption is, this shouldn't be the case as Fleet only has the API keys in scope that are enabled. Lets double check that this is the case.

One question for the second list: What is the effort of converting these to an input package instead of adding the configs?

@gsantoro
Copy link
Contributor Author

my thoughts.

Had a quick chat about this with @felixbarny to clarify things on my end. We agree to progress with the above proposal . We need to have a follow up discussion on having a new input package for example for syslog or logs-kubernetes-* which will contain default routing rule but we should not mix it into this conversation.

Should I create an issue about this?

One concern I had is that it we add the flags to kubernetes.container_logs that even if the dataset is not used, the API key generated will be for logs--. But the assumption is, this shouldn't be the case as Fleet only has the API keys in scope that are enabled. Lets double check that this is the case.

I can have a look. I'm not sure yet are the API keys accessible in the UI or the API but I will investigate this

One question for the second list: What is the effort of converting these to an input package instead of adding the configs?

I assume we would create a separate input package with a non conflicting name instead of modifying the current package. is that correct? About how long will it take, I am not sure. It looks easy but there might be unknown unknowns that I am not aware of.

@ruflin
Copy link
Collaborator

ruflin commented May 25, 2023

Should I create an issue about this?

Would be great.

I assume we would create a separate input package with a non conflicting name instead of modifying the current package.

No, the idea was to convert it to an input package. I know we have done that for some packages as input packages didn't exist for some time. @ishleenk17 might know more here.

@ishleenk17
Copy link
Contributor

ishleenk17 commented May 25, 2023

No, the idea was to convert it to an input package. I know we have done that for some packages as input packages didn't exist for some time. @ishleenk17 might know more here.

There have been both scenarios. New input packages have been created (jolokia_input, statsd_input, sql_input, prometheus_input) since these didn't exist before.

For some scenarios, existing were converted to input packages. Eg: Journald. If there is an existing integration, converting it to input package would take lesser effort.

@ruflin
Copy link
Collaborator

ruflin commented May 26, 2023

@ishleenk17 Is there any guide on how to convert a package to an input package? Anything specific to worry about?

@ishleenk17
Copy link
Contributor

ishleenk17 commented May 26, 2023

@ishleenk17 Is there any guide on how to convert a package to an input package? Anything specific to worry about?

There is no guide as such. An initial document which i referred to when I built the first input package.

You can see jolokia input package for reference.

While converting a package to input package below are the numero-uno points to be kept in mind.

  • Follow the correct structure of the input package folder.
  • Keep only the generic field mappings (which you expect will be a default mapping for the user). For other fields leave the flexibility to the user to map
  • Readme is handled differently than integrations as of now. It's not auto generated. (Refer the input package)
  • Ensure dataset is also taken as input from user in handlebars file
  • Input package runs on a single input option. So in case your integration has multiple datastream (with different inputs), you will have to choose one of those input for your input package.

@gsantoro
Copy link
Contributor Author

@ruflin , I created two follow up issues (linked above) from the discussion that we had here.

@lalit-satapathy
Copy link
Collaborator

We're not talking about adding routing rules here. This issue is just about adding flags to allow routing so that user can add their routing rules if they want. If we don't do that, they'll hit API key permission issues.

Will this be eventually added to all integration packages?

@felixbarny
Copy link
Member

We were thinking to add elevated API key permissions to allow for routing to all input packages (see #5989) and to select other packages (this issue). Most of the packages listed in this issue ought to be input packages but it's not in scope of this particular issue to convert them.

@felixbarny felixbarny linked a pull request Jun 13, 2023 that will close this issue
4 tasks
@felixbarny
Copy link
Member

Implemented in #6340

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants