[Ingest] Fleet should allow to enable elastic-agent monitoring #62141

Closed
nchaulet opened this issue Apr 1, 2020 · 23 comments · Fixed by #63598
Labels
Ingest Management:alpha1 Group issues for ingest management alpha1 Team:Fleet Team label for Observability Data Collection Fleet team

@nchaulet
Member

nchaulet commented Apr 1, 2020

Description

We should allow the user to configure elastic-agent monitoring. I think this should be done in the configuration pages?

Are we okay with using the default output for monitoring for alpha1? @ph

We should be able to add this to the generated config:

outputs:
  default:
    type: elasticsearch
    api_key: VuaCfGcBCdbkQm-e5aOx:ui2lp2axTNmsyakw9tvNnw
    hosts: ["localhost:9200"]
    ca_sha256: "7lHLiyp4J8m9kw38SJ7SURJP4bXRZv/BNxyyXkCcE/M="
    # Not supported at first
    queue:
      type: disk

  long_term_storage:
    type: elasticsearch
    api_key: VuaCfGcBCdbkQm-e5aOx:ui2lp2axTNmsyakw9tvNnw
    hosts: ["localhost:9200"]
    ca_sha256: "7lHLiyp4J8m9kw38SJ7SURJP4bXRZv/BNxyyXkCcE/M="
    queue:
      type: disk

  monitoring:
    type: elasticsearch
    api_key: VuaCfGcBCdbkQm-e5aOx:ui2lp2axTNmsyakw9tvNnw
    hosts: ["localhost:9200"]
    ca_sha256: "7lHLiyp4J8m9kw38SJ7SURJP4bXRZv/BNxyyXkCcE/M="

settings.monitoring:
  use_output: monitoring
@nchaulet nchaulet added the Team:Fleet Team label for Observability Data Collection Fleet team label Apr 1, 2020
@elasticmachine
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@ruflin ruflin added the Ingest Management:alpha1 Group issues for ingest management alpha1 label Apr 2, 2020
@ph
Contributor

ph commented Apr 14, 2020

@nchaulet yes, we are OK with that, for the first release.

@ph
Contributor

ph commented Apr 14, 2020

@nchaulet would they have the same credentials and permissions as the default output?

@nchaulet nchaulet self-assigned this Apr 14, 2020
@nchaulet
Member Author

nchaulet commented Apr 14, 2020

@hbharding we need to be able to enable agent monitoring when we create/update an agent config, do you have anything in mind on how we show that (copy)?

I implemented it like this for now under advanced options
Screen Shot 2020-04-15 at 1 07 01 PM

@nchaulet
Member Author

@ph unless you see a reason not to, yes, it will be the same credentials and permissions.

@nchaulet
Member Author

@ph I am trying to test it locally. Is agent monitoring already implemented on the agent side? And should I be able to see metrics or anything in Kibana?
I am sending this config:

id: 6d7a4780-7f21-11ea-b609-3923382c3c4a
revision: 1
settings.monitoring:
  use_output: default
outputs:
  default:
    type: elasticsearch
    hosts:
      - 'http://localhost:9200'
    api_key: 'REDACTED'
datasources: []

@ph
Contributor

ph commented Apr 15, 2020

@michalpristas can you help @nchaulet on this?

@michalpristas

michalpristas commented Apr 15, 2020

to enable monitoring you need to specify this:


settings.monitoring:
  # enabled turns on monitoring of running processes
  enabled: true
  # enables log monitoring
  logs: true
  # enables metrics monitoring
  metrics: true

It will look for use_output, so that part is correct; the output it names needs to be of type elasticsearch.

Specifying the output is not enough to get this working; you need to enable monitoring explicitly.
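
As a rough illustration of the two conditions described above (monitoring explicitly enabled, and use_output naming an output of type elasticsearch), here is a minimal Python sketch that checks a parsed config. The function name `monitoring_problems` and the message strings are hypothetical, and it assumes the YAML has been loaded into a plain dict with `settings.monitoring` as a literal top-level key; it is not how the agent itself validates configs.

```python
from typing import Any


def monitoring_problems(config: dict[str, Any]) -> list[str]:
    """Return reasons why agent monitoring would not start for this config."""
    problems: list[str] = []
    monitoring = config.get("settings.monitoring") or {}
    outputs = config.get("outputs") or {}

    # Naming an output is not enough; monitoring must be switched on explicitly.
    if monitoring.get("enabled") is not True:
        problems.append("settings.monitoring.enabled is not true")

    # use_output must point at a defined output of type elasticsearch.
    name = monitoring.get("use_output")
    output = outputs.get(name)
    if output is None:
        problems.append(f"output {name!r} is not defined")
    elif output.get("type") != "elasticsearch":
        problems.append(f"output {name!r} is not of type elasticsearch")

    return problems


# The config sent earlier in the thread: use_output is set, enabled is missing.
sent = {
    "settings.monitoring": {"use_output": "default"},
    "outputs": {
        "default": {"type": "elasticsearch", "hosts": ["http://localhost:9200"]}
    },
    "datasources": [],
}
print(monitoring_problems(sent))
```

Run against the config that was sent, this reports only the missing `enabled` flag, which matches the explanation above.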

@nchaulet
Member Author

@michalpristas Do you think it makes sense to allow the user to enable just logs or just metrics from Kibana? Or should we always enable everything?

@michalpristas

I think it makes sense. In a way the user already has this behavior, as we provide filebeat and metricbeat separately. Programmatically it was the natural way, as these are separate entities.
From the user perspective we should double check with @ruflin or @mostlyjason.

@mostlyjason
Contributor

What do you think about making Agent monitoring an integration package? It could be included by default similar to the system package, but the user could use our existing data source editor to control the settings. We could also offer assets like dashboards, alerts, etc.

@nchaulet
Member Author

@mostlyjason it seems like an interesting idea, but I'm not sure it's implementable on the agent side, or whether we would be mixing concerns here.

@ruflin
Contributor

ruflin commented Apr 17, 2020

From an agent config perspective, I think what @michalpristas suggests above is correct. It is a setting which makes it very easy to set up. I'm torn on whether we should allow turning on only logs or only metrics. Let's have both enabled by default for now, perhaps even skip the separate configs and only have enabled.

On the backend side of the Agent, I could see that this actually generates a data source and is run as everything else (I would assume, this even simplifies things).

On the Fleet end, I assume it will be an enable / disable flag on the agent config. It will look very similar to enabling system monitoring, but I think from the Fleet perspective it is not a data source that users can configure / modify as they do for the system one.

The last part is how the setup of the stack works. I like the idea of this becoming a package so we have the same setup mechanism as everywhere else. It might be a hidden package which is installed by default.

@michalpristas What are the indices this data is written into at the moment?

@michalpristas

@ruflin this is what we were talking about with Nicolas recently. At the moment we don't have any index specified, so if we want something predetermined, calculated based on output/input/pipeline/agent/etc., we need to agree on it.

@ruflin
Contributor

ruflin commented Apr 17, 2020

@michalpristas Could you make a suggestion? My current assumption would be that it is rather static.

@michalpristas

I think it depends on how we want the user to perceive it. We have a notion of pipelines, separation by output, and monitoring per pipeline. But I assume we don't want the user to know or see this; the user should see the behavior of the platform as a whole, in which case it makes sense to make it static. Maybe (metricbeat|filebeat)-fleet or (metricbeat|filebeat)-fleet-monitoring or something similar.
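
To make the static naming idea above concrete, here is a small Python sketch. Nothing here was agreed in the thread: the helper `monitoring_index`, the `BEAT_FOR_KIND` mapping (logs shipped by filebeat, metrics by metricbeat) and the default suffix are all assumptions chosen only to illustrate the two candidate patterns.

```python
# Hypothetical mapping: logs are shipped by filebeat, metrics by metricbeat.
BEAT_FOR_KIND = {"logs": "filebeat", "metrics": "metricbeat"}


def monitoring_index(kind: str, suffix: str = "fleet-monitoring") -> str:
    """Build a static index name that does not depend on pipeline or output."""
    beat = BEAT_FOR_KIND[kind]  # raises KeyError for an unknown kind
    return f"{beat}-{suffix}"


print(monitoring_index("logs"))                     # filebeat-fleet-monitoring
print(monitoring_index("metrics", suffix="fleet"))  # metricbeat-fleet
```

Because the name depends only on the kind of data, dashboards or a monitoring package could target it without knowing anything about the agent's internal pipelines.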

@hbharding
Contributor

@nchaulet @ruflin @mostlyjason depending on what we decide, here's a mockup showing how I might show this in the flyout. I'm curious what others think about exposing these options outside of "advanced options". On one hand, these check boxes let the user know and control what data the agent collects (which I think is important to be upfront about), but on the other hand, additional form fields could increase the cognitive load / time required to create an agent config the first time. I'm in favor of exposing these. They'd be enabled by default, so the user has to opt out, and it takes no additional clicks to quickly create an agent config.

In addition to this, we'd want to expose these options on the agent config settings page.

image

@ruflin
Contributor

ruflin commented Apr 20, 2020

I think collecting system metrics and agent metrics / logs should be separated. Collecting logs and metrics is definitely an advanced option.

I like the idea to not have it at all here for now and just make it a global option? So either you collect it from all or no agents?

@nchaulet
Member Author

@ruflin I think it makes sense to have it per config, as there is probably a noticeable overhead to running monitoring, and you probably want to be able to control it per config.

I currently implemented it like this in #63598.

@hbharding
Contributor

Okay. It makes sense to me as an advanced option if it's not something most users need to know about. Since @nchaulet says this has the potential to add "noticeable overhead", it probably makes sense for these options to be disabled by default? Since this is "hidden" under advanced options, an opt-in experience feels appropriate to me here.

image

@ruflin
Contributor

ruflin commented Apr 21, 2020

I would argue it should be turned on by default to get the full experience by default. If someone wants to opt out, they can. This also ties into the part around metrics and logs and whether we really want to allow enabling / disabling them separately. If we build a UI or dashboards for these, having part of it disabled would not be nice.

@ruflin
Contributor

ruflin commented Apr 21, 2020

To not block this change with discussion: Let's get monitoring working and then figure out in detail where we need the config and how it is triggered.

@hbharding
Contributor

👍 re: opt-in vs opt-out. I changed my mind. For alpha, let's just enable it by default so we can get the data in / give the full experience. We can revisit this later based on customer use cases / feedback.
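
For readers following along, the outcome of the discussion (a per-config flag, on by default, using the default output for alpha1) could be sketched roughly like this in Python. The field name `monitoring_enabled` and the function `monitoring_block` are hypothetical; the actual implementation is in #63598 and may differ.

```python
from typing import Any


def monitoring_block(agent_config: dict[str, Any]) -> dict[str, Any]:
    """Derive the settings.monitoring block for one agent config."""
    # Opt-out: a config that says nothing about monitoring gets it enabled.
    enabled = bool(agent_config.get("monitoring_enabled", True))
    return {
        "enabled": enabled,
        # Logs and metrics follow the single flag rather than separate toggles.
        "logs": enabled,
        "metrics": enabled,
        # Assumption from the thread: alpha1 reuses the default output.
        "use_output": "default",
    }


print(monitoring_block({"name": "my agent config"}))
print(monitoring_block({"name": "quiet config", "monitoring_enabled": False}))
```

Keeping a single flag means dashboards built on this data never see a half-enabled state, which was one of the concerns raised above.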
