Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(outputs.nebius_cloud_monitoring): Add 'service' configuration setting #14658

Merged
merged 7 commits into from
Feb 22, 2024

Conversation

abrekhov
Copy link
Contributor

Summary

Internal dev team of Nebius Managed Services came to me and asked to make this parameter configurable.

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #14657

@telegraf-tiger telegraf-tiger bot added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins labels Jan 31, 2024
@srebhan srebhan changed the title feat(outputs.nebius_cloud_monitoring): make service parameter configurable again feat(outputs.nebius_cloud_monitoring): Add 'service' configuration setting Jan 31, 2024
@srebhan
Copy link
Member

srebhan commented Jan 31, 2024

@abrekhov can you explain how this parameter can be used?!? It says "normally is should not be changed", so I wonder when to change it and where to find the value to change it to...

@srebhan srebhan self-assigned this Jan 31, 2024
@srebhan srebhan added the cloud Issues or requests around cloud environments label Jan 31, 2024
@abrekhov
Copy link
Contributor Author

abrekhov commented Feb 2, 2024

@abrekhov can you explain how this parameter can be used?!? It says "normally is should not be changed", so I wonder when to change it and where to find the value to change it to...
Hi @srebhan ! We've already discussed this topic in the first PR #13379 (comment)

But few weeks ago DevTeam of Nebius Cloud came to me and said that they want to use telegraf in service monitoring (from the side of the cloud). And they use this service parameter to identify to which service this metrics belongs (e.g. managed postgres cluster, or data-proc service, etc.). So this comment about "normally is should not be changed" is for end customers who want to gather custom metrics from VMs.

Could you please also point me where did I fail with the semantic of PR?

@powersj
Copy link
Contributor

powersj commented Feb 2, 2024

Could you please also point me where did I fail with the semantic of PR?

If your PR only contains a single commit, the semantic checker uses that to check for a semantic message. In your case:

make service configurable again

When you have more than one commit, then the checker will look at the PR title.

@powersj
Copy link
Contributor

powersj commented Feb 2, 2024

@abrekhov - what is the consequence of changing that value? As in, if I am a user and change it to "backend" thinking I am monitoring my own backend, will metrics get rejected? Or show up differently?

@abrekhov
Copy link
Contributor Author

abrekhov commented Feb 2, 2024

@abrekhov - what is the consequence of changing that value? As in, if I am a user and change it to "backend" thinking I am monitoring my own backend, will metrics get rejected? Or show up differently?

As I understand this metrics will be rejected. But if devteam starts to monitor VMs from managed service side and set it to managed-postgres, then it will be processed and user will see such metrics from managed service :

image

@powersj
Copy link
Contributor

powersj commented Feb 6, 2024

@abrekhov,

@srebhan and I are both concerned that this will cause confusion and mistakes by users. However, we would love to enable the backend team to utilize telegraf. Could we instead of a config option have the backend team set an environment variable which changes this setting?

That way the option stays out of the config a user would see and it would prevent possibly setting this incorrectly.

Thanks!

@powersj powersj added the waiting for response waiting for response from contributor label Feb 6, 2024
@abrekhov
Copy link
Contributor Author

@abrekhov,

@srebhan and I are both concerned that this will cause confusion and mistakes by users. However, we would love to enable the backend team to utilize telegraf. Could we instead of a config option have the backend team set an environment variable which changes this setting?

That way the option stays out of the config a user would see and it would prevent possibly setting this incorrectly.

Thanks!

Great idea! Could you provide some example in other plugin where it is implemented?

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Feb 12, 2024
@powersj
Copy link
Contributor

powersj commented Feb 12, 2024

Could you provide some example in other plugin where it is implemented?

Essentially you will want to override the value if an environment variable like NEBIUS_SERVICE is set via os.Getenv("NEBIUS_SERVICE").

So you can update your logic to something like:

if a.service == "" {
    a.service = "custom"
}
if service := os.Getenv("NEBIUS_SERVICE"); service != "" {
    a.service = service
}

Does that help?

@powersj powersj added the waiting for response waiting for response from contributor label Feb 12, 2024
Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Feb 21, 2024
@abrekhov
Copy link
Contributor Author

Thanks!

Could you please explain why lint-linux job is failing over and over again?

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Feb 21, 2024
@powersj
Copy link
Contributor

powersj commented Feb 21, 2024

Ah I had missed that:

Please run "make docs" and push the updated README. Failing CI.

You made two changes:

  1. You updated the sample.conf, but didn't update the corresponding readme. The readme includes an embedded copy of the sample.conf and they need to stay in sync. Running make docs will fix this.
  2. You also updated the yandex cloud monitoring readme, where the sample.conf is embedded and they are different. So again you need to keep both files in sync. Running make docs won't fix this one, as make docs takes the sample.conf and embeds it in the readme.

To be honest, the yandex change doesn't really belong as a part of this PR and I hesitated when I saw that, but if you make the corresponding update to the sample.conf we can include it.

@abrekhov
Copy link
Contributor Author

Ah I had missed that:

Please run "make docs" and push the updated README. Failing CI.

You made two changes:

  1. You updated the sample.conf, but didn't update the corresponding readme. The readme includes an embedded copy of the sample.conf and they need to stay in sync. Running make docs will fix this.
  2. You also updated the yandex cloud monitoring readme, where the sample.conf is embedded and they are different. So again you need to keep both files in sync. Running make docs won't fix this one, as make docs takes the sample.conf and embeds it in the readme.

To be honest, the yandex change doesn't really belong as a part of this PR and I hesitated when I saw that, but if you make the corresponding update to the sample.conf we can include it.

I've executed make docs twice: before previous commit and last one (cleanup). I've deleted changes that not correspond to nebius plugin. Hope that pipeline will succeed.

@telegraf-tiger
Copy link
Contributor

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

🥳 This pull request decreases the Telegraf binary size by -3.20 % for linux amd64 (new size: 221.5 MB, nightly size 228.8 MB)

📦 Click here to get additional PR build artifacts

Artifact URLs

DEB RPM TAR GZ ZIP
amd64.deb aarch64.rpm darwin_amd64.tar.gz windows_amd64.zip
arm64.deb armel.rpm darwin_arm64.tar.gz windows_arm64.zip
armel.deb armv6hl.rpm freebsd_amd64.tar.gz windows_i386.zip
armhf.deb i386.rpm freebsd_armv7.tar.gz
i386.deb ppc64le.rpm freebsd_i386.tar.gz
mips.deb riscv64.rpm linux_amd64.tar.gz
mipsel.deb s390x.rpm linux_arm64.tar.gz
ppc64el.deb x86_64.rpm linux_armel.tar.gz
riscv64.deb linux_armhf.tar.gz
s390x.deb linux_i386.tar.gz
linux_mips.tar.gz
linux_mipsel.tar.gz
linux_ppc64le.tar.gz
linux_riscv64.tar.gz
linux_s390x.tar.gz

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks @abrekhov!

@srebhan srebhan merged commit 32b8ad5 into influxdata:master Feb 22, 2024
26 checks passed
@github-actions github-actions bot added this to the v1.30.0 milestone Feb 22, 2024
@abrekhov
Copy link
Contributor Author

Awesome! Thanks @abrekhov!

@srebhan @powersj Thank you guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cloud Issues or requests around cloud environments feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Return service parameter to configuration of Nebius Cloud Monitoring
4 participants