feat(outputs): Add Zabbix plugin #13739

adrianlzt · 2023-08-08T11:53:29Z

Add output plugin to support sending metrics to Zabbix (https://www.zabbix.com/).

This output plugin handles sending metrics as traps, generates LLD data to feed discovery rules, and can send autoregistration requests.

Required for all PRs

Updated associated README.md.
Wrote appropriate unit tests.
Pull request title or commits are in conventional commit format

Supersedes PR #3966

srebhan

Thanks @adrianlzt for your contribution! I have some comments in the code...

plugins/outputs/zabbix/README.md

plugins/outputs/zabbix/zabbix.go

adrianlzt · 2023-08-14T19:27:01Z

@srebhan let me know if that's ok to squash all commits

srebhan

Thanks for the nice update @adrianlzt! There are a few minor comments and two larger ones, namely the removal of elementsMatch (replaced by a hash comparison) and replacing assert with require in the tests.

Looking forward to the next round! Please note that I will likely not find time to do another round of reviews this week and next week...

plugins/outputs/zabbix/autoregister_test.go

plugins/outputs/zabbix/lld.go

plugins/outputs/zabbix/lld_test.go

plugins/outputs/zabbix/utils.go

srebhan

Thanks for the nice update @adrianlzt! Sorry for the late response but even I need some holidays... ;-)

I have some more comments and one suggestion for the LLD implementation. What do you think?

plugins/outputs/zabbix/README.md

plugins/outputs/zabbix/lld.go

srebhan · 2023-08-30T19:05:21Z

plugins/outputs/zabbix/lld.go

+	// Send empty LLDs for the LLDs that were sent the last time but not this time.
+	for key := range zl.previousReceivedData {
+		emptyDataValuesJSON, err := json.Marshal(map[string]interface{}{"data": []interface{}{}})
+		if err != nil {
+			zl.log.Warnf("Marshaling to JSON empty data Zabbix format: %v", err)
+
+			continue
+		}
+
+		// Add the LLD m
+		m := metric.New(
+			lldName,
+			map[string]string{
+				hostTag: key.Hostname,
+			},
+			map[string]interface{}{
+				key.LLDKey: emptyDataValuesJSON,
+			},
+			time.Now(),
+		)
+		metrics = append(metrics, m)
+	}


This block does not do what the comment says. It unconditionally sends an empty value for each previously seen data item. Shouldn't it check somewhere whether we also have data in the current metric set!?

Improved the comment and added an explicit test in c86330c.
In this loop, zl.previousReceivedData only stores LLDs not seen this time.

plugins/outputs/zabbix/lld.go

plugins/outputs/zabbix/zabbix.go

srebhan · 2023-09-01T08:36:44Z

@adrianlzt any thoughts on my LLD suggestion? I can also push it to your repo if you want (and allow it)!?

srebhan · 2023-09-11T08:07:04Z

@adrianlzt any update on the lld part?

plugins/outputs/zabbix/README.md

plugins/outputs/zabbix/zabbix.go

plugins/outputs/zabbix/utils.go

plugins/outputs/zabbix/lld.go

plugins/outputs/zabbix/README.md

plugins/outputs/zabbix/utils.go

plugins/outputs/zabbix/zabbix.go

Hipska

Did a large review for the tests

plugins/outputs/zabbix/README.md

plugins/outputs/zabbix/utils.go

plugins/outputs/zabbix/autoregister_test.go

plugins/outputs/zabbix/zabbix_test.go

srebhan · 2023-09-26T09:23:18Z

@adrianlzt any update to @Hipska's comments?

adrianlzt · 2023-09-26T10:25:06Z

Sorry, quite busy right now. I will try to get back to this by the end of this week.

adrianlzt · 2023-10-16T09:22:13Z

I have not forgotten this. I will come back to this in a couple of weeks. Sorry

srebhan · 2023-10-20T09:25:46Z

@adrianlzt can you please at least send a keep-alive every second week so we know you are still looking after this!?

adrianlzt · 2023-11-02T09:28:17Z

Still working on it.

Hipska · 2023-11-02T10:02:17Z

Please mark it as a draft in between..

influxdata#13739 (comment)

I have made some modifications to be able to change the hostTag and to clear the current map after each push:

zl.current = make(map[uint64]lldInfo, len(zl.current))

This implementation fails the test "TestAddAndPush/one metric changes the value of the tag, it should send the new value and not send and empty lld". It tries to send an empty LLD if the content of the LLD changes, but that is not correct.

Example of what the current implementation is doing:

disk host=foo,disk=sda free=10
--push (LLD {#DISK}=sda)
disk host=foo,disk=sdb free=10
--push (LLD {#DISK}=sdb) + empty LLD

That last empty LLD should not be sent, as it will be received after the {#DISK}=sdb LLD and tell zabbix there are no values for that LLD.

The problem resides in how hashes are created and handled in the Push() function. The hash changes if the content of the data changes, but this does not take into account that the same LLD (hostname+key) with different values should avoid sending an empty LLD.
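A minimal sketch of the distinction described above, not the plugin's actual code: the identity of an LLD (hostname+key) is kept separate from the hash of its content, so a change in content leads to a resend rather than an empty LLD. The names lldID, action and decide are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// lldID identifies one discovery rule on one host (hypothetical type).
type lldID struct {
	Host string
	Key  string
}

// action is what the output should do for one lldID on a push.
type action string

const (
	skip      action = "skip"
	sendData  action = "send data"
	sendEmpty action = "send empty"
)

// decide compares the previous and current rounds. Both maps go from
// identity (host+key) to a hash of the discovered values.
func decide(previous, current map[lldID]uint64) map[lldID]action {
	out := make(map[lldID]action, len(previous)+len(current))
	for id, hash := range current {
		if old, ok := previous[id]; ok && old == hash {
			out[id] = skip // same rule, same content: nothing to resend
			continue
		}
		out[id] = sendData // new rule or changed content
	}
	for id := range previous {
		if _, ok := current[id]; !ok {
			out[id] = sendEmpty // the rule was not seen at all this round
		}
	}
	return out
}

func main() {
	disk := lldID{Host: "foo", Key: "disk"}
	previous := map[lldID]uint64{disk: 1} // stands in for the hash of {#DISK}=sda
	current := map[lldID]uint64{disk: 2}  // stands in for the hash of {#DISK}=sdb
	fmt.Println(decide(previous, current)[disk])
}
```

With the sda-to-sdb example, the rule is still present in the current round, so only the new data is sent and no empty LLD follows it.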

adrianlzt · 2024-01-23T16:18:42Z

@powersj, that's correct. In the normal zabbix-server/zabbix-agent configuration, the low-level discovery is configured to run every N minutes. Once the zabbix-server has received the discovery and created the items, it asks zabbix-agent for those new items.

So zabbix-agent will not send the data until the LLD has been executed once. Telegraf will send metrics, but they will not be accepted until the LLD has been received.

Hipska · 2024-01-23T16:40:37Z

Should I squash all commits into the first one?

If you are going to do a force push with all commits again, then yes, that might be better indeed. Now it is polluting the PR history a bit 😛

adrianlzt · 2024-01-23T16:45:55Z

Sorry, that was a rebase to fix the go.mod conflict.

powersj · 2024-01-23T21:22:08Z

@adrianlzt,

Thanks for the updates and clarifications. I've added a few comments and need to further understand this experience:

Telegraf will send metrics, but they will not be accepted until the LLD has been received.

What does the user see until the LLD is run? Are metrics reported as successfully written by telegraf? Does the agent cache those metrics? Is there a limit on the cache?

I am asking because a user reported similar behavior with another output, which has a second level of cache. The user was confused when telegraf said things were being written successfully, but the metrics were not getting to the server.

srebhan · 2024-01-24T08:41:31Z

@adrianlzt along the lines of @powersj, can we force an LLD update before sending the first batch?

adrianlzt · 2024-01-24T10:34:07Z

From the zabbix-web perspective, the user won't see any items related to that LLD until it is sent.

The typical example is disk monitoring. Zabbix defines a discovery rule that receives the list of disks present in the server. Once the LLD data is received, it creates the items for each disk, for example:

telegraf.disk.free[sda]
telegraf.disk.free[sdb]

From the telegraf point of view, the metrics sent to those items will fail until the items are created. The problem is that telegraf cannot know which metrics are being rejected, because the zabbix response gives just a total count of processed and failed metrics:

[{"response":"success","info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000046"}]
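To illustrate why only totals are available, here is a small sketch that decodes the response shown above and extracts the counters from the info string. It assumes the info text always follows the "processed: N; failed: N; total: N" layout seen in this example; senderResponse and parseCounts are hypothetical names, not part of the plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// senderResponse mirrors the two fields visible in the example response.
type senderResponse struct {
	Response string `json:"response"`
	Info     string `json:"info"`
}

// infoRe matches the counters in the info string (layout assumed from the example).
var infoRe = regexp.MustCompile(`processed: (\d+); failed: (\d+); total: (\d+)`)

// parseCounts extracts the processed, failed and total counters.
func parseCounts(info string) (processed, failed, total int, err error) {
	m := infoRe.FindStringSubmatch(info)
	if m == nil {
		return 0, 0, 0, fmt.Errorf("unexpected info format: %q", info)
	}
	// The regexp only matches digits, so Atoi can fail only on overflow.
	if processed, err = strconv.Atoi(m[1]); err != nil {
		return 0, 0, 0, err
	}
	if failed, err = strconv.Atoi(m[2]); err != nil {
		return 0, 0, 0, err
	}
	if total, err = strconv.Atoi(m[3]); err != nil {
		return 0, 0, 0, err
	}
	return processed, failed, total, nil
}

func main() {
	raw := `[{"response":"success","info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000046"}]`
	var responses []senderResponse
	if err := json.Unmarshal([]byte(raw), &responses); err != nil {
		panic(err)
	}
	for _, r := range responses {
		processed, failed, total, err := parseCounts(r.Info)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %d of %d accepted, %d rejected\n", r.Response, processed, total, failed)
	}
}
```

Note that "response" is "success" even though the only metric failed, and nothing in the payload says which metric was rejected.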

One option could be to send one metric at a time, but that would be very inefficient.

So, currently, telegraf sends metrics that are ignored by zabbix.

Sending the LLD first and then the metrics could be an option, but I see a timing problem.
How long is telegraf going to wait while queueing those metrics?

Currently, telegraf gathers info for lld_send_interval before sending the LLD.
This is needed to be able to form the LLD completely.
What I mean is that two different inputs could be aggregated into the same LLD. Probably not the most common scenario, but still possible.

For example, two different inputs with different gather intervals generate these metrics:

disk,host=foo,name=sda time=123
disk,host=foo,name=sdb time=145

If telegraf sends the LLD just before the sdb metric is generated, zabbix will think that only "sda" is present. If telegraf happens to have just been restarted, and zabbix knew that disks sda and sdb exist, that LLD will make zabbix delete, or mark for deletion, the items of disk sdb.
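The collection window can be sketched roughly as follows. This is an illustration under assumptions, not the plugin's implementation: collector, add and payload are hypothetical names, and the {#NAME} macro is chosen only to match the name tag in the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// collector accumulates the distinct tag values seen during one LLD interval.
type collector struct {
	seen map[string]struct{}
}

func newCollector() *collector {
	return &collector{seen: map[string]struct{}{}}
}

// add records one tag value; repeated values are stored only once.
func (c *collector) add(name string) {
	c.seen[name] = struct{}{}
}

// payload renders everything seen so far as LLD JSON and starts a new interval.
func (c *collector) payload(macro string) (string, error) {
	names := make([]string, 0, len(c.seen))
	for n := range c.seen {
		names = append(names, n)
	}
	sort.Strings(names) // stable order keeps the output deterministic

	rows := make([]map[string]string, 0, len(names))
	for _, n := range names {
		rows = append(rows, map[string]string{macro: n})
	}
	b, err := json.Marshal(map[string]interface{}{"data": rows})
	c.seen = map[string]struct{}{}
	return string(b), err
}

func main() {
	c := newCollector()
	c.add("sda") // from the first input
	c.add("sdb") // from a slower second input, later in the same interval
	c.add("sda") // repeated gathers do not create duplicates
	out, err := c.payload("{#NAME}")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

If payload were called right after the first add, the LLD would list only sda, which is the incomplete discovery the comment warns about.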

I think the current behaviour is OK, taking into account the differences between how zabbix-agent and telegraf work.

srebhan · 2024-01-24T10:42:20Z

So Zabbix will delete the series (i.e. the previously collected data) if LLD omits it?

Then we do have a timing problem anyway. What if we get a rush of metrics and they are presented to the output as multiple batches in such a way that the LLD interval falls between them?

telegraf-tiger · 2024-01-24T11:14:57Z

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

☺️ This pull request doesn't significantly change the Telegraf binary size (less than 1%)

Artifact URLs

DEB	RPM	TAR GZ	ZIP
amd64.deb	aarch64.rpm	darwin_amd64.tar.gz	windows_amd64.zip
arm64.deb	armel.rpm	darwin_arm64.tar.gz	windows_arm64.zip
armel.deb	armv6hl.rpm	freebsd_amd64.tar.gz	windows_i386.zip
armhf.deb	i386.rpm	freebsd_armv7.tar.gz
i386.deb	ppc64le.rpm	freebsd_i386.tar.gz
mips.deb	riscv64.rpm	linux_amd64.tar.gz
mipsel.deb	s390x.rpm	linux_arm64.tar.gz
ppc64el.deb	x86_64.rpm	linux_armel.tar.gz
riscv64.deb		linux_armhf.tar.gz
s390x.deb		linux_i386.tar.gz
		linux_mips.tar.gz
		linux_mipsel.tar.gz
		linux_ppc64le.tar.gz
		linux_riscv64.tar.gz
		linux_s390x.tar.gz

adrianlzt · 2024-01-24T11:19:59Z

Zabbix has a configuration option to decide what to do with resources that are no longer reported by the LLD. The default is to keep them for 30 days.

If that value is set to 0, they are deleted instantly.

I don't see your example. The output will collect info for 10 minutes (the default value), which should be enough to get at least one round of all metrics.
If a metric is configured with a higher interval, it could be marked to be deleted.

I think that is a corner case and it will normally not pose a problem, as zabbix won't delete the resources immediately. But yes, tuning the gather interval, the LLD interval and the keep lost resources period could lead to items being deleted and recreated.

srebhan · 2024-01-24T11:54:42Z

But in this sense, sending an LLD as soon as we see new data shouldn't be a problem, should it?
We could additionally save the state of the output consisting of the LLD cache content...

adrianlzt · 2024-01-24T12:00:59Z

That would mark items as to be deleted (if the zabbix config is to keep them for a while). Also, sending LLDs to zabbix is more costly than sending metrics. LLDs require querying the database, so it's better to reduce the number of LLDs. A very recent version of this plugin sent the LLD immediately, but it was causing too much load in Zabbix and in the database.

powersj · 2024-01-24T13:37:54Z

@adrianlzt - thanks again for the clarifications.

Can we get you to please add a short explanation in the README about this behavior? I worry that this is quite different from the experience with other plugins or other output documentation, and I do not want users to get caught off guard. Something to the effect of:

Users need to keep in mind that the metrics will fail to send until the Zabbix Server has received a low-level discovery (LLD) with the metrics. Sending LLD to Zabbix is a heavy-weight process and is only done at the interval per the lld_send_interval setting.

It is possible that a user of Zabbix already knows this, but it would still be good to have this written down so we can reference it.

After that and renaming the key config option, I believe we are good to go.

Thanks!

adrianlzt · 2024-01-24T14:18:33Z

Added in 335decf

powersj

Thanks for driving this to completion!

adrianlzt · 2024-01-24T15:30:35Z

6 years! Time to celebrate 🥳

srebhan · 2024-01-24T20:18:39Z

@adrianlzt thanks for your persistence and patience!

adrianlzt force-pushed the feature/zabbix_output branch from 698bb40 to f21b421 Compare August 8, 2023 12:11

srebhan reviewed Aug 8, 2023

srebhan self-assigned this Aug 8, 2023

srebhan added the feat and plugin/output labels Aug 8, 2023

srebhan changed the title ~~feat: add Zabbix output plugin~~ feat(outputs): Add Zabbix plugin Aug 8, 2023

srebhan added the new plugin label Aug 8, 2023

adrianlzt force-pushed the feature/zabbix_output branch 2 times, most recently from c180c6d to 6a7663d Compare August 9, 2023 07:34

adrianlzt marked this pull request as draft August 11, 2023 07:07

adrianlzt marked this pull request as ready for review August 14, 2023 19:25

srebhan reviewed Aug 14, 2023

srebhan reviewed Aug 30, 2023

adrianlzt force-pushed the feature/zabbix_output branch from 43a1c70 to d8613db Compare August 31, 2023 07:55

Hipska suggested changes Sep 12, 2023

Hipska reviewed Sep 18, 2023

Hipska suggested changes Sep 18, 2023

Hipska added the waiting for response label Oct 16, 2023

telegraf-tiger bot removed the waiting for response label Oct 20, 2023

srebhan added the waiting for response label Oct 20, 2023

telegraf-tiger bot removed the waiting for response label Nov 2, 2023

adrianlzt added a commit to datadope-io/telegraf that referenced this pull request Jan 23, 2024

feat(output.zabbix): applied patch by srebhan

f2b7ad4

influxdata#13739 (comment)

feat: add Zabbix output plugin

8b60240

Add output plugin to support sending metrics to Zabbix (https://www.zabbix.com/). This output plugin handle sending metrics as traps, generating LLD data to feed discovery rules and is able to send autoregistration requests.

adrianlzt force-pushed the feature/zabbix_output branch from 4110c94 to 8b60240 Compare January 23, 2024 16:45

fallback mechanism to get the zabbix host

49c7ecd

adrianlzt force-pushed the feature/zabbix_output branch from 12e2d7e to 49c7ecd Compare January 24, 2024 10:34

Rename prefix to key_prefix

a579028

LLD behaviour docs extended

9f55b47

adrianlzt force-pushed the feature/zabbix_output branch from 335decf to 9f55b47 Compare January 24, 2024 14:38

powersj approved these changes Jan 24, 2024

powersj merged commit c8e12fa into influxdata:master Jan 24, 2024
26 checks passed

github-actions bot added this to the v1.30.0 milestone Jan 24, 2024

Hipska mentioned this pull request Jan 24, 2024

Add zabbix output plugin #470

Closed

hhiroshell pushed a commit to hhiroshell/telegraf that referenced this pull request Feb 1, 2024

feat(outputs): Add Zabbix plugin (influxdata#13739)

a00143f
