feat(outputs): Add Zabbix plugin #13739
698bb40
to
f21b421
Compare
Thanks @adrianlzt for your contribution! I have some comments in the code...
c180c6d
to
6a7663d
Compare
@srebhan let me know if it's OK to squash all commits.
Thanks for the nice update @adrianlzt! There are a few minor comments and two larger ones, namely the removal of elementsMatch (replace it by a hash comparison) and replacing assert with require in the tests.
Looking forward to the next round! Please note that I will likely not find time to do another round of reviews this week and next week...
Thanks for the nice update @adrianlzt! Sorry for the late response but even I need some holidays... ;-)
I have some more comments and one suggestion for the LLD implementation. What do you think?
plugins/outputs/zabbix/lld.go
Outdated
// Send empty LLDs for the LLDs that were sent the last time but not this time.
for key := range zl.previousReceivedData {
	emptyDataValuesJSON, err := json.Marshal(map[string]interface{}{"data": []interface{}{}})
	if err != nil {
		zl.log.Warnf("Marshaling to JSON empty data Zabbix format: %v", err)

		continue
	}

	// Add the LLD m
	m := metric.New(
		lldName,
		map[string]string{
			hostTag: key.Hostname,
		},
		map[string]interface{}{
			key.LLDKey: emptyDataValuesJSON,
		},
		time.Now(),
	)
	metrics = append(metrics, m)
}
This block does not do what the comment says. It unconditionally sends an empty value for each previously seen data item. Shouldn't it check somewhere whether we also have data in the current metric set!?
Improved the comment and added an explicit test in c86330c.
At this point in the loop, zl.previousReceivedData only stores LLDs not seen this time.
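The condition discussed here, "sent last time but not this time", amounts to a set difference. A minimal sketch follows, assuming LLDs are identified by hostname plus key; the lldKey type and the missingLLDs function are hypothetical names for illustration, not the plugin's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// lldKey identifies one LLD by host and discovery key (hypothetical type).
type lldKey struct {
	Hostname string
	Key      string
}

// missingLLDs returns the keys present in previous but absent from current,
// sorted so the result is deterministic.
func missingLLDs(previous, current map[lldKey]struct{}) []lldKey {
	var missing []lldKey
	for k := range previous {
		if _, seen := current[k]; !seen {
			missing = append(missing, k)
		}
	}
	sort.Slice(missing, func(i, j int) bool {
		if missing[i].Hostname != missing[j].Hostname {
			return missing[i].Hostname < missing[j].Hostname
		}
		return missing[i].Key < missing[j].Key
	})
	return missing
}

func main() {
	previous := map[lldKey]struct{}{
		{"foo", "disk.discovery"}: {},
		{"foo", "net.discovery"}:  {},
	}
	current := map[lldKey]struct{}{
		{"foo", "disk.discovery"}: {},
	}
	// Only LLDs that disappeared since the last push get an empty LLD.
	for _, k := range missingLLDs(previous, current) {
		fmt.Printf("send empty LLD for host=%s key=%s\n", k.Hostname, k.Key)
	}
}
```

Making the check explicit like this keeps the code and its comment in agreement, whichever way the surrounding bookkeeping is organized.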
43a1c70
to
d8613db
Compare
@adrianlzt any thoughts on my LLD suggestion? I can also push it to your repo if you want (and allow it)!?
@adrianlzt any update on the
Did a large review for the tests
@adrianlzt any update to @Hipska's comments?
Sorry, quite busy right now. I will try to get back to this by the end of this week.
I have not forgotten this. I will come back to this in a couple of weeks. Sorry.
@adrianlzt can you please at least send a keep-alive every second week so we know you are still looking after this!?
Still working on it.
Please mark it as a draft in between.
influxdata#13739 (comment)
I have made some modifications to be able to change the hostTag and to clear the current map after each push: zl.current = make(map[uint64]lldInfo, len(zl.current))
This implementation fails the test "TestAddAndPush/one metric changes the value of the tag, it should send the new value and not send and empty lld".
It tries to send an empty LLD if the content of the LLD changes, but that is not correct. Example of what the current implementation is doing:
    disk host=foo,disk=sda free=10
    --push (LLD {#DISK}=sda)
    disk host=foo,disk=sdb free=10
    --push (LLD {#DISK}=sdb) + empty LLD
That last empty LLD should not be sent, as it will be received after the {#DISK}=sdb one and tell zabbix there are no values for that LLD.
The problem resides in how hashes are created and handled in the Push() function. The hash changes if the content of the data changes, but it does not take into account that the same LLD (hostname+key) with different values should not trigger an empty LLD.
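One way to express the intended behaviour is to key the bookkeeping on hostname+key and use the content hash only to detect changes. The following is a rough sketch under that assumption; tracker, lldID, Add and Push here are hypothetical names and do not mirror the plugin's real types.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// lldID identifies an LLD by hostname and discovery key (hypothetical type).
type lldID struct{ Host, Key string }

// tracker remembers what was sent per LLD and what has been seen since.
type tracker struct {
	sent    map[lldID]uint64              // content hash last sent, per LLD
	current map[lldID]map[string]struct{} // values seen since the last push
}

func newTracker() *tracker {
	return &tracker{
		sent:    map[lldID]uint64{},
		current: map[lldID]map[string]struct{}{},
	}
}

// Add records one discovered value for the given host and key.
func (t *tracker) Add(host, key, value string) {
	id := lldID{host, key}
	if t.current[id] == nil {
		t.current[id] = map[string]struct{}{}
	}
	t.current[id][value] = struct{}{}
}

// Push returns the LLD messages to send and resets the collected values.
func (t *tracker) Push() []string {
	var out []string
	next := make(map[lldID]uint64, len(t.current))
	for id, set := range t.current {
		values := make([]string, 0, len(set))
		for v := range set {
			values = append(values, v)
		}
		sort.Strings(values)
		joined := strings.Join(values, ",")
		h := fnv.New64a()
		h.Write([]byte(joined))
		next[id] = h.Sum64()
		// New LLD or changed content: send the new content only.
		if prev, ok := t.sent[id]; !ok || prev != next[id] {
			out = append(out, fmt.Sprintf("%s %s=[%s]", id.Host, id.Key, joined))
		}
	}
	// An empty LLD is sent only when the (host, key) pair itself is gone.
	for id := range t.sent {
		if _, ok := t.current[id]; !ok {
			out = append(out, fmt.Sprintf("%s %s=[]", id.Host, id.Key))
		}
	}
	t.sent = next
	t.current = map[lldID]map[string]struct{}{}
	sort.Strings(out)
	return out
}

func main() {
	t := newTracker()
	t.Add("foo", "disk", "sda")
	fmt.Println(t.Push())
	t.Add("foo", "disk", "sdb")
	fmt.Println(t.Push())
	fmt.Println(t.Push())
}
```

In this sketch the second push sends only the sdb content, and the empty LLD appears on the third push, when no metric for that host and key arrived at all.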
@powersj, that's correct. In the normal zabbix-server - zabbix-agent configuration, low-level discovery is configured to run every N minutes. Once zabbix-server has received the discovery and created the items, it asks zabbix-agent for those new items. So zabbix-agent will not send the data until the LLD has been executed once. Telegraf will send metrics, but they will not be accepted until the LLD has been received.
If you are going to do a force push with all commits again, then yes, that might be better indeed. Now it is polluting the PR history a bit 😛
Add output plugin to support sending metrics to Zabbix (https://www.zabbix.com/). This output plugin handles sending metrics as traps, generates LLD data to feed discovery rules, and is able to send autoregistration requests.
4110c94
to
8b60240
Compare
Sorry, that was a rebase to fix the go.mod conflict.
Thanks for the updates and clarifications. I've added a few comments and need to further understand this experience:
What does the user see until the LLD is run? Are metrics reported as successfully written by telegraf? Does the agent cache those metrics? Is there a limit on the cache? I am asking because we had a user report similar behavior with another output, which has a second level of cache, and they were confused when telegraf said things were being written successfully but the metrics were not getting to the server.
@adrianlzt along the lines of @powersj, can we force an LLD update before sending the first batch?
From the zabbix-web perspective, the user won't see any items related to that LLD until it is sent. The typical example is disk monitoring. Zabbix defines a discovery rule that receives which disks are present on the server. Once the LLD data is received, it creates the items for each disk, for example:
From the telegraf point of view, metrics sent to those items will fail until the items are created. The problem is that telegraf cannot know which metrics are being rejected, because the zabbix response gives just a total count of passed and failed metrics:
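A minimal sketch of reading those totals, assuming the response is JSON with a "response" field and an "info" string of the form "processed: N; failed: N; total: N; seconds spent: S", as used by the Zabbix sender protocol. The sample payload, the types and the parseSenderResponse function are illustrative assumptions, not taken from the plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// senderResponse is the assumed shape of the server reply.
type senderResponse struct {
	Response string `json:"response"`
	Info     string `json:"info"`
}

// senderCounts holds the aggregate counters; there is no per-metric status.
type senderCounts struct{ Processed, Failed, Total int }

var infoRe = regexp.MustCompile(`processed: (\d+); failed: (\d+); total: (\d+)`)

// parseSenderResponse extracts the counters from a raw JSON reply.
func parseSenderResponse(raw []byte) (senderCounts, error) {
	var resp senderResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return senderCounts{}, fmt.Errorf("decoding response: %w", err)
	}
	m := infoRe.FindStringSubmatch(resp.Info)
	if m == nil {
		return senderCounts{}, fmt.Errorf("unexpected info string %q", resp.Info)
	}
	var c senderCounts
	for i, dst := range []*int{&c.Processed, &c.Failed, &c.Total} {
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return senderCounts{}, fmt.Errorf("parsing counter: %w", err)
		}
		*dst = n
	}
	return c, nil
}

func main() {
	// Sample payload (assumed format).
	raw := []byte(`{"response":"success","info":"processed: 1; failed: 2; total: 3; seconds spent: 0.000055"}`)
	c, err := parseSenderResponse(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("processed=%d failed=%d total=%d\n", c.Processed, c.Failed, c.Total)
}
```

Since only the counters come back, a failed count greater than zero says that some metrics were rejected, but not which ones.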
One option could be to send one metric at a time, but that would be completely inefficient. So, currently, telegraf sends metrics that are ignored by zabbix. Sending the LLD first and then the metrics could be an option, but I see a timing problem. Currently, telegraf gathers info for
Example: two different inputs, with different gather intervals, that generate these metrics:
If telegraf sends the LLD just before the
I think the current behaviour is OK, taking into account the differences between how zabbix-agent and telegraf work.
12e2d7e
to
49c7ecd
Compare
So Zabbix will delete the series (i.e. the previously collected data) if the LLD omits it? Then we do have a timing problem anyway. What if we get a rush of incoming metrics and they are presented to the output as multiple batches in such a way that the LLD interval falls between them?
But in this sense, sending an LLD as soon as we see new data shouldn't be a problem, should it?
Zabbix will mark the items as to be deleted (if the zabbix config is to keep them for a while).
Also, sending LLDs to zabbix is more costly than sending metrics. LLDs require querying the database, so it's better to reduce the number of LLDs.
A very recent version of this plugin sent the LLD immediately, but that was causing too much load in Zabbix and in the db.
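As an illustration of reducing the number of LLDs, the sketch below sends an LLD at most once per interval instead of on every write. The lldThrottle type, its fields and the 10 minute interval are assumptions made for this example and are not the plugin's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// lldThrottle allows at most one LLD send per interval (hypothetical type).
type lldThrottle struct {
	interval time.Duration
	lastSent time.Time
	sent     bool // whether any LLD has been sent yet
}

// ShouldSend reports whether an LLD may be sent at the given time and,
// if so, records that time as the last send.
func (t *lldThrottle) ShouldSend(now time.Time) bool {
	if t.sent && now.Sub(t.lastSent) < t.interval {
		return false
	}
	t.sent = true
	t.lastSent = now
	return true
}

func main() {
	th := &lldThrottle{interval: 10 * time.Minute}
	start := time.Date(2024, 1, 24, 12, 0, 0, 0, time.UTC)
	// Simulated write times relative to start.
	offsets := []time.Duration{0, 4 * time.Minute, 10 * time.Minute, 12 * time.Minute}
	for _, offset := range offsets {
		fmt.Printf("t+%v send LLD: %v\n", offset, th.ShouldSend(start.Add(offset)))
	}
}
```

The trade-off is the one described in this thread: metrics for items that have not been discovered yet are ignored by Zabbix until the next LLD goes out.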
@adrianlzt - thanks again for the clarifications. Can we get you to please add a short explanation of this behavior to the README? I worry that this is quite different than the experience with other plugins or other output documentation, and I do not want users to get caught off guard. Something to the effect of:
It is possible that a user of Zabbix already knows this, but it would still be good to have this written down so we can reference it. After that and renaming the key config option, I believe we are good to go. Thanks!
Added in 335decf
335decf
to
9f55b47
Compare
Thanks for driving this to completion!
6 years! Time to celebrate 🥳
@adrianlzt thanks for your persistence and patience!
Supersedes PR #3966