On systems that have NFS mounts under heavy load, the telegraf process locks up when using the system/disk plugin. Data gathering takes too long.
I have traced the issue to plugins/system/disk.go: the code appears to retrieve disk usage statistics for ALL filesystems first, and only then prunes the retrieved data.
That's precisely the wrong order of operations for running telegraf defensively on loaded systems. I'm no Golang expert, but a quick search turns up many ways to query file system statistics for a specific mount point.
So I propose instead that the algorithm be updated a bit at https://github.com/influxdb/telegraf/blob/master/plugins/system/disk.go#L30:
1. get the list of filesystems
2. for each filesystem in the list:
   2.1 if the filesystem path isn't in the mountpath list from config (if it exists)
      2.1.1 get information from this filesystem and append to a list of results with correct tags etc.
3. for each item in the list of results:
   3.1 add the items to the accumulator object
4. return the accumulator object
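To make the proposed order of operations concrete, here is a rough Go sketch of "filter first, then query each mount point individually". It is not telegraf's actual code: `Usage`, `statFunc`, `statfsUsage` and `gatherUsage` are names made up for illustration, and the `keep` predicate stands in for whatever the mountpath list from config ends up meaning. The per-mount query uses `syscall.Statfs`, so this assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"syscall"
)

// Usage holds the statistics reported for one mount point.
// The type and its fields are illustrative, not telegraf's.
type Usage struct {
	Path  string
	Total uint64 // bytes
	Free  uint64 // bytes
}

// statFunc queries a single mount point.
type statFunc func(path string) (Usage, error)

// statfsUsage queries one mount point via statfs(2) (Unix-like systems only).
func statfsUsage(path string) (Usage, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return Usage{}, err
	}
	bsize := uint64(st.Bsize)
	return Usage{
		Path:  path,
		Total: uint64(st.Blocks) * bsize,
		Free:  uint64(st.Bfree) * bsize,
	}, nil
}

// gatherUsage applies the filter BEFORE touching a filesystem, so a mount
// that is filtered out (e.g. a loaded NFS mount) is never queried at all.
func gatherUsage(mounts []string, keep func(string) bool, stat statFunc) []Usage {
	var results []Usage
	for _, m := range mounts {
		if keep != nil && !keep(m) {
			continue // filtered out: no call is made against this mount
		}
		u, err := stat(m)
		if err != nil {
			continue // skip mounts that cannot be queried
		}
		results = append(results, u)
	}
	return results
}

func main() {
	// Hypothetical config: leave a busy NFS mount alone.
	keep := func(path string) bool { return path != "/mnt/nfs" }
	for _, u := range gatherUsage([]string{"/", "/mnt/nfs"}, keep, statfsUsage) {
		fmt.Printf("%s total=%d free=%d\n", u.Path, u.Total, u.Free)
	}
}
```

Passing the query function in as a parameter is only there so the ordering can be checked with a fake; in the plugin itself the results would go to the accumulator rather than being printed.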