[Bug]: streaming seems to be adding a lag to the raw data timestamps for my Prometheus charts #14318
Comments
the end result here is that if i look at the local dashboard of the child node itself, i see the drops at ~06:16, but if i look via the parent (which is what NC will do), everything is shifted and the first drop now appears at ~06:23, which is wrong. Note also: the values on the two nodes are not identical even after the shift is accounted for. They are close, which is why the two shifted lines look similar, but the actual values differ when you look at the api, so that is potentially a big issue here too. |
@andrewm4894 this issue may be fixed in #14319. Can you please re-test? |
will do |
@ilyam8 this does not look to be fixed from what i can see: left = looking via the parent. Same timeframes and charts, but they look very different. The size of the lag and the difference in the numbers look even more confusing now for some reason. Will try to explore more to see if i can find anything else that might be useful.
|
Note: it just seems to be affecting the Prometheus charts i have. All the other typical charts seem in line. Other collectors i have enabled, like alarms, pandas, and zscores, also seem to line up, so it looks like it is just the prom stuff i have configured like this:
In case that might narrow it down. |
@andrewm4894 there have been 3 dbengine2 improvement PRs since then, please re-test when you have time. |
just recreated now - will let things run for a while and then test again. |
is the problem solved? |
@ilyam8 i don't seem to even see the data via the parent now for some reason. going to try investigate more |
@andrewm4894 I have the same issue: streaming to the parent shows no metrics anymore |
@thefiredragon doesn't look the same to me. What is your Netdata version? What is the child nodes' memory mode? |
(i'll retest today on my end on latest versions) |
@ilyam8 sorry, my issue was not related to this one. Someone had disabled my firewall rules, so the devices only connected to the registry and the stream never reached the parent. |
Closing as it appears to be fixed (#14318 (comment)). @andrewm4894 re-open if there are any problems. Also, I don't see it being prometheus specific. The only thing that distinguishes prometheus from any other collector is (default) update_every: it is 10 vs 1. |
@ilyam8 cool - i'll do one last recheck on my end and re-open if still seeing it - was looking ok after a few hours yesterday so will just do one more check to be sure. |
Bug description
related internal slack: https://netdata-cloud.slack.com/archives/CS3PB0VJ7/p1674554617334889
response from my child's nodes actual data:
response from that node via the parent:
two problems:
A. there seems to be a lag on the child's data in the parent.
B. the actual values are all also slightly different for some reason.
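The two problems can be separated numerically: first find the time shift that best aligns the parent series with the child series, then look at how far the values still differ at that shift. The following is only a minimal sketch of that idea. The function name `estimate_lag`, the synthetic series, and the 10-second step are assumptions for illustration, not data or code from this issue.

```python
# Hypothetical helper: estimate how far the parent's copy of a series lags
# behind the child's, and how much the values differ once aligned.

def estimate_lag(child, parent, step, max_shift):
    """child / parent map unix timestamp -> value.

    Tries shifts 0, step, 2*step, ... up to max_shift seconds and returns
    (best_shift_seconds, mean_abs_diff_at_that_shift), or None if the two
    series never overlap.
    """
    best = None
    for shift in range(0, max_shift + 1, step):
        # Compare each child point with the parent point `shift` seconds later.
        diffs = [abs(parent[t + shift] - v)
                 for t, v in child.items() if (t + shift) in parent]
        if not diffs:
            continue
        score = sum(diffs) / len(diffs)
        if best is None or score < best[1]:
            best = (shift, score)
    return best


# Synthetic example (made-up numbers): the value drops to 0 for a while.
child_series = {t: (0.0 if 100 <= t < 200 else 0.14) for t in range(0, 600, 10)}
# The "parent" copy is the same series, 30 s late and offset by 0.01.
parent_series = {t + 30: v + 0.01 for t, v in child_series.items()}

lag_seconds, residual = estimate_lag(child_series, parent_series, step=10, max_shift=120)
print(lag_seconds, round(residual, 4))
```

A non-zero `lag_seconds` corresponds to problem A, and a non-zero `residual` at the best shift corresponds to problem B.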
Expected behavior
child data should be accurate on the parent
for example, at 1674540996 the value drops from 0.1386571 to 0, but on the parent version of this data the drop does not happen then. Instead it happens about 6 mins later: 1674541380 - 1674540995 = 385 seconds later.
also you can see from above that the actual raw values themselves are all a little different for some reason.
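As a quick check of the arithmetic, the two drop timestamps quoted above can be converted to wall-clock time and subtracted. This is just a sketch with illustrative variable names. Note that it uses 1674540996 for the child, whereas the subtraction in the description uses 1674540995, so the result differs by one second.

```python
from datetime import datetime, timezone

# Timestamps quoted in the description (unix seconds).
child_drop_ts = 1674540996   # drop as seen on the child
parent_drop_ts = 1674541380  # same drop as seen via the parent


def to_utc_clock(ts):
    """Format a unix timestamp as HH:MM:SS in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%H:%M:%S")


lag_s = parent_drop_ts - child_drop_ts
lag_min, lag_rem = divmod(lag_s, 60)

print("child drop :", to_utc_clock(child_drop_ts))
print("parent drop:", to_utc_clock(parent_drop_ts))
print(f"lag: {lag_s} s ({lag_min} min {lag_rem} s)")
```

The UTC clock times line up with the ~06:16 and ~06:23 readings mentioned in the comments.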
Steps to reproduce
unsure how reproducible this is tbh
...
Installation method
kickstart.sh
System info
Netdata build info
Additional info
No response