Fix zero-filled and wrongly emitted metrics #710
Conversation
This should fix #708
Here's a specific example where I'm not sure if we should "fix" the old behavior. That's the metric for |
Codecov Report
@@ Coverage Diff @@
## master #710 +/- ##
==========================================
- Coverage 64.39% 64.39% -0.01%
==========================================
Files 101 101
Lines 8277 8302 +25
==========================================
+ Hits 5330 5346 +16
- Misses 2599 2608 +9
Partials 348 348
Continue to review full report at Codecov.
|
In my opinion, it is OK to lose a single iteration's metrics. |
Maybe, but it won't be just a single iteration, it will be one iteration for every VU that is scaled down, so anywhere from 0 to hundreds. Also, imagine that when a VU's context is canceled in the middle of its iteration due to scaling down the number of VUs, some of the HTTP requests in that iteration have already completed and had their metrics sent to the engine. So we might want to send at least |
Yeah, but hundreds out of many thousands is not that much. I fear that trying not to lose a single metric will lead to another huge refactoring. |
No, in this case the fix is very simple: for metrics we want to preserve even when a VU's context is canceled, I just leave them the old way instead of using the helper function I added in this PR. That is, revert to using |
We should definitely not lose data on the number of requests sent, as well as data sent and received, since we use those in the cloud execution functionality (with IP address info) to track what systems we're hitting with traffic and how much, for abuse prevention and audit trails. |
Hmm, after thinking about this a bit more, I realized that if we send the |
While finishing this up and writing the test, I realized that with the real-time metrics I'd also inadvertently reverted the decision we made in #652... That is, now even unfinished iterations emit an |
Slight correction to the previous statement. Things work as expected when |
With the latest commit this should hopefully be done. I'll take another look tomorrow just in case, and at least another pair of eyes would be very helpful, but I think I've fixed all of the issues that I know of. And not only are the bugs fixed: with this patch, metrics emission when scaling down VUs is actually a lot better than it was before we had real-time metrics. |
LGTM
The fix is not to emit metric samples when a VU's context is canceled. This should fix #708, but it still needs unit tests and more checks in general. For example, are there some metrics we actually want to emit even from VUs that are canceled?