If stress relief is activated by peers, it will never deactivate #688

Closed
kentquirk opened this issue May 18, 2023 · 0 comments · Fixed by #698
Comments

@kentquirk
Contributor

Steps to reproduce

  1. Set the peer queue size so small that the peer queue fills up and triggers stress relief. Once stress relief activates, it stops sending traffic to peers, so the metric that triggered it never updates and the stress indicator never resets.

That metric is controlled by libhoney. Options are to periodically send something to peers, or to change how that metric updates in libhoney.
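
The failure mode above can be sketched as a small simulation. This is a toy model, not Refinery or libhoney code: `peerQueue`, `gauge`, and `simulate` are hypothetical names, and the assumption that the queue-length metric is published only as a side effect of sending is taken from the description in this issue.

```go
package main

import "fmt"

// peerQueue is a hypothetical stand-in for the queue of events bound for peers.
// Assumption: its length gauge is refreshed only as a side effect of sending.
type peerQueue struct {
	pending  int // events actually waiting to be sent
	capacity int // configured peer queue size
	gauge    int // last queue length published as a metric
}

// send enqueues one event if there is room, then publishes the queue length.
func (q *peerQueue) send() {
	if q.pending < q.capacity {
		q.pending++
	}
	q.gauge = q.pending
}

// drain models the peers catching up. It does not publish the gauge.
func (q *peerQueue) drain() { q.pending = 0 }

// stressed reports whether the published gauge shows a full queue.
func (q *peerQueue) stressed() bool { return q.gauge >= q.capacity }

// simulate fills the queue, then runs a few rounds in which the peers drain it.
// With probes disabled, nothing is sent while stressed; with probes enabled,
// one event per round is still sent. It returns the final stress state.
func simulate(probes bool) bool {
	q := &peerQueue{capacity: 2}
	q.send()
	q.send() // the queue is now full, so stress relief activates
	for round := 0; round < 5; round++ {
		q.drain()
		if !q.stressed() || probes {
			q.send()
		}
	}
	return q.stressed()
}

func main() {
	fmt.Println("stuck without probes:", simulate(false))
	fmt.Println("stuck with probes:", simulate(true))
}
```

In this model the real queue empties every round, but without any sends the published gauge stays at its last (full) value, which corresponds to the first option listed above: periodically send something to peers.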

kentquirk added the type: bug label on May 18, 2023
kentquirk added this to the v2.0 milestone on May 18, 2023
kentquirk added a commit that referenced this issue May 19, 2023
## Which problem is this PR solving?

- #688 

When stress relief activates, it stops sending peer traffic. But the
metrics that update stress relief based on peer volume only update when
there's peer traffic.

This PR causes any trace that Refinery keeps (forwards to Honeycomb) in
stress relief mode to *also* be sent as a probe to the appropriate peer
(which then drops it). This ensures that a little bit of traffic still
gets to peers, which keeps the system operating normally and the
metrics updating.

## Short description of the changes

- Whenever stress relief is active and would keep a span, make a copy of
the incoming event as a probe; if the event would normally have been
sent to a peer, send the probe to that peer.
- When a probe is received from a peer, drop it without further
examination.
- Log when this happens on both sides.
- Add the stress relief reason to the `meta.refinery.reason` field for
kept spans so that it's not only found in the logs.
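
The changes listed above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `Event`, `Node`, `IsProbe`, `OwnerOf`, `HandleUnderStress`, and `ReceiveFromPeer` are hypothetical names, the reason string passed in is a placeholder, and logging is omitted. Only the `meta.refinery.reason` field name comes from the description.

```go
package main

import "fmt"

// Event is a hypothetical, simplified span event.
type Event struct {
	TraceID string
	Data    map[string]string
	IsProbe bool // hypothetical marker; the real probe flag may differ
}

// Node is a hypothetical stand-in for one Refinery instance in a cluster.
type Node struct {
	Name     string
	OwnerOf  func(traceID string) string // which node normally owns a trace
	Upstream []Event                     // events forwarded to Honeycomb
	ToPeers  []Event                     // events sent to peers
}

const reasonField = "meta.refinery.reason"

// HandleUnderStress processes one event while stress relief is active.
// Kept events are tagged with the reason and forwarded upstream; if a peer
// would normally own the trace, a probe copy is also sent to that peer.
func (n *Node) HandleUnderStress(ev Event, keep bool, reason string) {
	if !keep {
		return
	}
	if ev.Data == nil {
		ev.Data = map[string]string{}
	}
	ev.Data[reasonField] = reason
	n.Upstream = append(n.Upstream, ev)

	if n.OwnerOf(ev.TraceID) == n.Name {
		return // this node owns the trace, so no peer traffic is needed
	}
	probe := Event{
		TraceID: ev.TraceID,
		Data:    make(map[string]string, len(ev.Data)),
		IsProbe: true,
	}
	for k, v := range ev.Data {
		probe.Data[k] = v
	}
	n.ToPeers = append(n.ToPeers, probe)
}

// ReceiveFromPeer reports whether an event from a peer should be processed.
// Probes are dropped without further examination.
func (n *Node) ReceiveFromPeer(ev Event) bool {
	return !ev.IsProbe
}

func main() {
	a := &Node{Name: "a", OwnerOf: func(string) string { return "b" }}
	a.HandleUnderStress(Event{TraceID: "t1"}, true, "stress_relief")
	fmt.Println("forwarded upstream:", len(a.Upstream))
	fmt.Println("probes sent to peers:", len(a.ToPeers))
	fmt.Println("peer processes probe:", a.ReceiveFromPeer(a.ToPeers[0]))
}
```

The probe exists only to generate peer traffic, so the receiving side discards it immediately; the kept span itself still goes to Honeycomb from the node that received it.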

---------

Co-authored-by: Alyson van Hardenberg <avanhardenberg@honeycomb.io>
kentquirk added a commit that referenced this issue Jun 2, 2023
## Which problem is this PR solving?

- We had made some fixes on a branch for internal deployments. This
pulls them over without closing that branch.

## Short description of the changes

- Sends a small amount of data to peers during stress relief; closes
#688
- Adds the stress relief reason to the `meta.refinery.reason` field when
stress relief is activated
- Fixes a bug where JSON that got unmarshaled as []byte would panic

---------

Co-authored-by: Alyson van Hardenberg <avanhardenberg@honeycomb.io>