-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cuckoo-based drop cache #567
Conversation
collect/cache/cuckoo.go
Outdated
if c.firstTime && dropFactor > 0.5 { | ||
c.mut.Lock() | ||
defer c.mut.Unlock() | ||
c.firstTime = false | ||
} | ||
// if the current one is full, cycle the filters | ||
if dropFactor > 0.99 { | ||
c.mut.Lock() | ||
defer c.mut.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this result in a deadlock? We could acquire a lock on line 56, defer unlock on line 57 and then try to acquire another lock on line 62. Will the lock be released as part of the if statement scope closing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa -- good catch. The lock will NOT be released and this can cause a deadlock. Will fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in latest update, please re-review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also needs PR title updating 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
## Which problem is this PR solving? Before this change, Refinery used a circular LRU cache to retain a record for every trace; this cache is hardcoded to 5x the configured cache size, and does not change when the cache is resized. This is a relatively small number, and for long-lived traces, it might mean that late spans look like new traces, and therefore might get a different sampling decision -- which would result in missing spans in Honeycomb. honeycombio#561 abstracted the SampleCache interface to prepare for other implementations. This uses it to provide a new cache design. ## Short description of the changes This design implements a "cuckoo filter" cache for dropped traces, which can store the dropped trace information much more efficiently (about 4 bytes per trace as compared to about 200 bytes for kept traces). - Adds a CuckooTraceChecker type that implements a 2-stage cuckoo filter for tracking recently-used trace IDs over time. - Implements the SampleCache interface with a CuckooSentCache, which uses the existing LRU for kept traces, and a CuckooTraceChecker for dropped traces. - Implements a new configuration block for caches to allow users to opt into the cuckoo cache and control it for their needs, but is still backwards compatible. - Adds documentation to the config_complete file. - Adds additional metrics for tracking the cuckoo cache size
Which problem is this PR solving?
Currently, Refinery uses a circular LRU cache to retain a record for every trace; this cache is hardcoded to 5x the configured cache size, and does not change when the cache is resized.
This is a relatively small number, and for long-lived traces, it might mean that late spans look like new traces, and therefore might get a different sampling decision -- which would result in missing spans in Honeycomb.
#561 abstracted the SampleCache interface to prepare for other implementations. This uses it to provide a new cache design.
Short description of the changes
This design implements a "cuckoo filter" cache for dropped traces, which can store the dropped trace information much more efficiently (about 4 bytes per trace as compared to about 200 bytes for kept traces).