Cache ejects traces too soon #545

Closed
kentquirk opened this issue Oct 27, 2022 · 2 comments
Labels: type: bug (Something isn't working), type: enhancement (New feature or request)

Comments

@kentquirk

Problem

The trace cache keeps a circular buffer to preserve insertion order, and ejects traces when it overwrites a cache entry in the circular buffer.

However, when a trace is sent because its root span has arrived, its slot in the circular buffer is simply set to nil. This means that even though there may be plenty of space (nil slots) in the cache, once the buffer has wrapped around, each new trace ejects whatever trace still occupies the slot it overwrites.

Example: the buffer has 100 slots. The first trace arrives, and takes a long time to complete. Then 99 fast traces arrive and are completed and sent. The 101st trace to arrive will cause the first to be ejected, even though there are 99 empty slots in the buffer.
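The example above can be reproduced with a small simulation. This is only a sketch of the behavior as described in this issue, not the actual cache code: `ringCache`, `insert`, `send`, and `simulate` are hypothetical names, and an empty string stands in for a nil slot.

```go
package main

import "fmt"

// ringCache is a hypothetical, simplified stand-in for the trace cache:
// a fixed-size circular buffer that records insertion order.
type ringCache struct {
	slots []string // "" marks a slot whose trace was already sent (nil in the issue)
	next  int      // index the next insert will write to
}

func newRingCache(size int) *ringCache {
	return &ringCache{slots: make([]string, size)}
}

// insert writes traceID into the next slot and returns the trace ID it
// overwrote, or "" if that slot was empty.
func (c *ringCache) insert(traceID string) (ejected string) {
	ejected = c.slots[c.next]
	c.slots[c.next] = traceID
	c.next = (c.next + 1) % len(c.slots)
	return ejected
}

// send clears the slot holding traceID. The write position is untouched,
// so the freed slot is not reused until the buffer wraps around to it.
func (c *ringCache) send(traceID string) {
	for i, id := range c.slots {
		if id == traceID {
			c.slots[i] = ""
			return
		}
	}
}

// free counts the empty slots.
func (c *ringCache) free() int {
	n := 0
	for _, id := range c.slots {
		if id == "" {
			n++
		}
	}
	return n
}

// simulate replays the scenario from the issue: one slow trace, then
// size-1 fast traces that are sent immediately, then one more arrival.
func simulate(size int) (ejected string, freeSlots int) {
	c := newRingCache(size)
	c.insert("slow-trace")
	for i := 0; i < size-1; i++ {
		id := fmt.Sprintf("fast-%d", i)
		c.insert(id)
		c.send(id)
	}
	freeSlots = c.free()
	ejected = c.insert("one-more")
	return ejected, freeSlots
}

func main() {
	ejected, free := simulate(100)
	fmt.Printf("ejected %q with %d free slots\n", ejected, free)
	// prints: ejected "slow-trace" with 99 free slots
}
```

The slow trace is lost only because the write position has come back around to its slot, not because the cache is actually full.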

This isn't necessarily incorrect behavior, but in systems under heavy load where some traces may take a long time, those traces may get ejected sooner than would otherwise be necessary.

I think it's reasonable to characterize this as a bug, although it's fairly benign, so I'm also calling it an enhancement.

Describe the solution you'd like

A more efficient insertion order algorithm (perhaps a priority queue) so that traces can stay in the cache as long as configuration permits without being artificially limited.
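One possible shape for such a structure is sketched below. It does not use the priority queue floated above; instead it keeps insertion order in a doubly linked list (`container/list`) with a map for O(1) removal, which is enough to make a sent trace free its capacity immediately. The names `orderedCache`, `insert`, `send`, and `replay` are hypothetical, and this is not necessarily how the eventual fix works.

```go
package main

import (
	"container/list"
	"fmt"
)

// orderedCache is a hypothetical sketch: insertion order lives in a linked
// list, so removing a sent trace frees its capacity right away.
type orderedCache struct {
	capacity int
	order    *list.List               // front = oldest trace ID
	index    map[string]*list.Element // trace ID -> its list node
}

func newOrderedCache(capacity int) *orderedCache {
	return &orderedCache{
		capacity: capacity,
		order:    list.New(),
		index:    make(map[string]*list.Element),
	}
}

// insert adds traceID, ejecting the oldest trace only when the cache is
// genuinely full. It returns the ejected ID, or "" if nothing was ejected.
func (c *orderedCache) insert(traceID string) (ejected string) {
	if _, ok := c.index[traceID]; ok {
		return "" // already cached
	}
	if oldest := c.order.Front(); oldest != nil && c.order.Len() >= c.capacity {
		ejected = oldest.Value.(string)
		c.order.Remove(oldest)
		delete(c.index, ejected)
	}
	c.index[traceID] = c.order.PushBack(traceID)
	return ejected
}

// send removes a completed trace in O(1), shrinking the cache.
func (c *orderedCache) send(traceID string) {
	if el, ok := c.index[traceID]; ok {
		c.order.Remove(el)
		delete(c.index, traceID)
	}
}

// replay runs the scenario from the issue against this structure.
func replay(capacity int) (ejected string, cached int) {
	c := newOrderedCache(capacity)
	c.insert("slow-trace")
	for i := 0; i < capacity-1; i++ {
		id := fmt.Sprintf("fast-%d", i)
		c.insert(id)
		c.send(id)
	}
	ejected = c.insert("one-more")
	return ejected, c.order.Len()
}

func main() {
	ejected, cached := replay(100)
	fmt.Printf("ejected %q, %d traces cached\n", ejected, cached)
	// prints: ejected "", 2 traces cached
}
```

With this layout the slow trace survives until 100 unsent traces are actually resident. A priority queue would be the natural next step if ejection should follow something other than arrival order, such as a per-trace deadline.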

Describe alternatives you've considered

Not changing it.

Additional context

Discovered when attempting to create metrics for cache behavior.

@kentquirk kentquirk added type: bug Something isn't working type: enhancement New feature or request labels Oct 27, 2022
@kentquirk

At least one customer has long-lived traces where spans continue to arrive after the trace decision has been made. It would be useful to update the expiration info whenever a late span arrives so the decision is kept around until after spans have stopped arriving for a given trace.

@kentquirk

Fixed by #722

@kentquirk kentquirk added this to the v2.0 milestone Jun 20, 2023