
Fix race when initializing ringBufferRateLimiter #43

Merged: 1 commit into mholt:master on Mar 13, 2024

Conversation

@inahga (Contributor) commented Mar 6, 2024

This closes #36.

Fixes a race condition between `ringBufferRateLimiter` creation and its insertion into a map. It does this by locking the entire map whenever we get or insert a `ringBufferRateLimiter`.
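
For illustration, here's a minimal, self-contained sketch of the kind of check-then-act race being fixed. The names (`limiters`, `getLimiter`, `initialize`) and the stub type are hypothetical, not taken from the actual code:

```go
package ratelimit

import (
	"sync"
	"time"
)

// ringBufferRateLimiter is a stand-in for the real type; only its shape
// matters for illustrating the race.
type ringBufferRateLimiter struct {
	maxEvents int
	window    time.Duration
}

func (r *ringBufferRateLimiter) initialize(maxEvents int, window time.Duration) {
	r.maxEvents, r.window = maxEvents, window
}

var limiters sync.Map // key (string) -> *ringBufferRateLimiter

// getLimiter has the bug: LoadOrStore publishes the new limiter to other
// goroutines before initialize has configured it.
func getLimiter(key string, maxEvents int, window time.Duration) *ringBufferRateLimiter {
	val, loaded := limiters.LoadOrStore(key, new(ringBufferRateLimiter))
	rl := val.(*ringBufferRateLimiter) // sync.Map stores values as `any`
	if !loaded {
		rl.initialize(maxEvents, window) // too late: rl is already visible
	}
	return rl
}
```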

I have replaced the use of `sync.Map` with a normal `map[string]*ringBufferRateLimiter` and a `sync.Mutex`, passed around together in a `rateLimitersMap` struct. I've factored the logic out into methods of `rateLimitersMap`, which enables some careful use of `defer rlm.mu.Unlock()` to avoid leaving the lock held if a `panic()` occurs.
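
A minimal sketch of that approach, reusing the stub type from the sketch above (the method name `getOrInsert` is illustrative, not necessarily what the patch uses):

```go
// rateLimitersMap bundles the plain map with the mutex that guards it.
type rateLimitersMap struct {
	mu       sync.Mutex
	limiters map[string]*ringBufferRateLimiter
}

// getOrInsert creates and fully configures the limiter while the lock is
// held, so no other goroutine can ever observe a half-constructed limiter.
func (rlm *rateLimitersMap) getOrInsert(key string, maxEvents int, window time.Duration) *ringBufferRateLimiter {
	rlm.mu.Lock()
	defer rlm.mu.Unlock() // defer releases the lock even if something below panics
	if rl, ok := rlm.limiters[key]; ok {
		return rl
	}
	rl := &ringBufferRateLimiter{maxEvents: maxEvents, window: window}
	rlm.limiters[key] = rl
	return rl
}
```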

We didn't see a need for a `sync.Map`. The docs recommend a plain map with separate locking for better type safety, and none of the use cases `sync.Map` is optimized for apply here: https://pkg.go.dev/sync#Map. Let me know if I'm misunderstanding the use case (very possible!).

I've removed the `sync.Pool`, for now. Since `ringBufferRateLimiter` creation and insertion are fully synchronized, I didn't see a need for it.

Note that some of the defensive refactoring is not strictly required; I have a change that preserves the existing data structures, but I think the suggested changeset is an overall improvement in maintainability: divviup@65ad951.

Some discussion of the performance impact and profiles is in divviup#1. TL;DR: no meaningful impact on CPU, memory, or latency. This implementation could be optimized by replacing the normal mutex with a `sync.RWMutex`, but that would be a marginal improvement (if any) in exchange for much more complicated locking semantics.
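
For reference, a sketch of what that `RWMutex` variant might look like (hypothetical; not part of this PR). Go's `sync.RWMutex` can't be upgraded in place, so the write path has to drop the read lock, take the write lock, and re-check:

```go
type rwRateLimitersMap struct {
	mu       sync.RWMutex
	limiters map[string]*ringBufferRateLimiter
}

func (rlm *rwRateLimitersMap) getOrInsert(key string, maxEvents int, window time.Duration) *ringBufferRateLimiter {
	rlm.mu.RLock()
	rl, ok := rlm.limiters[key]
	rlm.mu.RUnlock()
	if ok {
		return rl // fast path: read lock only
	}

	rlm.mu.Lock()
	defer rlm.mu.Unlock()
	// Re-check under the write lock: another goroutine may have inserted
	// the limiter between our RUnlock and Lock calls.
	if rl, ok := rlm.limiters[key]; ok {
		return rl
	}
	rl = &ringBufferRateLimiter{maxEvents: maxEvents, window: window}
	rlm.limiters[key] = rl
	return rl
}
```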

I suggest waiting before merging this while we let it sit in production a while longer, to ensure the problem is fully fixed; I'll give it about a week. In the meantime, this is ready for code review.

@mholt (Owner) commented Mar 10, 2024

Hi, wow!! Thank you for this. Sorry for the lateness; I was out this weekend with the family. I'll look at this soon!

@DzigaV commented Mar 13, 2024

How have your instances been this last week, @inahga?

I'll sleep better once this is merged 🙂

@mholt (Owner) left a comment

Thank you for the patch -- especially on a tricky bug!

We'll give this a shot, it LGTM after a once-over!

@mholt merged commit 359636d into mholt:master on Mar 13, 2024 (3 checks passed)

@inahga (Contributor, Author) commented Mar 13, 2024

> How have your instances been this last week, @inahga?

I'll let you know by Monday; I was slow to deploy it this week.

@inahga (Contributor, Author) commented Mar 19, 2024

@DzigaV We've run this in production for a week now, with no signs of leakage. Feeling pretty confident this fixed it.

@mholt (Owner) commented Mar 20, 2024

That's great news!! Thank you for the contribution and the work required to get it figured out. Glad it's working 👍

Successfully merging this pull request may close these issues.

Stuck mutex causing resource leak