Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in loki.process on config update #7004

Merged
merged 3 commits into from
Aug 29, 2024
Merged

Conversation

ptodev
Copy link
Contributor

@ptodev ptodev commented Aug 20, 2024

PR Description

A goroutine wasn't being stopped in Update, which resulted in a memory leak.

The same change was also introduced to Alloy.

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated
  • Config converters updated

@ptodev
Copy link
Contributor Author

ptodev commented Aug 22, 2024

Hi @thampiotr, would you mind reviewing the latest commit please? The unit tests kept being flaky, and I realised it's because the loki.process code needs to be improved further to shut down more gracefully. If you're happy with this change, I'll also open a PR for Alloy to have it.

@@ -85,17 +85,24 @@ func New(o component.Options, args Arguments) (*Component, error) {

// Run implements component.Component.
func (c *Component) Run(ctx context.Context) error {
shutdownCh := make(chan struct{})
wgOut := &sync.WaitGroup{}
defer func() {
c.mut.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe and desired to wgOut.Wait() while holding this lock? Probably out of scope here, but I think we may have a goroutine waiting for c.mut in Update that would still create resources as Run is shutting down.

Happy to leave it for now, as we have other, more frequently occurring problems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the issue you're referring to can be avoided by adding a check in Update right after c.mut.Lock() - if the context got canceled, the function could return? That sounds fair, but I think wgOut has to stay regardless? It's Update's responsibility to make sure it doesn't do any updates if the component is shutting down.

Copy link
Contributor

@thampiotr thampiotr Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fair, but I think wgOut has to stay regardless?

Leaving it would be part of some solutions to this problem, yes. Not sure it must stay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that this situation could result in a leak, but in practice I don't think it'd be a problem. The only time when Run exists is when the process exits. Hence, a leak wouldn't have any impact. I suppose other components might have a similar issues too btw.

@ptodev ptodev merged commit b74f174 into main Aug 29, 2024
10 checks passed
@ptodev ptodev deleted the ptodev/fix-loki-mem-leak branch August 29, 2024 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants