-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Filebeat] Fix reference leak in TCP and Unix socket inputs #19459
[Filebeat] Fix reference leak in TCP and Unix socket inputs #19459
Conversation
86c05b5
to
11d959c
Compare
Pinging @elastic/integrations-services (Team:Services) |
Good find. Please backport to 7.8 as well before FF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I consider context and WaitGroup quite low level primitives. They require you to 'track' usage and even free resources (cancel functon MUST be called). We've got some helpers here: https://pkg.go.dev/github.com/elastic/go-concert@v0.0.3/ctxtool?tab=doc, and some kind of with cancelation here: https://pkg.go.dev/github.com/elastic/go-concert?tab=doc#OnceSignaler
The context in Listener could be replaced with OnceSignaler and the stop could be replaced handled via 'WithFunc'.
For higher level primitives managing ownership of groups of go-routines check out the Group interface here: https://pkg.go.dev/github.com/elastic/go-concert@v0.0.3/unison?tab=doc#Group
We've got a sample implementation with TaskGroup
. If you keep StopOnError false
and do not return an error the implementation perfectly fits this use-case. starting the handler routine would look like this:
l.handlerGroup.Go(func(canceler unison.Canceler) error {
ctx, cancel := ctxtool.WithFunc(ctxtool.FromCanceller(canceler), func() {
conn.Close()
})
defer cancel()
defer logp.Recover(...)
...
return nil
})
The group combines shutdown signaling and the WaitGroup when calling 'Stop'. Once 'Stop' has been initiated any attempt to start another handler via Go
will return an error, because we are about to shutdown.
go func() { | ||
<-ctx.Done() | ||
conn.Close() | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The handler go-routine is started always. Let's move cancellation setup local to the that go-routine. Consider to use ctxtool.WithFunc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I updated it to use ctxtool.WithFunc
.
The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map.
The inputsource/common.Closer was managing bidirectional links between parents and children. Anytime you closed an instance it would close all of its children and also remove itself from its parents list of children (this is where the bug was). Every instance has its own mutex. While recursively closing children it was easy to run into a deadlock because the parent holds a lock while closing its children and then the child must edit the parent to remove itself so it also tries to acquire the parent lock. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code.
96a4155
to
b524da0
Compare
fyi @vjsamuel this might affect you? |
Thanks. I will need to redo our internal input during the next rebase. Moving to context is the right thing to do however. |
…ne-beats * upstream/master: (105 commits) ci: enable packaging job (elastic#19536) ci: disable upstream trigger on PRs for the packaging job (elastic#19490) Implement memlog on-disk handling (elastic#19408) fix go.mod for PR elastic#19423 (elastic#19521) [MetricBeat] add param `aws_partition` to support aws-cn, aws-us-gov regions (elastic#19423) Input v2 stateless manager (elastic#19406) Input v2 compatibility layer (elastic#19401) [Elastic Agent] Fix artifact downloading to allow endpoint-security to be downloaded (elastic#19503) fix: ignore target changes on scans (elastic#19510) Add more helpers to pipeline/testing package (elastic#19405) Report dependencies in CSV format (elastic#19506) [Filebeat] Fix reference leak in TCP and Unix socket inputs (elastic#19459) Cursor input skeleton (elastic#19378) Add changelog. (elastic#19495) [DOC] Typo in Kerberos (elastic#19265) Remove accidentally commited unused NOTICE template (elastic#19485) [Elastic Agent] Support the install, control, and uninstall of Endpoint (elastic#19248) [Filebeat][httpjson] Add split_events_by config setting (elastic#19246) ci: disabling packaging job until we fix it (elastic#19481) Fix golang.org/x/tools to release1.13 (elastic#19478) ...
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…19501) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…nix socket inputs (#19502) * [Filebeat] Fix reference leak in TCP and Unix socket inputs (#19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846) * Update vendor and notice
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code.
…P and Unix socket inputs (elastic#19502) * [Filebeat] Fix reference leak in TCP and Unix socket inputs (elastic#19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 8eb8ed7) * Update vendor and notice
What does this PR do?
The tcp and unix input sources were leaking references causing a memory leak.
When an accepted connection ended inputsource/common.Closer was
supposed to delete the pointer that it held to the connection, but due to a code
error
delete
was being called on the wrong map.Why is it important?
This fixes a memory leak with the tcp and unix inputs.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.